Adam Fisch (@adamjfisch)'s Twitter Profile
Adam Fisch

@adamjfisch

Research Scientist @ Google DeepMind | Formerly: PhD @ MIT EECS.

ID: 892997634813710336

Link: http://people.csail.mit.edu/fisch/ | Joined: 03-08-2017 06:36:42

292 Tweets

1.1K Followers

245 Following

Adam Fisch (@adamjfisch)

Check out our new paper on Recursive Transformers. Great having Sangmin here at Google DeepMind to lead it! Particularly excited about the potential of continuous depth-wise batching for much better early-exiting batch throughput.
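
For intuition on that last point, here is a toy sketch of continuous depth-wise batching (my own illustration, not the paper's code): because a recursive transformer reuses one shared block at every depth, requests sitting at different recursion depths can be stacked into a single batch, and a request that early-exits frees its slot for a new one.

```python
# Toy sketch of continuous depth-wise batching (illustration only; the shapes,
# the shared block, and the exit test are all made-up placeholders).
import numpy as np

rng = np.random.default_rng(0)
d_model, max_depth, max_batch = 8, 4, 3
W = rng.normal(scale=0.1, size=(d_model, d_model))  # the single shared block (stand-in)

def shared_block(h):
    return np.tanh(h @ W)  # one recursion step; same weights at every depth

def wants_exit(h):
    return np.linalg.norm(h, axis=-1) < 0.5  # toy early-exit confidence test

active = [{"h": rng.normal(size=d_model), "depth": 0} for _ in range(max_batch)]
waiting = [rng.normal(size=d_model) for _ in range(4)]  # queued new requests
finished = 0

while active or waiting:
    # Refill freed slots with new requests entering at depth 0.
    while waiting and len(active) < max_batch:
        active.append({"h": waiting.pop(), "depth": 0})
    # One batched step over mixed depths: a single call to the shared block.
    batch = shared_block(np.stack([r["h"] for r in active]))
    exits = wants_exit(batch)
    next_active = []
    for r, h, ex in zip(active, batch, exits):
        r["h"], r["depth"] = h, r["depth"] + 1
        if ex or r["depth"] >= max_depth:
            finished += 1  # early exit (or depth cap) frees this slot
        else:
            next_active.append(r)
    active = next_active

print("served", finished, "requests")
```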

Anastasios Nikolas Angelopoulos (@ml_angelopoulos)


🚨 New Textbook on Conformal Prediction 🚨

arxiv.org/abs/2411.11824

“The goal of this book is to teach the reader about the fundamental technical arguments that arise when researching conformal prediction and related questions in distribution-free inference. Many of these…
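
For readers new to the topic, below is a minimal split conformal regression sketch (a standard construction written from memory, not taken from the book): calibrate on held-out absolute residuals, then return intervals with finite-sample marginal coverage of at least 1 - alpha under exchangeability. Any regressor exposing a scikit-learn-style .predict method should work as the model argument.

```python
# Minimal split conformal prediction sketch (standard recipe; names are mine).
import numpy as np

def conformal_intervals(model, X_cal, y_cal, X_test, alpha=0.1):
    # Nonconformity scores on the calibration split: absolute residuals.
    scores = np.abs(y_cal - model.predict(X_cal))
    n = len(scores)
    # Conformal quantile with the finite-sample (n + 1) correction.
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q_hat = np.quantile(scores, q_level, method="higher")
    preds = model.predict(X_test)
    return preds - q_hat, preds + q_hat  # lower and upper interval bounds
```
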
Stephen Bates (@stats_stephen)

Important topic, but this is more of a quick-start guide. For cutting-edge research on LLM evals, see these papers using Prediction-Powered Inference to incorporate synthetic data and model predictions for narrower CIs. 👇 Gemini already knows about them!
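
The basic prediction-powered inference (PPI) mean estimate is short enough to sketch (written from memory, so treat the details as an assumption rather than the papers' exact recipe): average the model's predictions on a large unlabeled set, then correct with the labeled residuals, which typically yields a narrower confidence interval than using the labeled data alone.

```python
# Rough sketch of a PPI mean estimate with a normal-approximation CI.
import numpy as np
from scipy.stats import norm

def ppi_mean_ci(y_labeled, yhat_labeled, yhat_unlabeled, alpha=0.1):
    n, N = len(y_labeled), len(yhat_unlabeled)
    rectifier = y_labeled - yhat_labeled               # model bias on labeled data
    theta = yhat_unlabeled.mean() + rectifier.mean()   # bias-corrected point estimate
    se = np.sqrt(yhat_unlabeled.var(ddof=1) / N + rectifier.var(ddof=1) / n)
    z = norm.ppf(1 - alpha / 2)
    return theta - z * se, theta + z * se
```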

Jonathan Berant (@jonathanberant)


Hi ho!

New work: arxiv.org/pdf/2503.14481
With amazing collabs Jacob Eisenstein, Reza Aghajani, Adam Fisch, dheeru dua, Fantine Huot ✈️ ICLR 25, Mirella Lapata, Vicky Zayats

Some things are easier to learn in a social setting. We show agents can learn to faithfully express their beliefs (along... 1/3
Deedy (@deedydas)


Google DeepMind just dropped this new LLM architecture called Mixture-of-Recursions.

It gets 2x inference speed, reduced training FLOPs and ~50% reduced KV cache memory. Really interesting read.

Has potential to be a Transformers killer.
Sangmin Bae (@raymin0223)

Thanks for sharing our work, Deedy! MoR is a new architecture that upgrades Recursive Transformers and Early-Exiting algorithms: simple pretraining with a router, plus faster inference and a smaller KV cache! A post with details and code will be released very soon. Stay tuned! ☺️

Reza Bayat (@reza_byt)


📄 New Paper Alert! ✨

🚀Mixture of Recursions (MoR): Smaller models • Higher accuracy • Greater throughput

Across 135M–1.7B params, MoR carves a new Pareto frontier: equal training FLOPs yet lower perplexity, higher few-shot accuracy, and more than 2x throughput.
Yujin Kim (@yujin301300)


Introducing our new work: 🚀Mixture-of-Recursions!

🪄We propose a novel framework that dynamically allocates recursion depth per token.

🪄MoR is an efficient architecture with fewer params, reduced KV cache memory, and 2× greater throughput, while maintaining comparable performance!
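
To make the per-token depth allocation concrete, here is a toy sketch (my own illustration with made-up shapes and a stand-in router, not the released MoR code): a router score decides how many times the single shared block is applied to each token, so easy tokens stop recursing early and skip compute.

```python
# Toy per-token recursion-depth routing (illustration only).
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, max_depth = 6, 16, 3
W_block = rng.normal(scale=0.1, size=(d_model, d_model))  # single shared block
w_router = rng.normal(scale=0.1, size=d_model)            # lightweight router (stand-in)

h = rng.normal(size=(seq_len, d_model))  # token hidden states

# Router scores -> per-token recursion depth in {1, ..., max_depth}.
scores = 1.0 / (1.0 + np.exp(-(h @ w_router)))            # sigmoid in (0, 1)
depths = np.clip(1 + np.floor(scores * max_depth).astype(int), 1, max_depth)

for step in range(max_depth):
    active = depths > step                     # tokens that still recurse at this step
    h[active] = np.tanh(h[active] @ W_block)   # shared block applied to active tokens only

print("tokens assigned to each depth:", np.bincount(depths, minlength=max_depth + 1)[1:])
```
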
Sangmin Bae (@raymin0223)


✨Huge thanks for the interest in Mixture-of-Recursions! The code is officially out!

It's been a long journey exploring early exiting with recursive architectures.
I'll soon post my 👨‍🎓PhD thesis on Adaptive Computation too!

Code: github.com/raymin0223/mix…
Paper: arxiv.org/abs/2507.10524
Sangmin Bae (@raymin0223)

🏋️‍♂️This unified MoR framework delivers strong performance and faster inference. Check it out and ask any questions! Huge thanks to my awesome co-authors: Yujin Kim, Reza Bayat, Sungnyun Kim, Jen Ha @ ICML 2025, Tal Schuster, Adam Fisch, Hrayr Harutyunyan, Ziwei Ji, Aaron Courville, and Se-Young Yun! 🥰