
Sewon Min
@sewon__min
Incoming faculty @Berkeley_EECS @berkeley_ai || Research scientist at @allen_ai || PhD from @uwcse @uwnlp
ID: 928289778960707584
http://sewonmin.com · Joined 08-11-2017 15:55:04
972 Tweets
11.1K Followers
739 Following

BAIR faculty Stuart Russell, Dan Klein, Alane Suhr, Ken Goldberg, and Sewon Min weigh in on the future of LLMs, synthetic data, and the road ahead ⬇️ alumni.berkeley.edu/california-mag…

Using a fraction of the compute and building best-in-class models is such aura. Danqi Chen #ICLR25

🚨Announcing SCALR @ COLM 2025 — Call for Papers!🚨 The 1st Workshop on Test-Time Scaling and Reasoning Models (SCALR) is coming to the Conference on Language Modeling in Montreal this October! This is the first workshop dedicated to this growing research area. 🌐 scalr-workshop.github.io

Tensor Templar: The important breakthrough is that a lot of the “RL just works” noise has little to do with RL and more to do with “qwen mid-training for math and coding makes the model very receptive to the same jumps on math and coding over and over.”

Thrilled to announce that I will be joining UT Austin Computer Science as an assistant professor in fall 2026! I will continue working on language models, data challenges, learning paradigms, & AI for innovation. Looking forward to teaming up with new students & colleagues! 🤠🤘

Congratulations to University of Washington #UWAllen Ph.D. grads Ashish Sharma & Sewon Min, Association for Computing Machinery Doctoral Dissertation Award honorees! Sharma won for #AI tools for mental health; Min received an honorable mention for efficient, flexible language models. #ThisIsUW news.cs.washington.edu/2025/06/04/all…

🎓 Congrats to Ashish Sharma, University of Washington, on receiving the ACM Doctoral Dissertation Award for his dissertation, "Human-AI Collaboration to Support Mental Health and Well-Being." 👏 Honorable Mentions: Alexander Kelley, University of Illinois; Sewon Min, UC Berkeley

I always found it puzzling how language models learn so much from next-token prediction, while video models learn so little from next-frame prediction. Maybe it's because LLMs are actually brain scanners in disguise. Idle musings in my new blog post: sergeylevine.substack.com/p/language-mod…

fwiw, I think Prof. Percy Liang and the CS336 team nailed this: Sutton’s Bitter Lesson is often misinterpreted as “scale is all that matters” and/or “algorithms don’t matter.” The more accurate – and useful – interpretation is: what matters are the algorithms that scale.

Thanks, Will Knight, for covering our work!!

Some updates 🚨 I finished my Ph.D. at the Allen School in June 2025! After a year at AI2 as a Research Scientist, I am joining the Language Technologies Institute at @CarnegieMellon, with a courtesy appointment in the Machine Learning Dept., as an Assistant Professor in Fall 2026. The journey, acknowledgments & recruiting in 🧵
