
Elan Rosenfeld
@elanrosenfeld
Researcher @GoogleAI, PhD @CarnegieMellon
ID: 1045447599090794497
http://cs.cmu.edu/~elan 27-09-2018 22:58:25
332 Tweets
1.1K Followers
193 Following




🗣️ “Next-token predictors can’t plan!” ⚔️ “False! Every distribution is expressible as product of next-token probabilities!” 🗣️ In work w/ Gregor Bachmann, we carefully flesh out this emerging, fragmented debate & articulate a key new failure. 🔴 arxiv.org/abs/2403.06963





If you are submitting to NeurIPS: reward yourself with a talk after! If you are not: no excuses not to attend this talk! Either way, join us this week at CARE to hear Sorawit (James) Saengkyongam speak about representation learning for extrapolation. portal.valencelabs.com/events/post/id…







I learned a lot about LLM training dynamics during this project, led by Sara Kangaslahti. Surprisingly, we can find meaningful + interpretable breakthroughs in model capabilities which are non-obvious from just the aggregated loss. Check out the thread by Naomi Saphra for details!


Official results are in - Gemini achieved gold-medal level in the International Mathematical Olympiad! 🏆 An advanced version was able to solve 5 out of 6 problems. Incredible progress - huge congrats to Thang Luong and the team! deepmind.google/discover/blog/…