Rosie Zhao (@rosieyzh)'s Twitter Profile
Rosie Zhao

@rosieyzh

PhD student with @hseas ML Foundations Group. Previously @mcgillu.

ID: 1189002232366284800

Website: https://rosieyzh.github.io/ · Joined: 29-10-2019 02:13:28

36 Tweets

469 Followers

496 Following

Depen Morwani (@depen_morwani)'s Twitter Profile Photo

Excited to share our recent work at #NeurIPS2023 on the nature of Simplicity Bias (SB) in 1-Hidden-Layer Neural Networks (NNs) with Jatin Batra, Prateek Jain, and Praneeth Netrapalli. SB is known to be one of the reasons behind the brittleness of neural networks under distribution shift (1/5)

Alexandre L.-Piché (@alexpiche_)'s Twitter Profile Photo

Introducing ReSearch: an iterative self-reflection algorithm that enhances LLMs' self-restraint abilities:
• Encouraging abstention when uncertain
• Producing accurate, informative content when confident
Result: a significant accuracy boost for Llama2 7B Chat and Mistral 7B! 🚀
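
For intuition, here is a minimal sketch of the abstain-or-answer behavior the tweet describes, using sampling agreement as a stand-in confidence signal. This is not the ReSearch procedure itself (which relies on iterative self-reflection); `generate`, the threshold, and the agreement metric are illustrative assumptions.

```python
from collections import Counter

def answer_with_restraint(generate, prompt, n=5, threshold=0.6):
    """Abstain when the model's own samples disagree too much.

    Hypothetical sketch: agreement among n samples is used as a
    confidence proxy. NOT the ReSearch algorithm, which instead
    refines answers via iterative self-reflection.
    """
    samples = [generate(prompt) for _ in range(n)]
    best, count = Counter(samples).most_common(1)[0]
    if count / n < threshold:   # uncertain -> abstain
        return "I am not confident enough to answer this."
    return best                 # confident -> answer informatively
```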

Sham Kakade (@shamkakade6)'s Twitter Profile Photo

1/n Introducing SOAP (ShampoO with Adam in the Preconditioner's eigenbasis): A deep learning optimization algorithm that applies Adam in Shampoo's eigenbasis. SOAP outperforms both AdamW and Shampoo in language model pretraining.

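The core idea fits in a few lines. Below is a hedged NumPy sketch of one SOAP-style step for a matrix-shaped gradient: maintain Shampoo's two covariance factors, rotate the gradient into their eigenbasis, run Adam there, and rotate the update back. Bias correction and the real method's periodic (rather than per-step) eigenbasis refresh are omitted; the names and state layout are illustrative, not the authors' implementation.

```python
import numpy as np

def soap_step(G, state, lr=3e-3, shampoo_beta=0.95,
              adam_b1=0.9, adam_b2=0.999, eps=1e-8):
    """One SOAP-style step for a matrix-shaped gradient G (sketch only)."""
    # Shampoo-style second-moment factors: EMAs of G G^T and G^T G.
    state["L"] = shampoo_beta * state["L"] + (1 - shampoo_beta) * G @ G.T
    state["R"] = shampoo_beta * state["R"] + (1 - shampoo_beta) * G.T @ G
    QL = np.linalg.eigh(state["L"])[1]   # eigenbasis of the left factor
    QR = np.linalg.eigh(state["R"])[1]   # eigenbasis of the right factor

    # Rotate the gradient into the preconditioner's eigenbasis.
    G_rot = QL.T @ G @ QR

    # Plain Adam moments, maintained in the rotated coordinates.
    state["m"] = adam_b1 * state["m"] + (1 - adam_b1) * G_rot
    state["v"] = adam_b2 * state["v"] + (1 - adam_b2) * G_rot ** 2
    step_rot = state["m"] / (np.sqrt(state["v"]) + eps)

    # Rotate the Adam update back to parameter coordinates.
    return -lr * QL @ step_rot @ QR.T

# Example state for a 4x3 parameter (all buffers start at zero):
# state = {"L": np.zeros((4, 4)), "R": np.zeros((3, 3)),
#          "m": np.zeros((4, 3)), "v": np.zeros((4, 3))}
```
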
Naomi Saphra hiring a lab 🧈🪰 (@nsaphra)'s Twitter Profile Photo

Ever looked at LLM skill emergence and thought 70B parameters was a magic number? Our new paper shows sudden breakthroughs are samples from bimodal performance distributions across seeds. Observed accuracy jumps abruptly while the underlying accuracy DISTRIBUTION changes slowly!

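The claim is easy to see in a toy simulation. In the hypothetical sketch below, each seed's accuracy is drawn from a two-mode mixture whose high-mode weight drifts smoothly with scale, yet any single observed run flips abruptly between roughly 0.1 and 0.9; all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-seed accuracy comes from a bimodal mixture; only the mixing
# weight p_high changes (slowly) with scale. A single run jumps
# abruptly between modes even though the distribution drifts smoothly.
for p_high in np.linspace(0.0, 1.0, 11):
    draws = np.where(rng.random(500) < p_high,
                     rng.normal(0.90, 0.02, 500),   # "emerged" mode
                     rng.normal(0.10, 0.02, 500))   # baseline mode
    print(f"p_high={p_high:.1f}  one observed run={draws[0]:.2f}  "
          f"mean over seeds={draws.mean():.2f}")
```
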
Association for Computing Machinery (@theofficialacm)'s Twitter Profile Photo

Meet the recipients of the 2024 ACM A.M. Turing Award, Andrew G. Barto and Richard S. Sutton! They are recognized for developing the conceptual and algorithmic foundations of reinforcement learning. Please join us in congratulating the two recipients! bit.ly/4hpdsbD

David Alvarez Melis (@elmelis)'s Twitter Profile Photo

🚨 New preprint! TL;DR: Backtracking is not the "holy grail" for smarter LLMs. It’s praised for helping models “fix mistakes” and improve reasoning—but is it really the best use of test-time compute? 🤔

Eran Malach (@eranmalach)'s Twitter Profile Photo

How does RL improve performance on math reasoning? Studying RL from pretrained models is hard, as behavior depends on the choice of base model. 🚨 In our new work, we train models *from scratch* to study the effect of the data mix on the behavior of RL. arxiv.org/abs/2504.07912

Rosie Zhao (@rosieyzh)'s Twitter Profile Photo

Excited to be attending 🇸🇬#ICLR2025! My DMs are open; please reach out to chat about LLM reasoning, optimization, or training dynamics!

Will be presenting a study on diagonal preconditioning optimizers for LLM pretraining (arxiv.org/abs/2407.07972) and SOAP (arxiv.org/abs/2409.11321)
Bingbin Liu (@bingbinl)'s Twitter Profile Photo

Excited to announce MOSS, our ICML workshop focused on discoveries at small scale! We believe there's tremendous potential & creativity in research done with limited resources and would love to hear your ideas. The submission (due May 22nd) can literally be a Jupyter notebook! :)

Nived Rajaraman (@nived_rajaraman)'s Twitter Profile Photo

Announcing the first workshop on Foundations of Post-Training (FoPT) at COLT 2025!

📝 Soliciting abstracts/posters exploring theoretical & practical aspects of post-training and RL with language models!

🗓️ Deadline: May 19, 2025
Antonio Orvieto (@orvieto_antonio)'s Twitter Profile Photo

Adam is similar to many algorithms, but it cannot be effectively replaced by any simpler variant in LMs.
The community is starting to get the recipe right, but what is the secret sauce?

Robert M. Gower 🇺🇦 and I found that it has to do with the beta parameters and variational inference.
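
For reference, the beta parameters in question are the two EMA decay rates in the standard Adam update. A generic textbook sketch, not the paper's variational-inference derivation:

```python
import numpy as np

def adam_step(g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update, written out to highlight the two betas.

    beta1 decays the gradient EMA (momentum); beta2 decays the
    squared-gradient EMA (the diagonal preconditioner). Textbook
    form only, not the derivation in the paper the tweet mentions.
    """
    m = beta1 * m + (1 - beta1) * g        # first moment (mean of g)
    v = beta2 * v + (1 - beta2) * g * g    # second moment (mean of g^2)
    m_hat = m / (1 - beta1 ** t)           # bias corrections, t >= 1
    v_hat = v / (1 - beta2 ** t)
    return -lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```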