Han Guo
@hanguo97
PhD Student @MIT_CSAIL | Past: @LTIatCMU @MITIBMLab @UNCNLP, @SFResearch, @BaiduResearch | Machine Learning, NLP.
ID: 769279457387540480
http://han-guo.info · Joined 26-08-2016 21:04:50
2.2K Tweets
2.2K Followers
4.4K Following
Happy to share that two of our papers were accepted to NeurIPS 2025 as #Spotlight papers! 1. 👼Angles Don’t Lie: Unlocking Training-Efficient RL from a Model’s Own Signals TL;DR: Token angles, the model's self-generated signals, can reveal how well it grasps the data. By …
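The tweet is cut off before the method details, so here is only a loose sketch of what an angle-style signal could look like: measure how tightly the directions of a response's token hidden states cluster. Every name below is illustrative, assumed for the example, and not the paper's actual definition or API.

```python
import torch

def angle_concentration(hidden: torch.Tensor) -> torch.Tensor:
    """hidden: (num_tokens, d) hidden states for one sampled response."""
    unit = torch.nn.functional.normalize(hidden, dim=-1)  # unit directions
    cos = unit @ unit.T                                   # pairwise cosine similarities
    mask = ~torch.eye(len(unit), dtype=torch.bool)        # drop self-similarities
    return torch.acos(cos[mask].clamp(-1.0, 1.0)).mean()  # mean pairwise angle (radians)

h = torch.randn(32, 768)        # stand-in for real token hidden states
print(angle_concentration(h))   # smaller mean angle = more concentrated directions
```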
Modular Manifolds: managed metrics (i.e., Muon) meet manifolds, making matrix magnitudes manageable. Or M^11, as I like to call it. Check out this great post by Jeremy Bernstein! It introduces some cool new ideas but also doubles as a great intro to optimization beyond Adam.
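For context on the Muon being "managed" here, a minimal sketch of the Muon-style update: replace a weight matrix's raw momentum with an approximately orthogonalized version, computed by a few Newton-Schulz iterations instead of an exact SVD. The coefficients below follow the commonly circulated Muon iteration; treat the exact hyperparameters (steps, momentum, learning rate) as assumptions.

```python
import torch

def newton_schulz_orthogonalize(G: torch.Tensor, steps: int = 5) -> torch.Tensor:
    # Quintic Newton-Schulz iteration that drives the singular values of G
    # toward 1, i.e. approximates U @ V.T from the SVD G = U @ S @ V.T.
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (G.norm() + 1e-7)              # scale so the iteration converges
    transposed = G.size(0) > G.size(1)
    if transposed:
        X = X.T                            # iterate on the wide orientation
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X

# One optimizer step on a single weight matrix, with momentum buffer `buf`:
W, grad = torch.randn(256, 512), torch.randn(256, 512)
buf = torch.zeros_like(W)
buf = 0.95 * buf + grad                    # classic momentum accumulation
W = W - 0.02 * newton_schulz_orthogonalize(buf)
```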
Introducing LLM.Q: Quantized LLM training in pure CUDA/C++! With LLM.Q, you can train your own LLM with natively quantized matmuls on a single consumer-GPU workstation. No datacenter required. Inspired by Andrej Karpathy's llm.c, but natively quantized.
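LLM.Q itself is CUDA/C++, but the idea behind a natively quantized matmul fits in a few lines of Python: quantize activations and weights to int8 with per-tensor scales, multiply in integer arithmetic with int32 accumulation, then rescale to float. This is a generic symmetric-quantization sketch, not the repo's actual scheme or kernels.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    scale = np.abs(x).max() / 127.0                    # symmetric per-tensor scale
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def quantized_matmul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    qa, sa = quantize_int8(a)
    qb, sb = quantize_int8(b)
    acc = qa.astype(np.int32) @ qb.astype(np.int32)    # integer accumulation
    return acc.astype(np.float32) * (sa * sb)          # dequantize the result

a = np.random.randn(64, 128).astype(np.float32)
b = np.random.randn(128, 32).astype(np.float32)
print(np.abs(quantized_matmul(a, b) - a @ b).max())    # small quantization error
```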
Excited to introduce Dreamer 4, an agent that learns to solve complex control tasks entirely inside its scalable world model! 🌎🤖 Dreamer 4 pushes the frontier of world-model accuracy and speed, and learns complex tasks from offline datasets. Co-led with Wilson Yan.
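What "learning entirely inside the world model" means, as a generic Dreamer-style sketch (every module below is a stand-in, not Dreamer 4's architecture): roll the learned dynamics forward from encoded states and update the policy purely on imagined rewards, never touching the real environment.

```python
import torch

latent_dim, action_dim, horizon = 32, 4, 15
dynamics = torch.nn.Linear(latent_dim + action_dim, latent_dim)  # stand-in world model
reward_head = torch.nn.Linear(latent_dim, 1)                     # stand-in reward predictor
policy = torch.nn.Linear(latent_dim, action_dim)
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

z = torch.randn(16, latent_dim)       # latents, e.g. encoded from offline data
total_reward = torch.zeros(())
for _ in range(horizon):              # imagined rollout: no environment calls
    a = torch.tanh(policy(z))
    z = dynamics(torch.cat([z, a], dim=-1))
    total_reward = total_reward + reward_head(z).mean()

(-total_reward).backward()            # maximize the imagined return
opt.step()
```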
Even with full-batch gradients, DL optimizers defy classical optimization theory, as they operate at the *edge of stability*. With Alex Damian, we introduce "central flows": a theoretical tool to analyze these dynamics that makes accurate quantitative predictions on real NNs.
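For readers new to the phenomenon: gradient descent with step size lr is classically stable only while the loss sharpness (top Hessian eigenvalue) stays below 2/lr, yet in deep learning it empirically rises to and hovers at that threshold. A small sketch of measuring sharpness by power iteration on Hessian-vector products (the central-flows machinery itself is not reproduced here):

```python
import torch

model = torch.nn.Linear(10, 1)
x, y = torch.randn(64, 10), torch.randn(64, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
params = list(model.parameters())
grads = torch.autograd.grad(loss, params, create_graph=True)   # keep graph for HVPs

v = [torch.randn_like(p) for p in params]                      # power-iteration start
vnorm = torch.sqrt(sum((u ** 2).sum() for u in v))
v = [u / vnorm for u in v]
sharpness = torch.zeros(())
for _ in range(50):
    gv = sum((g * u).sum() for g, u in zip(grads, v))
    hv = torch.autograd.grad(gv, params, retain_graph=True)    # Hessian-vector product
    sharpness = torch.sqrt(sum((h ** 2).sum() for h in hv))    # ||Hv|| -> top eigenvalue
    v = [h / sharpness for h in hv]

lr = 0.01
print(f"sharpness {sharpness.item():.3f} vs stability threshold 2/lr = {2 / lr:.1f}")
```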