
Zhiyuan Li
@zhiyuanli_
Assistant Professor @TTIC_Connect. Previously Postdoc @Stanford and PhD @PrincetonCS. Deep Learning Theory.
ID: 760226120335781895
http://zhiyuanli.ttic.edu 01-08-2016 21:30:06
69 Tweets
1.1K Followers
308 Following




Exciting new work led by the amazing Kaiyue Wen on a theoretical justification for the recently popular WSD schedule! This is based on an interesting and novel assumption about the training loss called "River Valley", which helps explain hidden progress in large-learning-rate training.




Don't miss the poster presentation for this by Nishanth Dikkala at #ICLR2025 tomorrow to learn more about our work on looped Transformers for reasoning! Poster #272: Hall 3 + 2B. Sat 26th April, 10am - 12:30pm Singapore time


Excited to share our new method ✏️PENCIL! It decouples space complexity from time complexity in LLM reasoning by allowing the model to recursively erase and generate thoughts. Joint work w. my student Chenxiao Yang, along with Nati Srebro Bartom and David McAllester.