Simon Shaolei Du (@simonshaoleidu)'s Twitter Profile
Simon Shaolei Du

@simonshaoleidu

Assistant Professor @uwcse. Postdoc @the_IAS. PhD in machine learning @mldcmu.

ID: 913981622193664000

Link: http://simonshaoleidu.com

Joined: 30-09-2017 04:19:34

497 Tweets

7.7K Followers

2.2K Following

Avinandan Bose (@avibose22)

🧠 Your LLM should model how you think, not reduce you to preassigned traits
📢 Introducing LoRe: a low-rank reward modeling framework for personalized RLHF
❌ Demographic grouping/handcrafted traits
✅ Infers implicit preferences
✅ Few-shot adaptation
📄 arxiv.org/abs/2504.14439

Shane Gu (@shaneguml)

Famous LLM researcher Bruce Lee quote: "I fear not the LLM who has practiced 10,000 questions once, but I fear the LLM who has practiced one question 10,000 times."

Simon Shaolei Du (@simonshaoleidu)

Even with the same vision encoder, generative VLMs (LLaVA) can extract more information than CLIP. Why? Check out our #ACL2025NLP paper, led by Siting Li: arxiv.org/pdf/2411.05195

Simon Shaolei Du (@simonshaoleidu)

PPO vs. DPO? 🤔 Our new paper proves that it depends on whether your models can represent the optimal policy and/or reward. Paper: arxiv.org/abs/2505.19770 Led by Ruizhe Shi and Minhak Song

Allen School (@uwcse)

Congratulations to University of Washington #UWAllen Ph.D. grads Ashish Sharma & Sewon Min, Association for Computing Machinery Doctoral Dissertation Award honorees! Sharma won for #AI tools for mental health; Min received honorable mention for efficient, flexible language models. #ThisIsUW news.cs.washington.edu/2025/06/04/all…

Yiping Wang (@ypwang61)

I'll present StoryEval tomorrow at CVPR, happy to catch up with new and old friends! 📍 ExHall D, Poster #284 ⌚ 10:30 am - 12:30 pm on 6/14

Avinandan Bose (@avibose22)

🚨 Code is live! Check out LoRe – a modular, lightweight codebase for personalized reward modeling from user preferences. 📦 Few-shot personalization 📊 Benchmarks: TLDR, PRISM, PersonalLLM 👉 github.com/facebookresear… Huge thanks to AI at Meta for open-sourcing this research 🙌

Paresh Chaudhary (@pareshrc)

1/6 Current AI agent training methods fail to capture diverse behaviors needed for human-AI cooperation. GOAT (Generative Online Adversarial Training) uses online adversarial training to explore a pre-trained generative model's latent space to generate realistic yet challenging