Audrey Huang (@auddery)'s Twitter Profile
Audrey Huang

@auddery

ID: 1790453938724171779

Link: https://audhuang.github.io/ · Joined: 14-05-2024 18:47:43

7 Tweets

86 Followers

60 Following

Yuda Song @ ICLR 2025 (@yus167)'s Twitter Profile Photo

New work on understanding preference fine-tuning/RLHF -- we analyze online and offline preference fine-tuning methods via the theoretical tool of dataset coverage and reveal the importance of online unlabeled data. Plus, a new algorithm! (1/n)

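As background for the setting this thread analyzes: preference fine-tuning typically models pairwise labels with the standard Bradley-Terry model, which turns a reward difference into a preference probability. A minimal sketch of that standard background (illustrative only, not code from the paper):

```python
import math

def bradley_terry_pref_prob(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry preference model, standard in RLHF: the probability
    that the first response is preferred is a logistic function of the
    reward difference between the two responses."""
    return 1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected)))
```

Equal rewards give probability 0.5; a large reward gap pushes the preference probability toward 1.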
Dylan Foster 🐢 (@canondetortugas)'s Twitter Profile Photo

New preprint: Is Behavior Cloning All You Need? Understanding Horizon in Imitation Learning We show that good old fashioned behavior cloning enjoys horizon-independent sample complexity for imitation learning—provided you use the log loss! arxiv.org/abs/2407.15007 Thread below
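The "log loss" the tweet highlights is simply the negative log-likelihood of the expert's actions under the learned policy, as opposed to, say, a squared or 0-1 loss. A minimal sketch of that objective (hypothetical names, not the paper's code):

```python
import math

def log_loss_bc(policy_probs, expert_actions):
    """Behavior cloning with the log loss: minimize the negative
    log-likelihood the policy assigns to the expert's actions.
    policy_probs[t][a] is the policy's probability of action a at step t;
    expert_actions[t] is the expert's action at step t."""
    return -sum(math.log(policy_probs[t][a])
                for t, a in enumerate(expert_actions))
```

For example, a two-step trajectory where the policy puts probability 0.9 and 0.8 on the expert's actions incurs loss -(ln 0.9 + ln 0.8) ≈ 0.33.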

Dylan Foster 🐢 (@canondetortugas)'s Twitter Profile Photo

Given a high-quality verifier, language model accuracy can be improved by scaling inference-time compute (e.g., w/ repeated sampling). When can we expect similar gains without an external verifier? New paper: Self-Improvement in Language Models: The Sharpening Mechanism

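The repeated-sampling scheme the tweet refers to is usually implemented as Best-of-N: draw N candidates and keep the one the scorer ranks highest. With an external verifier the scorer is the verifier; in the self-improvement setting the model scores its own outputs (e.g., by sequence log-probability). A minimal sketch under those assumptions, where `generate` and `score` are hypothetical stand-ins for a sampler and a scoring function:

```python
def best_of_n(generate, score, n: int):
    """Best-of-N sampling: draw n candidate responses and return the
    highest-scoring one. `score` may be an external verifier or, in the
    self-improvement setting, the model's own assessment of a response."""
    candidates = [generate() for _ in range(n)]
    return max(candidates, key=score)
```

Accuracy gains from scaling n hinge on how reliably `score` ranks correct responses above incorrect ones.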
Dylan Foster 🐢 (@canondetortugas)'s Twitter Profile Photo

Check out the paper for more details: arxiv.org/abs/2412.01951 Joint work w/ Audrey Huang, Adam Block, Dhruv Rohatgi, Cyril Zhang, Max Simchowitz, Jordan Ash, and Akshay Krishnamurthy

Dylan Foster 🐢 (@canondetortugas)'s Twitter Profile Photo

Akshay presenting InferenceTimePessimism, a new alternative to BoN sampling for scaling test-time compute. From our recent paper here: arxiv.org/abs/2503.21878

Dylan Foster 🐢 (@canondetortugas)'s Twitter Profile Photo

Is Best-of-N really the best we can do for language model inference? New algo & paper: 🚨InferenceTimePessimism🚨 Led by the amazing Audrey Huang with Adam Block, Qinghua Liu, Nan Jiang, and Akshay Krishnamurthy. Appearing at ICML '25. 1/11

Nived Rajaraman (@nived_rajaraman)'s Twitter Profile Photo

Announcing the first workshop on Foundations of Post-Training (FoPT) at COLT 2025! 📝 Soliciting abstracts/posters exploring theoretical & practical aspects of post-training and RL with language models! 🗓️ Deadline: May 19, 2025

Dylan Foster 🐢 (@canondetortugas)'s Twitter Profile Photo

RL and post-training play a central role in giving language models advanced reasoning capabilities, but many algorithmic and scientific questions remain unanswered. Join us at FoPT @ COLT '25 to explore pressing emerging challenges and opportunities for theory to bring clarity.