Irena Gao (@irena_gao)'s Twitter Profile
Irena Gao

@irena_gao

PhD student @StanfordAILab | Trustworthy ML

ID: 1573342259604357121

Website: http://i-gao.github.io · Joined: 23-09-2022 16:03:30

99 Tweets

972 Followers

254 Following

Luke Bailey (@lukebailey181):

Understanding the current landscape of AI agents is important for predicting future trends in this area. Read our paper if you want a snapshot of deployed AI agents.

Stanford AI Lab (@stanfordailab):

SAIL is delighted to announce Carlos Guestrin, the Fortinet Founders Professor of Computer Science, as the next Director of Stanford AI Lab. Carlos is a talented researcher and leader, known for his work on explainability, graphs, compilation, and boosted trees in AI.

CLS (@chengleisi):

Reasoning is all the rage these days. If you want to save some time and get to the crux of how to enable reasoning in LLMs, here’s a list of 10 recent papers that I find most informative, along with my notes: (Full thread in doc: docs.google.com/document/d/1TW…) 1/11

Anikait Singh (@anikait_singh_):

Personalization in LLMs is crucial for meeting diverse user needs, yet collecting real-world preferences at scale remains a significant challenge. Introducing FSPO, a simple framework leveraging synthetic preference data to adapt to new users with meta-learning for open-ended QA! 🧵

Eddie Vendrow (@edwardvendrow):

Very excited to share *GSM8K-Platinum*, a revised version of the GSM8K test set! If you’re using GSM8K, I highly recommend you switch to GSM8K-Platinum! We built it as a drop-in replacement for the GSM8K test set. Check it out: huggingface.co/datasets/madry…

James Zou (@james_y_zou):

💡The key idea of #textgrad is to optimize by backpropagating textual gradients produced by #LLM. Paper: nature.com/articles/s4158… Code: github.com/zou-group/text… Amazing job by Mert Yuksekgonul leading this project w/ fantastic collaborators Federico Bianchi Joseph Boen Sheng Liu
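
For context on the mechanism the tweet describes, here is a rough, hypothetical sketch of what "backpropagating textual gradients" can look like. It is not the textgrad library's actual API; the llm() and evaluate() callables are assumed stand-ins for an LLM completion call and a textual critic.

# Hypothetical sketch of textual-gradient optimization (not the textgrad API).
# llm(prompt) is an assumed helper returning a string completion;
# evaluate(text) is an assumed critic returning textual feedback (the "loss").

def textual_gradient(variable: str, feedback: str, llm) -> str:
    """Ask the LLM how `variable` should change to address `feedback`."""
    return llm(
        "Here is a piece of text being optimized:\n"
        f"{variable}\n\n"
        "Here is feedback on it:\n"
        f"{feedback}\n\n"
        "Describe concretely how the text should change to address the feedback."
    )

def apply_gradient(variable: str, gradient: str, llm) -> str:
    """The 'optimizer step': rewrite the variable according to the textual gradient."""
    return llm(
        "Rewrite the following text:\n"
        f"{variable}\n\n"
        "so that it incorporates this improvement:\n"
        f"{gradient}\n\n"
        "Return only the rewritten text."
    )

def step(variable: str, evaluate, llm) -> str:
    """One optimization step: evaluate, backpropagate textual feedback, update."""
    feedback = evaluate(variable)                     # textual "loss"
    grad = textual_gradient(variable, feedback, llm)  # textual "gradient"
    return apply_gradient(variable, grad, llm)        # updated variable

In this analogy, the critic's feedback plays the role of a loss, the LLM's improvement suggestion plays the role of a gradient, and the rewrite is the optimizer step.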

Stanford AI Lab (@stanfordailab):

Heading to #ICLR2025? Make sure to check out the amazing research led by our students here at the Stanford AI Lab! ai.stanford.edu/blog/iclr-2025/

Amber Xie (@amberxie_):

Introducing ✨Latent Diffusion Planning✨ (LDP)! We explore how to use expert, suboptimal, & action-free data. To do so, we learn a diffusion-based *planner* that forecasts latent states, and an *inverse-dynamics model* that extracts actions. w/ Oleg Rybkin Dorsa Sadigh Chelsea Finn
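
As a reading aid, a minimal sketch of the two-module structure the tweet describes: a planner that forecasts future latent states and an inverse-dynamics model that recovers actions from consecutive latents. The PyTorch modules, dimensions, and MLP internals below are assumptions for illustration, not the paper's actual diffusion-based architecture.

# Minimal sketch of the two-module structure (assumed shapes and internals,
# not the LDP paper's architecture).
import torch
import torch.nn as nn

class LatentPlanner(nn.Module):
    """Forecasts the next latent state from the current one (stand-in for the diffusion planner)."""
    def __init__(self, latent_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))

    def forward(self, z_t: torch.Tensor) -> torch.Tensor:
        return self.net(z_t)

class InverseDynamics(nn.Module):
    """Maps a pair of consecutive latent states to the action connecting them."""
    def __init__(self, latent_dim: int = 64, action_dim: int = 7):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * latent_dim, 256), nn.ReLU(), nn.Linear(256, action_dim))

    def forward(self, z_t: torch.Tensor, z_next: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([z_t, z_next], dim=-1))

# At execution time: roll the planner forward in latent space, then decode actions.
planner, idm = LatentPlanner(), InverseDynamics()
z = torch.randn(1, 64)       # current latent state (e.g., from a visual encoder)
z_next = planner(z)          # forecast the next latent state
action = idm(z, z_next)      # extract the action to take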

Fahim Tajwar (@fahimtajwar10):

RL with verifiable reward has shown impressive results in improving LLM reasoning, but what can we do when we do not have ground truth answers?

Introducing Self-Rewarding Training (SRT): where language models provide their own reward for RL training!

🧵 1/n

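A hedged illustration of the core idea: with no ground-truth answers, the model's own outputs supply the reward, and one simple proxy is agreement with the majority answer across the model's samples. Whether this matches SRT's exact recipe is in the paper; the snippet is only a sketch of a self-generated reward signal.

# Sketch of a self-generated reward without ground truth: reward each sampled
# answer by agreement with the majority answer (not necessarily SRT's exact recipe).
from collections import Counter

def self_reward(answers: list[str]) -> list[float]:
    """Reward 1.0 for answers matching the majority answer, else 0.0."""
    majority, _ = Counter(answers).most_common(1)[0]
    return [1.0 if a == majority else 0.0 for a in answers]

# Example: 4 samples for one prompt; the pseudo-reward comes from the model's own consensus.
samples = ["42", "42", "41", "42"]
print(self_reward(samples))  # [1.0, 1.0, 0.0, 1.0] -> usable as an RL reward signal
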
Qinan Yu (@qinan_yu):

🎀 fine-grained, interpretable representation steering for LMs!
meet RePS — Reference-free Preference Steering!

1⃣ outperforms existing methods on 2B-27B LMs, nearly matching prompting
2⃣ supports both steering and suppression (beat system prompts!)
3⃣ jailbreak-proof

(1/n)

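For readers new to the term, a generic sketch of what "representation steering" means: adding a concept direction to a model's hidden activations at inference time, with the sign of the coefficient choosing between steering toward a behavior and suppressing it. This is not RePS itself; the tensor shapes and the concept vector below are assumptions.

# Generic representation-steering operation (illustration only, not RePS).
import torch

def steer_hidden_state(hidden: torch.Tensor, direction: torch.Tensor, alpha: float) -> torch.Tensor:
    """Shift hidden activations along a unit-norm concept direction.
    alpha > 0 steers toward the concept; alpha < 0 suppresses it."""
    direction = direction / direction.norm()
    return hidden + alpha * direction

# Example: steer a batch of hidden states along an assumed concept vector.
hidden = torch.randn(2, 5, 4096)   # (batch, seq_len, hidden_dim)
concept = torch.randn(4096)        # hypothetical learned steering direction
steered = steer_hidden_state(hidden, concept, alpha=8.0)
suppressed = steer_hidden_state(hidden, concept, alpha=-8.0)
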
Omar Shaikh (@oshaikh13):

What if LLMs could learn your habits and preferences well enough (across any context!) to anticipate your needs? In a new paper, we present the General User Model (GUM): a model of you built from just your everyday computer use. 🧵

Yijia Shao (@echoshao8899):

🚨 70 million US workers are about to face their biggest workplace transformation due to AI agents. But nobody asks them what they want.

While AI races to automate everything, we took a different approach: auditing what workers want vs. what AI can do across the US workforce.🧵

Shirley Wu (@shirleyyxwu):

Even the smartest LLMs can fail at basic multiturn communication

Ask for grocery help → without asking where you live 🤦‍♀️
Ask to write articles → assumes your preferences 🤷🏻‍♀️

⭐️CollabLLM (top 1%; oral at ICML) transforms LLMs from passive responders into active collaborators.

Yutong Zhang (@zhangyt0704):

AI companions aren’t science fiction anymore 🤖💬❤️
Thousands are turning to AI chatbots for emotional connection – finding comfort, sharing secrets, and even falling in love. But as AI companionship grows, the line between real and artificial relationships blurs.

📰 “Can A.I.

CLS (@chengleisi):

Are AI scientists already better than human researchers?

We recruited 43 PhD students to spend 3 months executing research ideas proposed by an LLM agent vs. human experts.

Main finding: LLM ideas result in worse projects than human ideas.