Eric Zelikman (@ericzelikman)'s Twitter Profile
Eric Zelikman

@ericzelikman

looking up @xAI // was phd-ing @stanford

ID: 137463715

Link: http://zelikman.me · Joined: 26-04-2010 20:49:42

838 Tweets

18.18K Followers

1.1K Following

Eric Zelikman (@ericzelikman):

cool pipeline for analyzing lots of screenshot data 🖼️ we need good tools to understand how we interact w/ complex algos

Stephanie Chan (@scychan_brains):

Check out our new work: Generalization from context often outperforms generalization from finetuning. And you might get the best of both worlds by spending extra compute at train-time.

Eric Zelikman (@ericzelikman):

seems like a big theme lately (e.g. also "RL for Reasoning w/ One Training Example") is that approaches don't get nearly enough bang for each training point's buck - cool!

John Yang (@jyangballin):

40% with just 1 try per task: SWE-agent-LM-32B is the new #1 open source model on SWE-bench Verified. We built it by synthesizing a ton of agentic training data from 100+ Python repos. Today we’re open-sourcing the toolkit that made it happen: SWE-smith.

Eric Zelikman (@ericzelikman):

fun note: heiner once described my env config as "the final boss of python venv issues" -- has been mostly issue free for a few months now, thanks mostly to uv 🤞
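
On that uv note, here is a minimal, hypothetical sketch (my own illustration, not Eric's actual config) of wiring uv into a bootstrap script. It assumes the uv binary is already on PATH and that the project has a pyproject.toml or setup.py in the working directory.

```python
import shutil
import subprocess
import sys


def run(cmd: list[str]) -> None:
    """Echo and run a command, failing loudly on error."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)


def main() -> None:
    # Assumption: uv is installed separately and available on PATH.
    if shutil.which("uv") is None:
        sys.exit("uv not found on PATH; install it first (see docs.astral.sh/uv).")
    # Create a project-local virtual environment on a pinned Python version.
    run(["uv", "venv", ".venv", "--python", "3.11"])
    # uv's pip interface picks up the .venv in the working directory by default;
    # install the current project in editable mode (assumes a pyproject.toml or setup.py).
    run(["uv", "pip", "install", "-e", "."])


if __name__ == "__main__":
    main()
```

The same two steps can equally be run straight from the shell (uv venv, then uv pip install -e .); the script just makes the environment setup reproducible in one place.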

noahdgoodman (@noahdgoodman):

It turns out that a lot of the most interesting behavior of LLMs can be explained without knowing anything about architecture or learning algorithms. Here we predict the rise (and fall) of in-context learning using hierarchical Bayesian methods.

Shirley Wu (@shirleyyxwu):

CollabLLM won #ICML2025 ✨Outstanding Paper Award along with 6 other works! icml.cc/virtual/2025/a… 🫂 Absolutely honored and grateful for coauthors, Microsoft Research, Stanford AI Lab, and friends who made this happen! 🗣️ Welcome people to our presentations about CollabLLM tomorrow

Kaiyu Yang (@kaiyuyang4):

🚀 Excited to share that the Workshop on Mathematical Reasoning and AI (MATH‑AI) will be at NeurIPS 2025! 📅 Dec 6 or 7 (TBD), 2025 🌴 San Diego, California

Shirley Wu (@shirleyyxwu):

✨ Optimas is fully open-sourced at github.com/snap-stanford/… 🏋🏻 Contribute by: 1) Submitting PRs for your own compound AI systems. 2) Extending the optimization infra. 3) Creating issues (we're happy to help optimize your systems, explore why they work or fail, and go from there)

Diyi Yang (@diyi_yang):

🚨 Yanzhe Zhang's new work uses a search-based framework that evolves both sides of the game to find privacy risks --- alternating between improving attacker & defender instructions in privacy-critical LLM agent interactions!!! Attack agents evolve from blunt data grabs to

🚨 <a href="/StevenyzZhang/">Yanzhe Zhang</a>'s new work uses a search-based framework that evolves both sides of the game to find privacy risks --- alternating between improving attacker &amp; defender instructions in privacy-critical LLM agent interactions!!!

Attack agents evolve from blunt data grabs to
Andrej Karpathy (@karpathy):

I am (slowly) re-reading the Tolkien legendarium (of which Lord of the Rings is a small part). The whole body of work is so incredible and there's nothing else like it... it dilutes other worlds of fiction. Wait - your story doesn't have a comprehensive history/mythology spanning