Ulyana Piterbarg (@ulyanapiterbarg) 's Twitter Profile
Ulyana Piterbarg

@ulyanapiterbarg

reasoning, decision-making, + open-endedness 🗺️ | PhDing at @CILVRatNYU & interning on Llama Team @AIatMeta, prev @MITCoCoSci @ClimateMachine

ID: 1316742412904136704

https://upiterbarg.github.io · Joined: 15-10-2020 14:07:25

124 Tweets

536 Followers

499 Following

clem 🤗 (@clementdelangue) 's Twitter Profile Photo

We need better agent evaluations! Glad to have collaborated with Meta Super Intelligence Lab to release Gaia2 and ARE!

GPT5 (high) from OpenAI is leading on execution, search, ambiguity, adaptability and noise.

Kimi-K2 from Kimi.ai is leading open weight.

Full
Gabriel Synnaeve (@syhw) 's Twitter Profile Photo

(🧵) Today, we release Meta Code World Model (CWM), a 32-billion-parameter dense LLM that enables novel research on improving code generation through agentic reasoning and planning with world models. ai.meta.com/research/publi…

Natasha Butt (@natashaeve4) 's Twitter Profile Photo

🔥New preprint: Soft Tokens, Hard Truths
Introduces the first scalable continuous-token RL method for LLMs - no reference CoTs needed; scales to hundreds of thought tokens. Best to train soft, infer hard! Pass@1 parity ⚖️, Pass@32 gains 📈 & better robustness 🛡️ vs. hard CoT
1/🧵
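
For readers new to the soft/hard distinction, here is a minimal sketch of one common reading (an assumption for illustration, not the preprint's actual method; all shapes and names below are made up): a "soft" thought token is the probability-weighted mixture of token embeddings, while "hard" inference commits to a single discrete token.

```python
import torch

# Illustration only -- not the method from the preprint above.
vocab_size, d_model = 32000, 4096
embedding = torch.nn.Embedding(vocab_size, d_model)
logits = torch.randn(vocab_size)  # next-token logits from the LM head

# "Soft" token: convex combination of all token embeddings, weighted by softmax.
probs = torch.softmax(logits, dim=-1)
soft_token = probs @ embedding.weight   # shape: (d_model,)

# "Hard" token: commit to one discrete token and embed it.
hard_id = torch.argmax(logits)
hard_token = embedding(hard_id)         # shape: (d_model,)
```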
Kenneth Stanley (@kenneth0stanley) 's Twitter Profile Photo

So much controversy triggered by claims about how humans learn. But the deeper question is why the way we learn works. We need to understand the why to know in what way “how” matters. Nature is inspiration for AI, not prescription. The crux of the whole debate is abstraction.

Tim Rocktäschel (@_rockt) 's Twitter Profile Photo

Proud to announce that Dr Laura Ruis defended her PhD thesis titled "Understanding and Evaluating Reasoning in Large Language Models" last week 🥳. Massive thanks to Noah Goodman and Emine Yilmaz for examining! As is customary, Laura received a personal mortarboard from
Ulyana Piterbarg (@ulyanapiterbarg) 's Twitter Profile Photo

I’m presenting our work on mid-training code LMs for more exploratory and human-like software development with (nearly) zero human supervision at #COLM today (4.30-6.30pm, poster #47)!

Our approach uses existing pretraining corpora, linters, and LMs as filterers + post-hoc
Saining Xie (@sainingxie) 's Twitter Profile Photo

three years ago, DiT replaced the legacy unet with a transformer-based denoising backbone. we knew the bulky VAEs would be the next to go -- we just waited until we could do it right.

today, we introduce Representation Autoencoders (RAE).

>> Retire VAEs. Use RAEs. 👇(1/n)
Damek (@damekdavis) 's Twitter Profile Photo

In this note w/ Ben Recht we look at RL problems with 0/1 rewards, showing that popular methods maximize the average (transformed) probability of correctly answering a prompt x:

max_θ 𝔼ₓ h(Prob(correct ∣ x; θ))

for certain functions h. Weirdly, h is arcsin(√t) in GRPO.
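
A quick numerical sketch of the transform named above (an illustration, not from the note itself): maximizing 𝔼ₓ h(Prob(correct ∣ x; θ)) scales each prompt's gradient by h'(p), and for h(t) = arcsin(√t) that weight is 1/(2√(p(1−p))).

```python
import numpy as np

# Illustration only: h(t) = arcsin(sqrt(t)) as stated above, and the per-prompt
# weight h'(p) = 1 / (2 * sqrt(p * (1 - p))) it implies via the chain rule.
def h(t):
    return np.arcsin(np.sqrt(t))

def h_prime(p):
    return 1.0 / (2.0 * np.sqrt(p * (1.0 - p)))

for p in [0.01, 0.1, 0.5, 0.9, 0.99]:
    print(f"p={p:<4}  h(p)={h(p):.3f}  weight h'(p)={h_prime(p):.2f}")
# The weight is smallest at p = 0.5 and grows near 0 and 1, i.e. prompts the
# policy almost never or almost always answers correctly get the most emphasis.
```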
Homanga Bharadhwaj (@mangahomanga) 's Twitter Profile Photo

I'll be joining the faculty of Johns Hopkins University late next year as a tenure-track assistant professor in JHU Computer Science.

Looking for PhD students to join me tackling fun problems in robot manipulation, learning from human data, understanding+predicting physical interactions, and beyond!
Alex Gu @ iclr (@minimario1729) 's Twitter Profile Photo

✂️Introducing ProofOptimizer: a training and inference recipe for proof shortening! 

😰AI-written formal proofs can be long and unreadable: Seed-Prover's proof of IMO '25 P1 is 16x longer in Lean vs. English. Our 7B shortens proofs generated by SoTA models by over 50%! 

🧵⬇️
Simon Guo 🦝 (@simonguozirui) 's Twitter Profile Photo

Wrote a 1-year retrospective with Alex L Zhang on KernelBench and the journey toward automated GPU/CUDA kernel generations!

Since my labmates (Anne Ouyang, Simran Arora, William Hu) and I first started working towards this vision around last year’s @GPU_mode hackathon, we have
Shane Gu (@shaneguml) 's Twitter Profile Photo

Hot take: DAgger (Ross 2011) should be the first paper you read to get into RL, instead of Sutton's book. Maybe also read scheduled sampling (Bengio 2015). And before RL, study supervised learning thoroughly.
Cong Lu (@cong_ml) 's Twitter Profile Photo

Are you interested in Open-Endedness and AI for Science? 🧪 I'm hiring a Student Researcher at Google DeepMind for a 6-month role. Join us to work on building agents capable of novel scientific discoveries! 🔬 Reach out if this sounds like you, and apply here 👇

Mikayel Samvelyan (@_samvelyan) 's Twitter Profile Photo

I’m hiring a Student Researcher at Google DeepMind. This research role centers on topics of open-ended self-improvement and discovery with LLM agents.

📍 Location: London
🗓️ Duration: 6 months, 100%
🚀 Start date: June or July 2026

Apply now using the links below👇
Cong Lu (@cong_ml) 's Twitter Profile Photo

So excited to share our work on the new frontier for generally capable agents in 3D worlds!! Some of the most exciting showcases of agentic capabilities in the blog post, including on entire unseen Genie generated worlds 😜