Stanford AI Lab (@stanfordailab)'s Twitter Profile
Stanford AI Lab

@stanfordailab

The Stanford Artificial Intelligence Laboratory (SAIL), a leading #AI lab since 1963. ⛵️🤖 Emmy-winning video: youtube.com/watch?v=Cn6nmW…

ID: 1059680847425527808

Link: https://ai.stanford.edu/ · Joined: 06-11-2018 05:36:16

3.3K Tweets

189,189 Followers

333 Following

Francis Engelmann (@francisengelman)'s Twitter Profile Photo

What makes a good 3D scene representation? Instead of meshes or Gaussians, we propose Superquadrics to decompose 3D scenes into extremely compact representations ➡️ check out our paper for exciting use-cases in robotics🤖 and GenAI🚀 super-dec.github.io w/ Elisabetta Fedele and Marc Pollefeys

Rylan Schaeffer (@rylanschaeffer)'s Twitter Profile Photo

A bit late to the party, but our paper on predictable inference-time / test-time scaling was accepted to #icml2025 🎉🎉🎉

TLDR: Best of N was shown to exhibit power-law (polynomial) scaling (left), but the math suggests one should expect exponential scaling (center). We show how to
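The exponential expectation has a one-line justification. This is a toy sketch of that argument, not the paper's analysis: if each of N independent samples fails with probability p, Best-of-N fails only when all N do, so the failure rate is p^N, exponential in N.

```python
def best_of_n_failure(p_fail: float, n: int) -> float:
    """P(all n i.i.d. samples fail) = p_fail ** n: exponential decay in n."""
    return p_fail ** n

# Doubling n squares the failure probability (exponential scaling),
# rather than the power-law (polynomial) trend reported empirically.
rates = {n: best_of_n_failure(0.5, n) for n in (1, 2, 4, 8)}
```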
Youssef Allouah (@ys_alh)'s Twitter Profile Photo

Excited our paper "Certified Unlearning for Neural Networks" is accepted at ICML 2025!

We introduce a method for provable machine unlearning: truly "forgetting" data without restrictive assumptions like convexity.

Paper: arxiv.org/abs/2506.06985
Code: github.com/stair-lab/cert…
Stanford AI Lab (@stanfordailab)'s Twitter Profile Photo

In Los Angeles for RSS 2025? 🤖🌴 Be sure to check out the great work by students from the Stanford AI Lab! ai.stanford.edu/blog/rss-2025/

Mayee Chen (@mayeechen)'s Twitter Profile Photo

LLMs often generate correct answers but struggle to select them. Weaver tackles this by combining many weak verifiers (reward models, LM judges) into a stronger signal using statistical tools from Weak Supervision—matching o3-mini-level accuracy with much cheaper models! 📊

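The core idea of combining weak verifiers can be illustrated with a classic weak-supervision aggregation rule. This is a minimal sketch under a naive-Bayes independence assumption, not Weaver's actual method: each verifier's vote is weighted by the log-odds of its estimated accuracy, so more reliable verifiers contribute more to the combined score.

```python
import math

def combine_verifiers(votes, accuracies):
    """Weighted log-odds combination of binary verifier verdicts.

    votes: list of 0/1 verdicts ("answer is correct") from weak verifiers.
    accuracies: estimated accuracy of each verifier (must be in (0.5, 1)).
    Returns P(answer is correct) under a naive-Bayes independence assumption.
    """
    log_odds = 0.0  # uniform prior over correct / incorrect
    for vote, acc in zip(votes, accuracies):
        weight = math.log(acc / (1 - acc))  # accurate verifiers weigh more
        log_odds += weight if vote == 1 else -weight
    return 1 / (1 + math.exp(-log_odds))

# Three weak verifiers (e.g. a reward model and two LM judges); two say "correct":
p_correct = combine_verifiers([1, 1, 0], [0.7, 0.6, 0.55])
```

In practice the accuracies themselves are unknown and must be estimated without labels, which is exactly what weak-supervision tools are for; the sketch above assumes they are given.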
Christopher Agia (@agiachris)'s Twitter Profile Photo

What makes data “good” for robot learning? We argue: it’s the data that drives closed-loop policy success! Introducing CUPID 💘, a method that curates demonstrations not by "quality" or appearance, but by how they influence policy behavior, using influence functions. (1/6)

Sanjana Srivastava (@sanjana__z)'s Twitter Profile Photo

🤖 Household robots are becoming physically viable. But interacting with people in the home requires handling unseen, unconstrained, dynamic preferences, not just a complex physical domain. We introduce ROSETTA: a method to generate reward for such preferences cheaply. 🧵⬇️

Anjiang Wei (@anjiangw)'s Twitter Profile Photo

We introduce CodeARC, a new benchmark for evaluating LLMs’ inductive reasoning. Agents must synthesize functions from I/O examples—no natural language, just reasoning. 📄 arxiv.org/pdf/2503.23145 💻 github.com/Anjiang-Wei/Co… 🌐 anjiang-wei.github.io/CodeARC-Websit… #LLM #Reasoning #LLM4Code #ARC

We introduce CodeARC, a new benchmark for evaluating LLMs’ inductive reasoning. Agents must synthesize functions from I/O examples—no natural language, just reasoning.
📄 arxiv.org/pdf/2503.23145
💻 github.com/Anjiang-Wei/Co…
🌐 anjiang-wei.github.io/CodeARC-Websit…
#LLM #Reasoning #LLM4Code #ARC
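The inductive-synthesis setup can be illustrated with a toy harness (names hypothetical, not the CodeARC API): a candidate program is accepted only if it reproduces every given I/O example, with no natural-language specification involved.

```python
def consistent_with_examples(candidate, examples):
    """Check whether a candidate function reproduces all I/O examples."""
    return all(candidate(x) == y for x, y in examples)

# Hidden target behavior: doubling. The agent sees only these pairs.
examples = [(1, 2), (3, 6), (5, 10)]

double = lambda x: 2 * x      # a correct hypothesis
increment = lambda x: x + 1   # fits the first pair only by accident, then fails
```

A real benchmark additionally needs held-out examples (or differential testing) to catch candidates that merely memorize the visible pairs; this sketch checks consistency only.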
Hong-Xing "Koven" Yu (@koven_yu)'s Twitter Profile Photo

#ICCV2025 🤩3D world generation is cool, but it is cooler to play with the worlds using 3D actions 👆💨, and see what happens! — Introducing *WonderPlay*: Now you can create dynamic 3D scenes that respond to your 3D actions from a single image! Web: kyleleey.github.io/WonderPlay/ 🧵1/7

Marcel Torné (@marceltornev)'s Twitter Profile Photo

Very happy to share that our work on learning long-history policies received the Best Paper Award from the Workshop on Learned Robot Representations at Robotics: Science and Systems! 🤖🥳

Check out our paper if you haven't already! long-context-dp.github.io

Thank you to all the organizers and
Stanford Engineering (@stanfordeng)'s Twitter Profile Photo

Stanford Engineering’s fourth decade, 1955-1964, was a period of transformation. New departments were formed, computing entered the classroom, the Stanford “Dish” was completed, and Stanford AI Lab began shaping the future of AI. engineering100.stanford.edu/stories/a-peri…

Stefano Ermon (@stefanoermon)'s Twitter Profile Photo

Huge milestone from the team! A blazing-fast diffusion LLM built for chat, delivering real-time performance at commercial scale. If you liked Mercury Coder for code, you'll love this for conversation.

Hancheng Cao (@caohancheng)'s Twitter Profile Photo

Check out our latest work analyzing 21 million human–LLM conversations from Microsoft Bing Copilot and WildChat to uncover prototypical ways people interact with AI in real-world settings! arxiv.org/pdf/2505.16023

Kanishk Gandhi (@gandhikanishk)'s Twitter Profile Photo

New Paper: Can we collect human chains of thought by asking people to think out loud? In our new paper we automate and study this protocol with 5,000 human reasoning traces from 640 people solving Countdown problems. 1/5

Ekdeep Singh Lubana (@ekdeepl)'s Twitter Profile Photo

🚨New paper! We know models learn distinct in-context learning strategies, but *why*? Why generalize instead of memorize to lower loss? And why is generalization transient? Our work explains this & *predicts Transformer behavior throughout training* without its weights! 🧵 1/

Stanford AI Lab (@stanfordailab)'s Twitter Profile Photo

Robot learning has largely focused on standard platforms—but can it embrace robots of all shapes and sizes? In Xiaomeng Xu's latest blog post, we show how data-driven methods bring unconventional robots to life, enabling capabilities that traditional designs and control can't
Annie Chen (@_anniechen_)'s Twitter Profile Photo

How should an RL agent leverage expert data to improve sample efficiency?

Imitation losses can overly constrain an RL policy.

In RL via Implicit Imitation Guidance, we show how to use expert data to guide more efficient *exploration*, avoiding pitfalls of imitation-augmented RL
Surya Ganguli (@suryaganguli)'s Twitter Profile Photo

A great Quanta Magazine article on our theory of creativity in convolutional diffusion models, led by Mason Kamb. See also our paper with new results in version 2: arxiv.org/abs/2412.20292, to be presented as an oral at ICML Conference #icml25. Thanks, Webb Wright!

Diyi Yang (@diyi_yang)'s Twitter Profile Photo

Our study led by CLS reveals an “ideation–execution gap” 😲 Ideas from LLMs may sound novel, but when experts spend 100+ hrs executing them, they flop: 💥 👉 human‑generated ideas outperform on novelty, excitement, effectiveness & overall quality!

Peng Qi (@qi2peng2)'s Twitter Profile Photo

Seven years ago, I co-led a paper called HotpotQA that has motivated and facilitated much #AI #Agents research since. Today, I'm asking that you stop using HotpotQA blindly for agents research in 2025 and beyond. In my new blog post, I revisit the brief history of