Scott Reed (@scott_e_reed)'s Twitter Profile

Scott Reed

@scott_e_reed

Research Scientist at NVIDIA working on generalist embodied agent research

ID: 4527505582

Link: http://scottreed.info · Joined: 18-12-2015 18:51:42

1.1K Tweets

16.16K Followers

516 Following

Dimitris Papailiopoulos (@dimitrispapail)'s Twitter Profile Photo

I tested phi-4-reasoning on my early-grad linear algebra (private) final exam at UW-Madison. It scored 100% on the first run. Two years ago I speculated that nothing useful could run locally anytime soon. I was wrong. Kids can now have a free, grad-level TA running on their PC

Xuxin Cheng (@xuxin_cheng)'s Twitter Profile Photo

Meet 𝐀𝐌𝐎 — our universal whole‑body controller that unleashes the 𝐟𝐮𝐥𝐥 kinematic workspace of humanoid robots in the physical world. AMO is a single policy trained with RL + Hybrid Mocap & Trajectory‑Opt. Accepted to #RSS2025. Try our open models & more 👉

Arthur Allshire (@arthurallshire)'s Twitter Profile Photo

Our policy is just joystick-conditioned: pull it back towards a chair, and it knows to sit; push it forward, and it knows to stand. We call this contextual humanoid control. Please see more results and the paper at videomimic.net
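
A minimal, hypothetical sketch of what "joystick conditioned" could look like in practice: the 2-D command vector is simply concatenated to the observations, and context (such as a chair behind the robot) must be inferred from the observations themselves. Names and dimensions below are illustrative assumptions, not the VideoMimic implementation.

```python
import torch
import torch.nn as nn

class JoystickConditionedPolicy(nn.Module):
    """Hypothetical sketch: a policy that takes proprioceptive observations
    plus a 2-D joystick command (forward/back, left/right) and outputs joint
    targets. The joystick supplies only high-level intent; anything like
    'there is a chair behind me' must come from the observations."""

    def __init__(self, obs_dim: int = 48, cmd_dim: int = 2, act_dim: int = 23):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + cmd_dim, 256), nn.ELU(),
            nn.Linear(256, 256), nn.ELU(),
            nn.Linear(256, act_dim),
        )

    def forward(self, obs: torch.Tensor, cmd: torch.Tensor) -> torch.Tensor:
        # Conditioning is just concatenation: the command is an extra input.
        return self.net(torch.cat([obs, cmd], dim=-1))

policy = JoystickConditionedPolicy()
obs, cmd = torch.randn(1, 48), torch.tensor([[-1.0, 0.0]])  # "pull it back"
action = policy(obs, cmd)  # near a chair, training should make this "sit"
```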

Jim Fan (@drjimfan)'s Twitter Profile Photo

The Physical Turing Test: your house is a complete mess after a Sunday hackathon. On Monday night, you come home to an immaculate living room and a candlelight dinner. And you couldn't tell whether a human or a machine had been there. Deceptively simple, insanely hard. It is the

Department of State (@statedept)'s Twitter Profile Photo

Today Deputy Secretary Christopher Landau welcomed the first group of Afrikaner refugees fleeing persecution from their native South Africa. We stand with these refugees, many of them farmers and former business owners, as they build a better future for themselves and their children here in the

Tong Zhang (@tongzha22057330)'s Twitter Profile Photo

🤖 Can a humanoid robot hold extreme single-leg poses like Bruce Lee's Kick or the Swallow Balance? 🤸 💥 YES. Meet HuB: Learning Extreme Humanoid Balance 🔗 Project website: hub-robot.github.io

ib (@indian_bronson)'s Twitter Profile Photo

In another 25 years, we’ll have fully corrected the mistakes the Boomers made. It’ll be like pulling up carpets and removing drop ceilings to see the beautiful hardwood or beams and plaster they covered up for some reason.

Scott Reed (@scott_e_reed)'s Twitter Profile Photo

Looks very promising! It is indeed unsatisfying that contemporary VLA policies tend to use a single step of context. I would also be curious whether this can improve language following and convergence speed compared to discrete-token VLA models, which seems to be a weak point of

Jesse Zhang (@jesse_y_zhang)'s Twitter Profile Photo

Given only successful trajectories, how do we learn to reward unsuccessful rollouts and generalize across tasks? We train with video rewinding, instruction augmentation, and OXE data! For rewinding, we randomly reverse videos to learn to predict decreasing rewards. (3/N)
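
A rough sketch of the rewinding idea as stated above: reverse a successful clip so its progress-style reward targets decrease instead of increase, giving the reward model failure-like supervision without any real failure data. The function name, tensor shapes, and linear reward schedule are illustrative assumptions, not the paper's actual pipeline.

```python
import torch

def rewind_augment(frames: torch.Tensor, p_rewind: float = 0.5):
    """Hypothetical sketch: `frames` is a successful clip of shape
    (T, C, H, W); progress-style reward targets rise toward the goal.
    Reversing the clip turns it into a 'moving away from success'
    example with decreasing targets."""
    T = frames.shape[0]
    rewards = torch.linspace(0.0, 1.0, T)   # monotone progress toward success
    if torch.rand(()) < p_rewind:
        frames = frames.flip(0)             # play the video backwards
        rewards = rewards.flip(0)           # targets now decrease over time
    return frames, rewards

clip = torch.randn(16, 3, 64, 64)           # dummy 16-frame clip
aug_clip, targets = rewind_augment(clip)    # feed to the reward model
```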

Joel Jang (@jang_yoel)'s Twitter Profile Photo

Introducing 𝐃𝐫𝐞𝐚𝐦𝐆𝐞𝐧! We got humanoid robots to perform totally new 𝑣𝑒𝑟𝑏𝑠 in new environments through video world models. We believe video world models will solve the data problem in robotics, shifting the paradigm from scaling human hours to scaling GPU hours. Quick 🧵

Edward Johns (@ed__johns)'s Twitter Profile Photo

A few years ago, humanoids with legs walking around the ICRA exhibition were the new thing. This time, it’s the year of the hands! Tons and tons of humanoid hands! #ICRA2025

Brendan O'Donoghue (@bodonoghue85)'s Twitter Profile Photo

One cool feature that diffusion models for images have is the ability to do 'inpainting', where the user can mask out some part of an image and the diffusion model can fill it in based on a prompt. Turns out something very similar can be done with text diffusion!
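
A rough sketch of how such text-diffusion inpainting can work: clamp the user's tokens at every denoising step and iteratively fill only the masked span, most-confident positions first. The toy denoiser and every name below are placeholders standing in for a trained model, not the method in the linked work.

```python
import torch

VOCAB, MASK = 1000, 0  # toy vocabulary; token 0 plays the role of [MASK]

def toy_denoiser(tokens: torch.Tensor) -> torch.Tensor:
    """Stand-in for a trained text-diffusion model: logits per position.
    A real model would condition on the surrounding (unmasked) text."""
    return torch.randn(tokens.shape[0], VOCAB)

def inpaint(tokens: torch.Tensor, hole: torch.Tensor, steps: int = 8) -> torch.Tensor:
    x = tokens.clone()
    x[hole] = MASK
    for _ in range(steps):
        logits = toy_denoiser(x)
        conf, pred = logits[:, 1:].softmax(-1).max(-1)
        pred = pred + 1                      # never predict the MASK token itself
        still_masked = hole & (x == MASK)
        if not still_masked.any():
            break
        conf[~still_masked] = -1.0           # only compete over unfilled holes
        k = min(int(still_masked.sum()),
                -(-int(hole.sum()) // steps)) # ceil(|hole| / steps) per step
        idx = conf.topk(k).indices
        x[idx] = pred[idx]                   # commit the most confident fills
        x[~hole] = tokens[~hole]             # clamp the user's text every step
    return x

text = torch.randint(1, VOCAB, (12,))
hole = torch.zeros(12, dtype=torch.bool)
hole[4:8] = True                             # mask out the middle span
filled = inpaint(text, hole)                 # context intact, hole filled in
```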

Remi Cadene (@remicadene)'s Twitter Profile Photo

Meet HopeJr, a full humanoid robot lowering the barrier to entry! Capable of walking and manipulating many objects, open-source, and under $3000 🤯 Designed by Rob Knight and Hugging Face 👇

Kevin Frans (@kvfrans)'s Twitter Profile Photo

The resulting framework is simple: train an optimality-conditioned diffusion policy, where optimality should be a monotonic function of advantage. At test time, we can dynamically interpolate between w=0 (the base policy) and w=infinity (the greedy policy).
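
This test-time interpolation reads like classifier-free guidance applied to the optimality flag; here is a hedged sketch of what the blend might look like, with a toy denoiser standing in for the real network (the signature is an assumption, not the paper's API):

```python
import torch

def guided_eps(model, x_t, t, w: float) -> torch.Tensor:
    """Classifier-free-guidance-style blend: query the denoiser with the
    optimality flag on and off, then extrapolate. w=0 recovers the base
    policy; pushing w toward infinity approaches the greedy policy."""
    eps_base = model(x_t, t, optimal=False)  # unconditional / base branch
    eps_opt = model(x_t, t, optimal=True)    # optimality-conditioned branch
    return eps_base + w * (eps_opt - eps_base)

def toy_model(x_t, t, optimal: bool):
    """Dummy denoiser so the sketch runs end to end."""
    return x_t * (0.1 if optimal else 0.2)

x_t = torch.randn(1, 7)                      # noisy action sample
eps = guided_eps(toy_model, x_t, t=10, w=2.0)
```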

Ruijie Zheng (@ruijie_zheng12)'s Twitter Profile Photo

How does FLARE work? FLARE adds a few learnable "future tokens" to the policy denoising network alongside state and action tokens. These latent tokens will then be used to predict observation latents H steps ahead (H = action chunk size), enabling implicit future reasoning!
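
A minimal, speculative sketch of that mechanism: learnable future tokens appended to the state/action token sequence, with their transformer outputs projected to predict observation latents H steps ahead. Module names and dimensions are illustrative, not FLARE's actual architecture.

```python
import torch
import torch.nn as nn

class FutureTokenHead(nn.Module):
    """Hypothetical sketch: append n_future learnable tokens to the
    denoising network's input and read them back out as predictions of
    observation latents H steps ahead (H = action chunk size)."""

    def __init__(self, d_model: int = 256, n_future: int = 4,
                 latent_dim: int = 128, H: int = 16):
        super().__init__()
        self.H = H
        self.future_tokens = nn.Parameter(torch.randn(n_future, d_model) * 0.02)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.to_latent = nn.Linear(d_model, latent_dim)

    def forward(self, state_tokens: torch.Tensor, action_tokens: torch.Tensor):
        B = state_tokens.shape[0]
        fut = self.future_tokens.unsqueeze(0).expand(B, -1, -1)
        seq = torch.cat([state_tokens, action_tokens, fut], dim=1)
        out = self.encoder(seq)            # future tokens attend to state/action
        fut_out = out[:, -fut.shape[1]:]   # slots holding the future tokens
        return self.to_latent(fut_out)     # regress toward encode(obs_{t+H})

head = FutureTokenHead()
pred = head(torch.randn(2, 8, 256), torch.randn(2, 16, 256))  # -> (2, 4, 128)
```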

Tairan He (@tairanhe99)'s Twitter Profile Photo

Cool and solid work. The Vision Pro humanoid teleop setup is what we did with OmniH2O (omni.human2humanoid.com), but this work adds MoE distillation and better lidar odometry on the G1 robot. Excited to see people pushing the limits of humanoid whole-body teleop!