Haoyu Xiong (@haoyu_xiong_) 's Twitter Profile
Haoyu Xiong

@haoyu_xiong_

Currently @Stanford. Incoming PhD @MIT EECS. Opinions my own. #robotlearning

ID: 1159552879453020160

haoyu-x.github.io · Joined 08-08-2019 19:52:04

868 Tweets

2.2K Followers

2.2K Following

Mihir Prabhudesai (@mihirp98) 's Twitter Profile Photo

🚨 The era of infinite internet data is ending, so we ask:

👉 What's the right generative modelling objective when data, not compute, is the bottleneck?

TL;DR:

▶️ Compute-constrained? Train autoregressive models.

▶️ Data-constrained? Train diffusion models.

Get ready for 🤿 1/n
Mihir Prabhudesai (@mihirp98) 's Twitter Profile Photo

Extrapolating this trend to robotics, I believe that if you are doing sim2real you should prefer autoregressive > diffusion (compute bottleneck), but if you are doing real-world training then autoregressive < diffusion (data bottleneck). We don't empirically validate this for
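The decision rule in this thread can be sketched as a toy helper. Everything here (the function name, the epoch-count threshold of 1.0) is a hypothetical illustration of the TL;DR above, not something taken from the paper:

```python
def pick_objective(unique_tokens: float, compute_budget_tokens: float) -> str:
    """Toy version of the thread's TL;DR.

    If your compute budget would make you repeat the dataset many times
    (epochs > 1), data is the bottleneck -> diffusion. If you cannot even
    finish one pass, compute is the bottleneck -> autoregressive.
    The threshold of 1.0 is an arbitrary illustrative choice.
    """
    epochs = compute_budget_tokens / unique_tokens
    return "diffusion" if epochs > 1.0 else "autoregressive"


# Sim2real: the simulator generates near-unlimited data, so compute binds first.
print(pick_objective(unique_tokens=1e12, compute_budget_tokens=1e11))  # autoregressive
# Real-world robot training: demos are scarce, so data binds first.
print(pick_objective(unique_tokens=1e8, compute_budget_tokens=1e11))   # diffusion
```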

Haoyu Xiong (@haoyu_xiong_) 's Twitter Profile Photo

Just reread the tidybot2.github.io docs today. What an incredible tutorial for building a robot system! Honestly, you could set up an entire new robot lab just by following it; Jimmy Wu even gives you the link to the screwdriver he used 😂

Tim Schneider (@timschneider94) 's Twitter Profile Photo

Pushing for #icra but still missing real robot experiments? 😰
Skip the ROS headaches and get your Franka robot running in minutes with franky! 🦾
Super beginner-friendly, Pythonic, and fast to set up.
🔗 github.com/TimSchneider42…
Intelligent Autonomous Systems Group Jan Peters

🧵👇
MetaStoneAI (@themetastoneai) 's Twitter Profile Photo

🚀 Introducing XBai o4: a milestone in our 4th-generation open-source technology based on parallel test-time scaling!
In its medium mode, XBai o4 now fully outperforms OpenAI o3-mini. 📈

🔗 Open-source weights: huggingface.co/MetaStoneTec/X… ✅
GitHub link: github.com/MetaStone-AI/X…
Shawn Shen (@shawn_shen_oix) 's Twitter Profile Photo

Now, with our internet clip search tool, curating a training dataset no longer takes weeks. Within seconds, you get the exact clips inside the video, perfectly labeled. E.g., training a world model needs PoV data; here is how:

Yanjie Ze (@zeyanjie) 's Twitter Profile Photo

Excited to open-source GMR: General Motion Retargeting. Real-time human-to-humanoid retargeting on your laptop. Supports diverse motion formats & robots. Unlock whole-body humanoid teleoperation (e.g., TWIST). video with 🔊

Google DeepMind (@googledeepmind) 's Twitter Profile Photo

What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵

Lili (@lchen915) 's Twitter Profile Photo

Self-Questioning Language Models: LLMs that learn to generate their own questions and answers via asymmetric self-play RL.

There is no external training data – the only input is a single prompt specifying the topic.
Skywork (@skywork_ai) 's Twitter Profile Photo

Matrix-Game 2.0: the FIRST open-source, real-time, long-sequence interactive world model. Last week, DeepMind's Genie 3 shook the AI world with real-time interactive world models. But... it wasn't open-sourced. Today, Matrix-Game 2.0 changed the game. 🚀 25 FPS. Minutes-long

Jiafei Duan (@djiafei) 's Twitter Profile Photo

Reasoning is central to purposeful action. Today we introduce MolmoAct — a fully open Action Reasoning Model (ARM) for robotics. Grounded in large-scale pre-training with action reasoning data, every predicted action is interpretable and user-steerable via visual trace. We are

kaan doğrusöz (@kaandogrusoz) 's Twitter Profile Photo

Demos demos demos! Reminiscing on how we got started when we publicly demoed our first VLA autonomously folding t-shirts at @YCombinator's demo day, with the first version of Isaac that we built in our living room with Evan Wineland. We've learned a lot since then.

Zhanyi S (@s_zhanyi) 's Twitter Profile Photo

How to prevent behavior cloning policies from drifting OOD on long horizon manipulation tasks? Check out Latent Policy Barrier (LPB), a plug-and-play test-time optimization method that keeps BC policies in-distribution with no extra demo or fine-tuning: project-latentpolicybarrier.github.io

Haoyu Xiong (@haoyu_xiong_) 's Twitter Profile Photo

I’m happy to share that I’ve recently moved to Boston to start my PhD at MIT CSAIL! Excited to hack on some cool robots! Let me know if you are around; let’s chat about AI, robotics, food in Boston, or anything else!

Bonnie Li (@bonniesjli) 's Twitter Profile Photo

We can now train AI inside the mind of another AI. 🤯 🌍 Our world model, Genie 3, imagines and generates new worlds on the fly. 🤖 Our embodied agent, Sima, is dropped in and learns to navigate them autonomously. The entire loop, from the environment to the action, is generated

Ken Liu (@kenziyuliu) 's Twitter Profile Photo

New paper! We explore a radical paradigm for AI evals: assessing LLMs on *unsolved* questions.

Instead of contrived exams where progress ≠ value, we eval LLMs on organic, unsolved problems via reference-free LLM validation & community verification. LLMs solved ~10/500 so far:
Binghao Huang (@binghao_huang) 's Twitter Profile Photo

How does high-fidelity tactile simulation help robots nail the last millimeter? We’re releasing VT-Refine, accepted to CoRL: a real-to-sim-to-real visuo-tactile policy using a GPU-parallel tactile sim for our piezoresistive skin FlexiTac. Then fine-tuning a diffusion policy with

Haoyu Xiong (@haoyu_xiong_) 's Twitter Profile Photo

Working with Homanga has been a wonderful experience; my first paper was published with him a few years ago. Don't miss the chance to apply to Homanga's lab if you're interested in robot learning!