Lili Yu (Neurips24) (@liliyu_lili)'s Twitter Profile
Lili Yu (Neurips24)

@liliyu_lili

AI Research Scientist @AIatMeta (FAIR)
Multimodal: Megabyte, Chameleon, Transfusion
PhD @MIT

ID: 2994229563

Joined: 23-01-2015 15:07:11

96 Tweets

1.1K Followers

310 Following

Physical Intelligence (@physical_int)'s Twitter Profile Photo

Many of you asked for code & weights for π₀. We are happy to announce that we are releasing π₀ and pre-trained checkpoints in our new openpi repository! We tested the model on a few public robots, and we include code for you to fine-tune it yourself.
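A minimal inference sketch in the spirit of the openpi README, for anyone who wants to try the release. The module paths, config name, checkpoint URI, and observation keys below are recalled from memory and should be treated as assumptions; the repository is authoritative:

```python
import numpy as np

# These imports/names follow my recollection of the openpi README and
# may have drifted; check the repo before relying on them.
from openpi.training import config as openpi_config
from openpi.policies import policy_config
from openpi.shared import download

# Pick a config and download its matching pretrained checkpoint
# (config and asset names are assumptions).
cfg = openpi_config.get_config("pi0_fast_droid")
ckpt_dir = download.maybe_download("s3://openpi-assets/checkpoints/pi0_fast_droid")
policy = policy_config.create_trained_policy(cfg, ckpt_dir)

# Dummy observation; the key schema is checkpoint-specific (these keys
# follow the DROID example, again from memory).
example = {
    "observation/exterior_image_1_left": np.zeros((224, 224, 3), dtype=np.uint8),
    "observation/wrist_image_left": np.zeros((224, 224, 3), dtype=np.uint8),
    "observation/joint_position": np.zeros(7, dtype=np.float32),
    "observation/gripper_position": np.zeros(1, dtype=np.float32),
    "prompt": "pick up the fork",
}
action_chunk = policy.infer(example)["actions"]  # one chunk of future actions
```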

Beidi Chen (@beidichen)'s Twitter Profile Photo

⏰📢 After years of working on long-context efficiency, I’ve started to doubt whether it’s truly necessary (many of you have probably noticed the declining interest in long-context LLMs). Despite strong models like Gemini, short-context + retrieval often does the trick: faster, cheaper, and…
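To make the claim concrete, a toy sketch of the short-context + retrieval pattern: chunk the long source, embed the chunks, retrieve the top-k for a query, and send only those to the model. Bag-of-words vectors stand in for a learned embedding model, and every name here is illustrative:

```python
import numpy as np
from collections import Counter

def embed(text, vocab):
    """Toy bag-of-words embedding, L2-normalized."""
    counts = Counter(text.lower().split())
    v = np.array([counts[w] for w in vocab], dtype=float)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

document = "long source text " * 200          # placeholder corpus
chunks = [document[i:i + 500] for i in range(0, len(document), 500)]
vocab = sorted({w for c in chunks for w in c.lower().split()})

chunk_vecs = np.stack([embed(c, vocab) for c in chunks])
query = "what does the source say about efficiency?"
scores = chunk_vecs @ embed(query, vocab)      # cosine similarity
top_k = [chunks[i] for i in np.argsort(scores)[::-1][:3]]

# A short prompt: only the retrieved chunks plus the question,
# instead of the entire long context.
prompt = "\n\n".join(top_k) + "\n\nQuestion: " + query
```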

Lucas Beyer (bl16) (@giffmana)'s Twitter Profile Photo

Yoooo I didn't expect this amount of generalization on robots to happen this soon! Pi team is on fire, had to update my timelines a bit :) Also, I spotted what seems to be Lili Yu (ICLR2025)'s first public appearance as part of Physical Intelligence, big congrats to both.

Lili Yu (Neurips24) (@liliyu_lili)'s Twitter Profile Photo

Flying to Singapore 🇸🇬 for ICLR 2025 this week! Looking forward to catching up with friends and discussing multimodal modeling or our new π0.5 physicalintelligence.company/blog/pi05 (and other vision-language-action models).

Lili Yu (Neurips24) (@liliyu_lili)'s Twitter Profile Photo

I am presenting Transfusion and Mixture-of-Mamba at ICLR 2025.
Transfusion poster: Friday, April 25, 2025, 10:00 AM - 12:30 PM
Transfusion oral: Friday, April 25, 2025, 3:54 PM - 4:06 PM, Garnet 212-213
Mixture-of-Mamba: Sunday, April 27, 2025, 11:10 AM - 12:10 PM

Jiuhai Chen (@jiuhaic)'s Twitter Profile Photo

🚀 Introducing BLIP3-o: A Family of Fully Open Unified Multimodal Models arxiv.org/pdf/2505.09568
🔓 Attempting to unlock GPT-4o’s image generation.
Everything is open-sourced, including the 25 million pre-training examples!

Lili Yu (Neurips24) (@liliyu_lili)'s Twitter Profile Photo

Check out π0.5 + KI: a new VLA that trains fast, runs fast & generalizes better. A good VLA needs to learn motor control quickly without forgetting how to think and reason with common language and visual knowledge. Our new recipe combines:
• Discrete tokens (π0-FAST) for rapid motor learning
• …
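One hedged reading of that recipe as training code: cross-entropy on discrete FAST-style action tokens trains the backbone's motor control, while a continuous action expert is trained with a flow-matching loss on stop-gradiented backbone features so it cannot disturb the backbone's language/visual knowledge (the "knowledge insulation" part). All module and batch names are hypothetical placeholders, not Physical Intelligence's implementation:

```python
import torch
import torch.nn.functional as F

def vla_losses(backbone, action_expert, batch):
    # backbone returns sequence features plus logits over action tokens.
    feats, token_logits = backbone(batch["obs"], batch["text"])

    # 1) Discrete path: next-token prediction on tokenized action chunks
    #    (the pi0-FAST-style rapid motor-learning signal).
    ce = F.cross_entropy(
        token_logits.flatten(0, 1), batch["action_tokens"].flatten()
    )

    # 2) Continuous path: rectified-flow matching on the action chunk.
    #    feats.detach() is the insulation: expert gradients never reach
    #    the backbone.
    a1 = batch["actions"]                              # (B, horizon, dim)
    a0 = torch.randn_like(a1)                          # noise endpoint
    t = torch.rand(a1.shape[0], 1, 1, device=a1.device)
    a_t = (1 - t) * a0 + t * a1                        # interpolation path
    v_target = a1 - a0                                 # constant velocity
    v_pred = action_expert(feats.detach(), a_t, t)
    fm = F.mse_loss(v_pred, v_target)

    return ce + fm
```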

Victoria X Lin (@victorialinml)'s Twitter Profile Photo

Let's talk about Mixture-of-Transformers (MoT) and heterogeneous omni-model training. 1. Inspired by prior architectures with modality-specific parameters (such as Flamingo, CogVLM, BEIT-3, and MoMA), MoT (arxiv.org/abs/2411.04996) pushes this idea further by using…
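A compact sketch of that idea: within one transformer block, duplicate the FFN and norms per modality while self-attention still runs globally over the mixed-modality sequence. The layout is illustrative (the paper unties other non-embedding parameters too), not the paper's code:

```python
import torch
import torch.nn as nn

class MoTBlock(nn.Module):
    """Global attention, modality-specific FFN and norms."""

    def __init__(self, d_model, n_heads, n_modalities):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_modalities)
        ])
        self.norm1 = nn.ModuleList([nn.LayerNorm(d_model) for _ in range(n_modalities)])
        self.norm2 = nn.ModuleList([nn.LayerNorm(d_model) for _ in range(n_modalities)])

    @staticmethod
    def _route(modules, x, modality_ids):
        # Apply each modality's module to that modality's tokens only.
        out = torch.zeros_like(x)
        for m, mod in enumerate(modules):
            mask = modality_ids == m
            if mask.any():
                out[mask] = mod(x[mask])
        return out

    def forward(self, x, modality_ids):
        h = self._route(self.norm1, x, modality_ids)
        attn_out, _ = self.attn(h, h, h)   # attention over the full sequence
        x = x + attn_out
        h = self._route(self.norm2, x, modality_ids)
        return x + self._route(self.ffn, h, modality_ids)

block = MoTBlock(d_model=64, n_heads=4, n_modalities=2)
tokens = torch.randn(2, 10, 64)               # mixed text/image sequence
modality_ids = torch.randint(0, 2, (2, 10))   # 0 = text token, 1 = image token
out = block(tokens, modality_ids)             # (2, 10, 64)
```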

Physical Intelligence (@physical_int)'s Twitter Profile Photo

Our models need to run in real time on real robots, but inference with big VLAs takes a long time. We developed Real-Time Action Chunking (RTC) to enable real-time inference with flow matching for the π0 and π0.5 VLAs! More in the thread👇
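A toy rendering of the scheduling problem RTC addresses: one inference call costs several control steps, so by the time a new chunk arrives, its first few actions have effectively already been executed. The sketch below freezes that committed overlap with a hard copy; RTC itself enforces consistency softly, via inpainting inside the flow-matching sampler, and runs inference asynchronously. All names here are made up:

```python
import numpy as np

CHUNK = 8       # actions produced per inference call
LATENCY = 3     # control steps one inference call takes
ACTION_DIM = 7

def infer_chunk(obs):
    """Stand-in for the slow VLA forward pass."""
    return np.random.randn(CHUNK, ACTION_DIM)

def control_loop(n_steps, get_obs, execute):
    chunk, t = infer_chunk(get_obs()), 0
    for _ in range(n_steps):
        execute(chunk[t])
        t += 1
        if t == CHUNK - LATENCY:
            # Start the next inference now (imagine it running in the
            # background while the remaining LATENCY actions execute).
            # Those actions are committed, so the new chunk is forced
            # to agree with them over the overlap.
            committed = chunk[t:t + LATENCY]
            nxt = infer_chunk(get_obs())
            nxt[:LATENCY] = committed   # hard copy stands in for inpainting
            chunk, t = nxt, 0

control_loop(20, get_obs=lambda: None, execute=lambda a: None)
```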

Karl Pertsch (@karlpertsch)'s Twitter Profile Photo

We’re releasing the RoboArena today! 🤖🦾 Fair & scalable evaluation is a major bottleneck for research on generalist policies. We’re hoping that RoboArena can help! We provide data, model code & sim evals for debugging! Submit your policies today and join the leaderboard! :) 🧵
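Arena-style leaderboards like this are commonly built by fitting a pairwise-comparison model to the A/B results; below is a self-contained Bradley-Terry sketch using the standard MM updates. RoboArena's actual aggregation may differ, so treat this as background, not its method:

```python
import numpy as np

def bradley_terry(n_policies, comparisons, iters=200):
    """Fit policy strengths from (winner, loser) pairs via MM updates."""
    wins = np.zeros((n_policies, n_policies))
    for w, l in comparisons:
        wins[w, l] += 1
    strength = np.ones(n_policies)
    for _ in range(iters):
        total_wins = wins.sum(axis=1)
        denom = np.zeros(n_policies)
        for i in range(n_policies):
            for j in range(n_policies):
                games = wins[i, j] + wins[j, i]
                if i != j and games > 0:
                    denom[i] += games / (strength[i] + strength[j])
        strength = total_wins / np.maximum(denom, 1e-12)
        strength /= strength.sum()          # fix the arbitrary scale
    return strength

# Four head-to-head results between three policies (indices 0, 1, 2).
scores = bradley_terry(3, [(0, 1), (0, 2), (1, 2), (0, 1)])
print(np.argsort(scores)[::-1])             # leaderboard, best first
```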