Zhuo Xu (@drzhuoxu)'s Twitter Profile
Zhuo Xu

@drzhuoxu

Research Scientist @GoogleDeepMind, PhD @Berkeley, previously @Tsinghua_Uni.

ID: 1726882175386345472

https://drzhuoxu.github.io/

21-11-2023 08:36:24

32 Tweets

155 Followers

128 Following

Wenhao Yu (@stacormed)

“A picture is worth a thousand words”, can VLMs also read robot actions better in images than in words? We introduce PIVOT to explore this idea and enable a VLM to zero-shot “find a place to sit down and do writing” by navigating a robot to the room with the light on :)

Zhuo Xu (@drzhuoxu)

Our interesting findings from exploring the sampling based planning in the era of large VLMs — pivot-prompt.github.io

Jeff Dean (@jeffdean)

Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length

Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long

Cheng Chi (@chichengcc)

Can we collect robot data without any robots?

Introducing Universal Manipulation Interface (UMI)

An open-source $400 system from Stanford University designed to democratize robot data collection

0 teleop -> autonomously wash dishes (precise), toss (dynamic), and fold clothes (bimanual)

Toru (@toruo_o)

Achieving bimanual dexterity with RL + Sim2Real! toruowo.github.io/bimanual-twist/ TLDR - We train two robot hands to twist bottle lids using deep RL followed by sim-to-real. A single policy trained with simple simulated bottles can generalize to drastically different real-world objects.

Tony Z. Zhao (@tonyzzhao)

Introducing 𝐀𝐋𝐎𝐇𝐀 𝐔𝐧𝐥𝐞𝐚𝐬𝐡𝐞𝐝 🌋 - Pushing the boundaries of dexterity with low-cost robots and AI. Google DeepMind Finally got to share some videos after a few months. Robots are fully autonomous filmed in one continuous shot. Enjoy!

Ayzaan Wahid (@ayzwah)

For the past year we've been working on ALOHA Unleashed 🌋 @GoogleDeepmind - pushing the scale and dexterity of tasks on our ALOHA 2 fleet. Here is a thread with some of the coolest videos! The first task is hanging a shirt on a hanger (autonomous 1x)

Lucas Beyer (bl16) (@giffmana)

✨PaliGemma report will hit arxiv tonight. We tried hard to make it interesting, and not "here model. sota results. kthxbye." So here's some of the many interesting ablations we did, check the paper tomorrow for more! 🧶

Zipeng Fu (@zipengfu)

Introduce Mobility VLA - Google's foundation model for navigation - started as my intern project:
- Gemini 1.5 Pro for high-level image & text understanding
- topological graphs for low-level navigation
- supports multimodal instructions

co-lead Zhuo Xu, Lewis Chiang, Jie Tan

lmarena.ai (formerly lmsys.org) (@lmarena_ai)

Exciting News from Chatbot Arena! Google DeepMind's new Gemini 1.5 Pro (Experimental 0801) has been tested in Arena for the past week, gathering over 12K community votes. For the first time, Google Gemini has claimed the #1 spot, surpassing GPT-4o/Claude-3.5 with an impressive

Demis Hassabis (@demishassabis)

Never seen a competitive leaderboard that I didn't like 😀 Congrats to the Gemini team on ranking no.1 🏆 with our latest improved Gemini 1.5 Pro developer preview model, which you can try on AI studio now!

Robotics Papers (@oww)

Imagined Potential Games: A Framework for Simulating, Learning and Evaluating Interactive Behaviors. arxiv.org/abs/2411.03669

Jason Ma (@jasonma2020)

Excited to finally share Generative Value Learning (GVL), my Google DeepMind project on extracting universal value functions from long-context VLMs via in-context learning! We discovered a simple method to generate zero-shot and few-shot values for 300+ robot tasks and 50+

Zhuo Xu (@drzhuoxu)

Excited to announce the What Bimanuals Can Do (WBCD) competition at ICRA 2025! We carefully designed challenging and commercially valuable tasks, will provide 15 state of the art bimanual robots and $200k total robot/cash awards! Visit the website to learn more and register ASAP!

FrodoBots (@frodobots)

Announcing the 2nd Earth Rover Challenge: an "AI vs Gamers" global navigation competition (to be held #ICRA2025 in May in Atlanta)

Co-organized with researchers from Deepmind, Meta & academia

A thread 🧵 - 1/n