Tai Wang (@wangtai97)'s Twitter Profile
Tai Wang

@wangtai97

Research Scientist at Shanghai AI Lab. Embodied AI & Spatial Intelligence.

ID: 1316661187661111297

Link: https://tai-wang.github.io · Joined: 15-10-2020 08:44:33

42 Tweets

693 Followers

448 Following

OpenMMLab (@openmmlab)'s Twitter Profile Photo

🥳MMDetection3D Release v1.2.0! - Support for the 3D Occupancy Prediction task and TPVFormer, a camera-only LiDAR semantic segmentation model, plus LiDAR semantic segmentation on the nuScenes dataset. - Support for BEVFusion, a LiDAR and camera fusion model. #cv #AI

OpenDriveLab (@opendrivelab)'s Twitter Profile Photo

🔥Introducing the Largest 3D Occupancy Prediction Benchmark in autonomous driving: medium.com/@opendrivelab/… Check out our GitHub (github.com/OpenDriveLab/O…) for more details on the dataset, leaderboard, and upcoming challenge in 2024 (opendrivelab.com/AD24Challenge.…).

Runsen Xu (@runsen_xu)'s Twitter Profile Photo

🌪️ Despite the ongoing super Typhoon Saola in Hong Kong, I'm excited to introduce PointLLM! 🌈🔍 It's a multi-modal large language model that understands point clouds. 1/4🧵 🔗 Demo: http://101.230.144.196 📄 Paper: arxiv.org/abs/2308.16911 💻 Code: github.com/OpenRobotLab/P…

Jiangmiao Pang (@pangjiangmiao)'s Twitter Profile Photo

🔥Unified Human-Scene Interaction via Prompted Chain-of-Contacts🔥 #UniHSI Code: github.com/OpenRobotLab/U… Project Page: xizaoqu.github.io/unihsi/ ArXiv: arxiv.org/abs/2309.07918 Hugging Face: huggingface.co/papers/2309.07…

Jiangmiao Pang (@pangjiangmiao)'s Twitter Profile Photo

Seeking a policy that can empower your robot to traverse any terrain? Our Hybrid Internal Model achieves this easily at a cost of only 1 hour of training. Key insight: estimate environmental dynamics from the robot's response. ArXiv: arxiv.org/abs/2312.11460 Code: github.com/OpenRobotLab/H…

Tai Wang (@wangtai97)'s Twitter Profile Photo

Welcome to try the embodied AI track! We make a preliminary attempt with the multi-view 3D visual grounding benchmark. Cannot wait to see innovative submissions beating our baselines and topping our leaderboard! Thanks to the great efforts from OpenDriveLab & other organizers!

AK (@_akhaliq)'s Twitter Profile Photo

Learning H-Infinity Locomotion Control

Stable locomotion in precipitous environments is an essential capability of quadruped robots, demanding the ability to resist various external disturbances. However, recent learning-based policies only use basic domain randomization to

Jiangmiao Pang (@pangjiangmiao)'s Twitter Profile Photo

We are excited to introduce #GRUtopia! The first simulated **city-scale** interactive 3D society designed for various robots that serve humans! It integrates GRScenes, GRResidents, and GRBench: 100k+ interactive scenes across 89 scene categories. Code: github.com/OpenRobotLab/G…

AK (@_akhaliq)'s Twitter Profile Photo

LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness

Recent advancements in Large Multimodal Models (LMMs) have greatly enhanced their proficiency in 2D visual understanding tasks, enabling them to effectively process and understand images and videos.

Gao Jiawei (@winstongu_)'s Twitter Profile Photo

Imagine a future where you can ask humanoid robots to clean your room, but some items, like heavy sofas, are too challenging for just one robot to move. Introducing CooHOI, a learning-based framework designed for the cooperative transportation of objects by multiple humanoid

Runsen Xu (@runsen_xu)'s Twitter Profile Photo

How can we achieve 3D perception without reconstructed point clouds or additional training, using only generalizable 2D and language foundation models? At #CoRL2024, we introduce VLM-Grounder, a zero-shot VLM agent for 3D visual grounding. Paper: arxiv.org/abs/2410.13860, with code.

Jiangmiao Pang (@pangjiangmiao)'s Twitter Profile Photo

Excited to introduce the Perceptive Internal Model (PIM) for humanoid robots! The first policy that simultaneously handles: - going up and down stairs, jumping gaps, and climbing 50 cm platforms; - indoor and outdoor scenarios; - the Unitree H1 and Fourier GR-1 robots. Paper: arxiv.org/abs/2411.14386

Elgce (@benqingwei)'s Twitter Profile Photo

🫰Thrilled to introduce HOMIE: Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit. Website: homietele.github.io Code: github.com/OpenRobotLab/O… YouTube: youtu.be/FxkGmjyMc5g 😀 HOMIE consists of a novel RL-based training framework and a self-designed hardware

Yixin Chen (@_yixinchen)'s Twitter Profile Photo

📢📢📢Excited to announce the 5th Workshop on 3D Scene Understanding for Vision, Graphics, and Robotics at #CVPR2025! Expect our awesome speakers and challenges on multi-modal 3D scene understanding and reasoning. 🎉🎉🎉#CVPR2025 Learn more at scene-understanding.com.

Wenzhe Cai (@wenzhec7616)'s Twitter Profile Photo

🤖Can we build a generalized robot navigation policy without any real-robot data? 👏We introduce NavDP, which can zero-shot adapt to different robots in the open world. Website: wzcai99.github.io/navigation-dif… Github: github.com/wzcai99/NavDP/ Arxiv: arxiv.org/abs/2505.08712