Ang Cao (@angcao3)'s Twitter Profile
Ang Cao

@angcao3

Ph.D. at University of Michigan, CSE

ID: 1172611603906232321

13-09-2019 20:43:01

67 Tweets

399 Followers

376 Following

Andrew Owens (@andrewhowens)

At #CVPR2024: Tactile-augmented Radiance Fields! We probe a scene with a touch sensor and localize each sample within a NeRF. We use diffusion to estimate the tactile signals for the points we didn't touch. x.com/_YimingDou/sta… w/ Yiming Dou, Antonio Loquercio, Fengyu Yang, Yi Liu

Ziyang Chen (@czyangchen)

These spectrograms look like images, but can also be played as a sound! We call these images that sound. How do we make them? Look and listen below to find out, and to see more examples!

Ayush Shrivastava (@ayshrv)

We present Global Matching Random Walks, a simple self-supervised approach to the Tracking Any Point (TAP) problem, accepted to #ECCV2024. We train a global matching transformer to find cycle consistent tracks through video via contrastive random walks (CRW).

Linyi Jin (@jin_linyi)

Introducing 👀Stereo4D👀 A method for mining 4D from internet stereo videos. It enables large-scale, high-quality, dynamic, *metric* 3D reconstructions, with camera poses and long-term 3D motion trajectories. We used Stereo4D to make a dataset of over 100k real-world 4D scenes.

David Novotny (@davnov134)

We are releasing uCO3D! Built to supercharge 3D GenAI and digital-twin models, this evolution of CO3D features more and higher-quality object videos from 1k categories, 3D Gaussian Splats, and streamlined OSS tools. 💻Data&code: github.com/facebookresear… 📄Paper:

Jianyuan Wang (@jianyuan_wang)

Introducing VGGT (CVPR'25), a feedforward Transformer that directly infers all key 3D attributes from one, a few, or hundreds of images, in seconds! No expensive optimization needed, yet delivers SOTA results for: ✅ Camera Pose Estimation ✅ Multi-view Depth Estimation ✅ Dense

Chris Rockwell (@_crockwell)

Ever wish YouTube had 3D labels? 🚀Introducing🎥DynPose-100K🎥, an Internet-scale collection of diverse videos annotated with camera pose! Applications include camera-controlled video generation🤩and learned dynamic pose estimation😯 Download: huggingface.co/datasets/nvidi…

Ang Cao (@angcao3)

Can we train a 3D-language multimodal Transformer using 2D VLMs and a rendering loss? Sasha (Alexander) Sax will present our new #icml25 paper on Wednesday at 2pm in Hall B2-B3 W200. Please come and check it out! Project Page: liftgs.github.io

Tiange Luo (@tiangeluo) 's Twitter Profile Photo

Introducing Visual Test-time Scaling for GUI Agent Grounding (ICCV'25, completed prior to the release of OpenAI-O3). When "thinking with images", the key challenge is designing the actions in pixel space. We can zoom into regions of varying sizes and shapes, apply image