Yinbo Chen (@yinbochen)'s Twitter Profile
Yinbo Chen

@yinbochen

PhD @UCSanDiego
previous: @MetaAI @AdobeResearch @Tsinghua_uni

ID: 1302897665605214208

Link: http://yinboc.github.io/
Joined: 07-09-2020 09:13:12

18 Tweets

563 Followers

53 Following

Yinbo Chen (@yinbochen)

Glad to share that our work LIIF has been accepted as a #CVPR2021 Oral. We learn to generate images in a continuous representation, which can be presented at an arbitrary resolution (extrapolating up to ×30). Video & code: yinboc.github.io/liif/

Yanjie Ze (@zeyanjie)

How about 3D pre-training for motor control? We use a Video Autoencoder to learn *3D* self-supervised representations from large-scale videos for RL. yanjieze.com/3d4rl Such 3D representations learn faster and transfer better sim-to-real than 2D pre-training such as MoCo.

Xiaolong Wang (@xiaolonw)

🏗️ Policy Adaptation from Foundation Model Feedback #CVPR2023 geyuying.github.io/PAFF/ Instead of using a foundation model as a pre-trained encoder (generator), we use it as a Teacher (discriminator) that tells us where our policy went wrong and helps it adapt to new envs and tasks.

Xiaolong Wang (@xiaolonw)

The robot climbs stairs🏯, steps over stones 🪨, and runs in the wild🏞️, all in one policy, without any remote control! Our #CVPR2023 Highlight paper achieves this by using RL + a 3D Neural Volumetric Memory (NVM) trained with view synthesis! rchalyang.github.io/NVM/

Isabella Liu (@isabella__liu)

Want to obtain a time-consistent dynamic mesh from monocular videos? Introducing: Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Monocular Videos liuisabella.com/DG-Mesh/ We reconstruct meshes with flexible topology change and build the corresp. across meshes. 🧵(1/n)

Xuxin Cheng (@xuxin_cheng)

🤖Introducing 📺𝗢𝗽𝗲𝗻-𝗧𝗲𝗹𝗲𝗩𝗶𝘀𝗶𝗼𝗻: a web-based teleoperation software! 🌐Open source, cross-platform (VisionPro & Quest) with real-time stereo vision feedback. 🕹️Easy-to-use hand, wrist, head pose streaming. Code: github.com/OpenTeleVision…

Jiteng Mu (@jitengmu)

We introduce 🌟Editable Image Elements🥳, a new disentangled and controllable latent space for diffusion models that allows for various image editing operations (e.g., move, resize, de-occlusion, object removal, variations, composition) jitengmu.github.io/Editable_Image… More details🧵👇

An-Chieh Cheng (@anjjei)

🌟Introducing "🤖SpatialRGPT: Grounded Spatial Reasoning in Vision Language Model" anjiecheng.me/SpatialRGPT SpatialRGPT is a powerful region-level VLM that can understand both 2D and 3D spatial arrangements. It can process any region proposal (e.g., boxes or masks) and provide

Jiteng Mu (@jitengmu)

Do we need separate models for image editing and conditional generation? We introduce 🌟EditAR, a unified autoregressive model for diverse tasks, e.g., image editing, depth-to-image, edge-to-image, segmentation-to-image. jitengmu.github.io/EditAR/ More details🧵👇

Yinbo Chen (@yinbochen)

Introducing Consistent Flow Distillation, a method with a new theoretical perspective for text-to-3D generation: runjie-yan.github.io/cfd/ It is as simple and efficient as SDS, while having significantly better quality and diversity. The first author, Runjie, is applying for PhD 2025