Yinbo Chen (@yinbochen) Twitter Tweets • TwiCopy

Glad to share that our work LIIF has been accepted as a #CVPR2021 Oral. We learn to generate images in a continuous representation, which can be presented at an arbitrary resolution (extrapolating to ×30). Video & code: yinboc.github.io/liif/

136 likes, 6 replies, 19 retweets

Yanjie Ze

@zeyanjie

3 years ago

How about 3D pre-training for motor control? We use Video Autoencoder to learn *3D* self-supervised representations from large-scale videos for RL. yanjieze.com/3d4rl Such 3D representations learn faster and transfer better sim-to-real than 2D pre-training such as MoCo.

67 likes, 2 replies, 18 retweets

Xiaolong Wang

@xiaolonw

2 years ago

🏗️ Policy Adaptation from Foundation Model Feedback #CVPR2023 geyuying.github.io/PAFF/ Instead of using a foundation model as a pre-trained encoder (generator), we use it as a Teacher (discriminator) that tells us where our policy went wrong and helps it adapt to new envs and tasks.

120 likes, 3 replies, 25 retweets

Xiaolong Wang

@xiaolonw

2 years ago

The robot climbs stairs🏯, steps over stones 🪨, and runs in the wild🏞️, all in one policy, without any remote control! Our #CVPR2023 Highlight paper achieves this by using RL + a 3D Neural Volumetric Memory (NVM) trained with view synthesis! rchalyang.github.io/NVM/

293 likes, 5 replies, 62 retweets

Isabella Liu

@isabella__liu

a year ago

Want to obtain time-consistent dynamic mesh from monocular videos? Introducing: Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Monocular Videos liuisabella.com/DG-Mesh/ We reconstruct meshes with flexible topology change and build the corresp. across meshes. 🧵(1/n)

195 likes, 8 replies, 49 retweets

Xuxin Cheng

@xuxin_cheng

a year ago

🤖Introducing 📺𝗢𝗽𝗲𝗻-𝗧𝗲𝗹𝗲𝗩𝗶𝘀𝗶𝗼𝗻: a web-based teleoperation software! 🌐Open source, cross-platform (VisionPro & Quest) with real-time stereo vision feedback. 🕹️Easy-to-use hand, wrist, head pose streaming. Code: github.com/OpenTeleVision…

374 likes, 14 replies, 89 retweets

Jiteng Mu

@jitengmu

a year ago

We introduce🌟Editable Image Elements🥳, a new disentangled and controllable latent space for diffusion models, that allows for various image editing operations (e.g., move, resize, de-occlusion, object removal, variations, composition) jitengmu.github.io/Editable_Image… More details🧵👇

212 likes, 6 replies, 35 retweets

An-Chieh Cheng

@anjjei

a year ago

🌟Introducing "🤖SpatialRGPT: Grounded Spatial Reasoning in Vision Language Model" anjiecheng.me/SpatialRGPT SpatialRGPT is a powerful region-level VLM that can understand both 2D and 3D spatial arrangements. It can process any region proposal (e.g., boxes or masks) and provide

468 likes, 11 replies, 107 retweets

Jiteng Mu

@jitengmu

7 months ago

Do we need separate models for image editing and conditional generation? We introduce🌟EditAR, a unified autoregressive model for diverse tasks, e.g., image editing, depth-to-image, edge-to-image, segmentation-to-image. jitengmu.github.io/EditAR/ More details🧵👇

145 likes, 5 replies, 31 retweets

Yinbo Chen

@yinbochen

7 months ago

Introducing Consistent Flow Distillation, a method with a new theoretical perspective for text-to-3D generation: runjie-yan.github.io/cfd/ It is as simple and efficient as SDS, while having significantly better quality and diversity. The first author, Runjie, is applying for PhD programs in 2025.

969 likes, 12 replies, 159 retweets