Yuanwen Yue (@yueyuanwen) 's Twitter Profile
Yuanwen Yue

@yueyuanwen

ELLIS PhD student @ETH & @UniofOxford. Previously intern @Meta @RealityLabs

ID: 1365699732677357570

Website: https://ywyue.github.io/
Joined: 27-02-2021 16:26:12

49 Tweets · 280 Followers · 312 Following

Zhenjun Zhao (@zhenjun_zhao) 's Twitter Profile Photo

Improving 2D Feature Representations by 3D-Aware Fine-Tuning

Yuanwen Yue, Anurag Das, Francis Engelmann, Siyu Tang @VLG-ETHZ, Jan Eric Lenssen

tl;dr: FiT3D proposes 3D-aware fine-tuning to improve 2D foundation models such as DINOv2, CLIP, and MAE

arxiv.org/pdf/2407.20229
Alexandre Morgand (@almorgand) 's Twitter Profile Photo

"Improving 2D Feature Representations by 3D-Aware Fine-Tuning" TL;DR: Lifting 2D features from Meta AI's DINOv2 into a 3D Gaussian representation. Novel view synthesis on features for depth estimation and segmentation
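The lifting step described above can be sketched minimally: given a per-pixel feature map, a depth map, and camera intrinsics, each pixel is unprojected to a 3D point that carries its 2D feature, after which those featured points can be fitted into a 3D representation. This is an illustrative sketch of the general idea only, not the FiT3D implementation; the function name, shapes, and toy values are all assumptions.

```python
import numpy as np

def unproject_features(feat, depth, K):
    """Lift per-pixel 2D features into 3D (illustrative sketch).

    feat:  (H, W, C) feature map (e.g. from a 2D backbone like DINOv2)
    depth: (H, W) metric depth per pixel
    K:     (3, 3) camera intrinsics
    Returns (H*W, 3) 3D points and the matching (H*W, C) features.
    """
    H, W, C = feat.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    # Homogeneous pixel coordinates (u, v, 1) for every pixel
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    rays = pix @ np.linalg.inv(K).T          # back-project through the camera
    pts = rays * depth.reshape(-1, 1)        # scale rays by depth -> 3D points
    return pts, feat.reshape(-1, C)

# Toy example: 2x2 image, 4-dim features, constant depth of 2.0
feat = np.random.rand(2, 2, 4)
depth = np.full((2, 2), 2.0)
K = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
pts, f = unproject_features(feat, depth, K)
print(pts.shape, f.shape)  # (4, 3) (4, 4)
```

In FiT3D these lifted features are distilled into a feature Gaussian field, and feature renderings from that field supervise the fine-tuning of the 2D backbone; the snippet only shows the geometric unprojection common to such pipelines.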

Gradio (@gradio) 's Twitter Profile Photo

FiT3D - Improving 2D Feature Representations by 3D-Aware Fine-Tuning (ECCV 2024) This work proposes a 3D-aware fine-tuning to improve 2D foundation models, like DINOv2, CLIP, and MAE. Try out the app to see the improvements brought about by FiT3D in your favorite methods!

Andrew Davison (@ajddavison) 's Twitter Profile Photo

Looking forward to what this strong team achieves. Their "LWM" concept will surely catch on, but do spatial world models need to be large? I hope they consider the Computational Structure of #SpatialAI for embodied hardware, where the real challenges lie: arxiv.org/abs/1803.11288

Jan Eric Lenssen (@janericlenssen) 's Twitter Profile Photo

If you are at #ECCV2024 and interested in improving 2D foundation features with 3D, make sure to visit Yuanwen at his poster in the Wednesday morning session, where he presents FiT3D! Project Page: ywyue.github.io/FiT3D/

ELLIS (@ellisforeurope) 's Twitter Profile Photo

The #ELLISPhD application portal is now open! Apply to top #AI labs & supervisors in Europe with a single application, and choose from different areas & tracks.

The call for applications: ellis.eu/news/ellis-phd…

Deadline: 15 November 2024

#PhD #PhDProgram #MachineLearning #ML

Ishan Misra (@imisra_) 's Twitter Profile Photo

So, this is what we were up to for a while :) Building SOTA foundation models for media -- text-to-video, video editing, personalized videos, video-to-audio. One of the most exciting projects I got to tech-lead during my time at Meta!

Julian Straub (@jstraub6) 's Twitter Profile Photo

🔥 EFM3D: a new benchmark for 3D egocentric perception tasks The EFM3D benchmark measures progress on egocentric 3D reconstruction and 3D object detection to accelerate research on egocentric foundation models rooted in 3D space. A new model, EVL, establishes the first baseline.

Eric Brachmann (@eric_brachmann) 's Twitter Profile Photo

We wrote a text on how we envision the future of visual mapping: nianticlabs.com/news/largegeos… So far, we have trained ~50 million ACE networks. 😎 Together, that's 150 trillion parameters... Not bad! But the future lies in going from local small models to a global large model.

Yuanwen Yue (@yueyuanwen) 's Twitter Profile Photo

Happy to see FiT3D being used as an alternative feature extractor to DINOv2 for motion mask extraction🤓, which goes beyond the tasks we originally considered.

Minghao Chen (@minghaochen23) 's Twitter Profile Photo

🥳Excited to share my recent work at Meta, "PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models", which aims at compositional/part-level 3D generation and reconstruction from various modalities. Project page: silent-chen.github.io/PartGen/

Jianyuan Wang (@jianyuan_wang) 's Twitter Profile Photo

Looking for a huge #GaussianSplat dataset? We got you! Introducing #uCo3D: 170K sequences packed with well-trained Gaussian Splat models. It is CC-BY-licensed and ready for your 3D experiments 🧙🧙‍♀️ Project Page: uco3d.github.io

Yuanwen Yue (@yueyuanwen) 's Twitter Profile Photo

This is a great opportunity to join Dora’s new group as a PhD student working on 3D Vision! I have worked with Dora before and learned so much. Highly recommend! 🤓

Jianyuan Wang (@jianyuan_wang) 's Twitter Profile Photo

Introducing VGGT (CVPR'25), a feedforward Transformer that directly infers all key 3D attributes from one, a few, or hundreds of images, in seconds! No expensive optimization needed, yet delivers SOTA results for: ✅ Camera Pose Estimation ✅ Multi-view Depth Estimation ✅ Dense