Yuanwen Yue (@yueyuanwen) 's Twitter Profile
Yuanwen Yue

@yueyuanwen

ELLIS PhD student @ETH & @UniofOxford. Previously intern @Meta @RealityLabs

ID: 1365699732677357570

Website: https://ywyue.github.io/
Joined: 27-02-2021 16:26:12

49 Tweets · 280 Followers · 312 Following

Zhenjun Zhao (@zhenjun_zhao) 's Twitter Profile Photo

Improving 2D Feature Representations by 3D-Aware Fine-Tuning

Yuanwen Yue, Anurag Das, Francis Engelmann, Siyu Tang @VLG-ETHZ, Jan Eric Lenssen

tl;dr: FiT3D proposes 3D-aware fine-tuning to improve 2D foundation models such as DINOv2, CLIP, and MAE

arxiv.org/pdf/2407.20229
Alexandre Morgand (@almorgand) 's Twitter Profile Photo

"Improving 2D Feature Representations by 3D-Aware Fine-Tuning" TL;DR: Lifting 2D features from Meta AI's DINOv2 into a 3D Gaussian representation. Novel view synthesis on features for depth estimation and segmentation
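The lifting step described above can be sketched minimally: given a per-pixel feature map, a depth map, and camera intrinsics, each pixel is unprojected to a 3D point that carries its 2D feature, after which those featured points can be fitted into a 3D representation. This is an illustrative sketch of the general idea only, not the FiT3D implementation; the function name, shapes, and toy values are all assumptions.

```python
import numpy as np

def unproject_features(feat, depth, K):
    """Lift per-pixel 2D features into 3D (illustrative sketch).

    feat:  (H, W, C) feature map (e.g. from a 2D backbone like DINOv2)
    depth: (H, W) metric depth per pixel
    K:     (3, 3) camera intrinsics
    Returns (H*W, 3) 3D points and the matching (H*W, C) features.
    """
    H, W, C = feat.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    # Homogeneous pixel coordinates (u, v, 1) for every pixel
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    rays = pix @ np.linalg.inv(K).T          # back-project through the camera
    pts = rays * depth.reshape(-1, 1)        # scale rays by depth -> 3D points
    return pts, feat.reshape(-1, C)

# Toy example: 2x2 image, 4-dim features, constant depth of 2.0
feat = np.random.rand(2, 2, 4)
depth = np.full((2, 2), 2.0)
K = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
pts, f = unproject_features(feat, depth, K)
print(pts.shape, f.shape)  # (4, 3) (4, 4)
```

In FiT3D these lifted features are distilled into a feature Gaussian field, and feature renderings from that field supervise the fine-tuning of the 2D backbone; the snippet only shows the geometric unprojection common to such pipelines.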

Gradio (@gradio) 's Twitter Profile Photo

FiT3D - Improving 2D Feature Representations by 3D-Aware Fine-Tuning (ECCV 2024) This work proposes a 3D-aware fine-tuning to improve 2D foundation models, like DINOv2, CLIP, and MAE. Try out the app to see the improvements brought about by FiT3D in your favorite methods!

Andrew Davison (@ajddavison) 's Twitter Profile Photo

Looking forward to what this strong team achieves. Their "LWM" concept will surely catch on, but do spatial world models need to be large? I hope they consider the Computational Structure of #SpatialAI for embodied hardware, where the real challenges lie: arxiv.org/abs/1803.11288

Jan Eric Lenssen (@janericlenssen) 's Twitter Profile Photo

If you are at #ECCV2024 and interested in improving 2D foundation features with 3D, make sure to visit Yuanwen at his poster in the Wednesday morning session, where he presents FiT3D! Project Page: ywyue.github.io/FiT3D/

ELLIS (@ellisforeurope) 's Twitter Profile Photo

The #ELLISPhD application portal is now open! Apply to top #AI labs & supervisors in Europe with a single application, and choose from different areas & tracks.

The call for applications: ellis.eu/news/ellis-phd…

Deadline: 15 November 2024

#PhD #PhDProgram #MachineLearning #ML

Ishan Misra (@imisra_) 's Twitter Profile Photo

So, this is what we were up to for a while :) Building SOTA foundation models for media -- text-to-video, video editing, personalized videos, video-to-audio. One of the most exciting projects I got to tech-lead during my time at Meta!

Julian Straub (@jstraub6) 's Twitter Profile Photo

🔥 EFM3D: a new benchmark for 3D egocentric perception tasks The EFM3D benchmark measures progress on egocentric 3D reconstruction and 3D object detection to accelerate research on egocentric foundation models rooted in 3D space. A new model, EVL, establishes the first baseline.

Eric Brachmann (@eric_brachmann) 's Twitter Profile Photo

We wrote a text on how we envision the future of visual mapping: nianticlabs.com/news/largegeos… So far, we have trained ~50 million ACE networks. 😎 Together, that's 150 trillion parameters... Not bad! But the future lies in going from local small models to a global large model.

Yuanwen Yue (@yueyuanwen) 's Twitter Profile Photo

Happy to see FiT3D being used as an alternative feature extractor to DINOv2 for motion mask extraction🤓, which goes beyond the tasks we originally considered.

Minghao Chen (@minghaochen23) 's Twitter Profile Photo

🥳Excited to share my recent work at Meta, "PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models", which aims at compositional/part-level 3D generation and reconstruction from various modalities. Project page: silent-chen.github.io/PartGen/

Jianyuan Wang (@jianyuan_wang) 's Twitter Profile Photo

Looking for a huge #GaussianSplat dataset? We got you! Introducing #uCo3D: 170K sequences packed with well-trained Gaussian Splat models. It is CC-BY-licensed and ready for your 3D experiments 🧙🧙‍♀️ Project Page: uco3d.github.io

Yuanwen Yue (@yueyuanwen) 's Twitter Profile Photo

This is a great opportunity to join Dora’s new group as a PhD student working on 3D Vision! I have worked with Dora before and learned so much. Highly recommend! 🤓

Jianyuan Wang (@jianyuan_wang) 's Twitter Profile Photo

Introducing VGGT (CVPR'25), a feedforward Transformer that directly infers all key 3D attributes from one, a few, or hundreds of images, in seconds! No expensive optimization needed, yet delivers SOTA results for: ✅ Camera Pose Estimation ✅ Multi-view Depth Estimation ✅ Dense