We are releasing the 1st version of 4M, a framework for training multimodal foundation models across tens of modalities & tasks, based on scalable masked modeling.
Joint effort by EPFL & Apple.
4M: Massively Multimodal Masked Modeling
🌐4m.epfl.ch
🧵1/n
We'll present at NeurIPS, today at 5pm CST. Spotlight #1022.
Effectively bringing sensory modalities to large models is one way to make them more grounded and, ultimately, to give them a more complete world model. Hopefully this is a step in that direction, and more will come.
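For those curious about the core idea, here is a minimal sketch of multimodal masked modeling — not the actual 4M code or architecture. Each modality is mapped to discrete tokens, a random subset is given as input, and a shared Transformer is trained to predict the masked-out tokens of every modality. All names, sizes, and the single-encoder layout below are illustrative assumptions.

```python
# Minimal sketch (not the 4M implementation): multimodal masked modeling
# over concatenated per-modality token sequences with a shared Transformer.
import torch
import torch.nn as nn

VOCAB = 1024          # assumed shared discrete-token vocabulary size
DIM, SEQ = 256, 64    # illustrative model width and per-modality length

class MaskedMultimodalModel(nn.Module):
    def __init__(self, num_modalities: int = 3):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        self.modality_embed = nn.Embedding(num_modalities, DIM)
        layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, tokens, modality_ids, visible_mask):
        # tokens: (B, T) discrete tokens from all modalities, concatenated
        # visible_mask: (B, T) True where the token is given as input
        x = self.embed(tokens) + self.modality_embed(modality_ids)
        x = x * visible_mask.unsqueeze(-1)   # hide masked positions from the input
        h = self.encoder(x)
        return self.head(h)                  # token predictions at every position

# One toy training step: predict the masked-out tokens of every modality.
model = MaskedMultimodalModel()
tokens = torch.randint(0, VOCAB, (2, 3 * SEQ))
modality_ids = torch.arange(3).repeat_interleave(SEQ).expand(2, -1)
visible = torch.rand(2, 3 * SEQ) > 0.5       # random input/target split
logits = model(tokens, modality_ids, visible)
loss = nn.functional.cross_entropy(
    logits[~visible], tokens[~visible])      # loss only on masked tokens
loss.backward()
```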
🚀 Model and data for our CubifyAnything project are now released!
🔗 github.com/apple/ml-cubif…
#SpatialReasoning #3DObjectDetection #transformers #detection #ai #genai
Excited to share that we have recently released the source code for FlexTok, bringing a fresh perspective to tokenization.
Code on GitHub: lnkd.in/g4iNJFmU.
Project Page: flextok.epfl.ch
#FlexTok #Tokenization #MachineLearning #MLResearch #OpenSource #AI
Singapore can get you off a plane, through immigration, and into a cab in under 30 minutes. But at #ICLR25, you’ll need over 2 hours and a 0.5 mile hike just to get your badge.
Congrats to #ICLR for breaking the record for most academic patience ever tested.
#ICLR25 #ConfLife
Very excited to announce our final line-up of fantastic speakers at this year's #CVPR2025 workshop on Open-World 3D Scene Understanding with Foundation Models ✨ #OpenSUN3D #cvpr2025
📆 June 12, 2pm-6pm
🏡 opensun3d.github.io
Incredibly proud of the work across teams that went into delivering the latest version of Visual Intelligence. Visual Intelligence makes it faster to do more with what’s right in front of you.
#WWDC25 #visualintelligence #AppleIntelligence
Excited to share our new work: “Language Models Improve When Pretraining Data Matches Target Tasks”
Yes, it sounds obvious (and it is!), but typically this only happens implicitly and indirectly: intuitively select data → benchmark → refine → repeat.
We wondered: what if we made this matching explicit and selected pretraining data directly for the target tasks?
Yesterday we shared our latest work on pretraining data curation. What if we stop guessing which data is “good” and directly match pretraining data to the benchmarks we care about?
📄 arxiv.org/abs/2507.12466
#AIResearch #llm #DataCuration #Pretraining #ScalingLaws
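As a rough illustration of the idea — not the method or results from the paper — one can score candidate pretraining documents by their similarity to examples from the target benchmarks and keep the best-matching ones. The TF-IDF retrieval below is an assumed stand-in for whatever matching the paper actually uses; the data and selection fraction are made up.

```python
# Hedged sketch: rank candidate pretraining documents by similarity to
# target-benchmark examples, then keep the top-scoring fraction.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

benchmark_examples = [
    "What is the capital of France?",
    "Solve for x: 2x + 3 = 11.",
]
candidate_docs = [
    "Paris has been the capital of France since ...",
    "Celebrity gossip roundup of the week ...",
    "Worked examples of solving linear equations ...",
]

vectorizer = TfidfVectorizer().fit(benchmark_examples + candidate_docs)
bench_vecs = vectorizer.transform(benchmark_examples)
doc_vecs = vectorizer.transform(candidate_docs)

# Score each candidate by its best match against any benchmark example,
# then select the top fraction for the pretraining mixture.
scores = cosine_similarity(doc_vecs, bench_vecs).max(axis=1)
keep = np.argsort(scores)[::-1][: int(0.67 * len(candidate_docs))]
print([candidate_docs[i] for i in keep])
```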