Nicolas DUFOUR (@nico_dufour) Twitter Tweets • TwiCopy

Nicolas DUFOUR

a year ago

🌍 Guessing where an image was taken is a hard and often ambiguous problem. Introducing diffusion-based geolocation: we predict global locations by refining random guesses into trajectories across the Earth's surface! 🗺️ Paper, code, and demo: nicolas-dufour.github.io/plonk

Likes: 145 · Replies: 6 · Retweets: 37

Thibaut Loiseau

@thibaut_loiseau

9 months ago

1/13 🐊 Introducing Alligat0R, our latest work on improving relative camera pose regression with a novel pre-training approach (arxiv.org/abs/2503.07561)! Guillaume Bourmaud Vincent Lepetit

Likes: 12 · Replies: 2 · Retweets: 8

Lucas Ventura

@lucas__ventura

9 months ago

Introducing Chapter-Llama [#CVPR2025], a framework for 𝐯𝐢𝐝𝐞𝐨 𝐜𝐡𝐚𝐩𝐭𝐞𝐫𝐢𝐧𝐠 using Large Language Models! 🎬🦙 Check it out: 📄 Paper: arxiv.org/abs/2504.00072 🔗 Project: imagine.enpc.fr/~lucas.ventura… 💻 Code: github.com/lucas-ventura/… 🤗 Demo: huggingface.co/spaces/lucas-v…

Likes: 200 · Replies: 4 · Retweets: 39

Nicolas DUFOUR

@nico_dufour

8 months ago

This is an idea I've had for a while, but wow, it's working way better than expected! 🚀 The model looks really promising, even though it's just 256px for now.

Likes: 16 · Replies: 0 · Retweets: 1

Imagine-ENPC

@imagineenpc

8 months ago

Looking forward to #CVPR2025! We will present the following papers:

Likes: 10 · Replies: 1 · Retweets: 4

Imagine-ENPC

@imagineenpc

8 months ago

#CVPR2025 Sat June 14 (PM) 🌍 Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation Nicolas DUFOUR Vicky Kalogeiton David Picard Loic Landrieu 📄 pdf: arxiv.org/abs/2412.06781 🌐 webpage: nicolas-dufour.github.io/plonk.html

Likes: 8 · Replies: 1 · Retweets: 1

Nicolas DUFOUR

@nico_dufour

8 months ago

Our paper "Around the World in 80 Timesteps" got accepted at CVPR! See you in Nashville!

Likes: 8 · Replies: 0 · Retweets: 1

Pablo Pernías

@pabloppp

7 months ago

What is a reasonable number of GPU hours to train a "small" t2i diffusion model to convergence? 🤔 What would be considered groundbreaking in your opinion?

Likes: 8 · Replies: 3 · Retweets: 2

Nicolas DUFOUR

@nico_dufour

7 months ago

Pablo Pernías You can check arxiv.org/abs/2405.20324. We train a 330M-parameter model for around 500 H100 hours. I've been modernizing it since, and it can get pretty close to SoTA.

Likes: 7 · Replies: 2 · Retweets: 1

Nicolas DUFOUR

@nico_dufour

7 months ago

Pablo Pernías So in my experience, at this small scale, textual adherence is actually the "easiest" thing to get. We worked at that scale on a T2I model trained only on ImageNet, and it can compete with models like SD XL on GenEval or DPG-Bench! arxiv.org/abs/2502.21318

Likes: 2 · Replies: 1 · Retweets: 1

Nicolas DUFOUR

@nico_dufour

6 months ago

I will be at #CVPR2025 this week in Nashville, presenting our paper "Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation". We tackle geolocation as a generative task, which allows for SOTA performance and more interpretable predictions.

Likes: 9 · Replies: 1 · Retweets: 5

Guillaume Astruc

@g_astruc

6 months ago

🛰️ At #CVPR2025 presenting "AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities" - Saturday afternoon, Poster 355! If you're here and want to discuss geolocation or geospatial foundation models, let's connect!

Likes: 16 · Replies: 0 · Retweets: 3

Antoine Guédon

@antoine_guedon

6 months ago

I'm at #CVPR2025 to present our paper 🍵MAtCha Gaussians🍵, today Friday afternoon, Hall D, Poster 53! If you're in Nashville and want to discuss detailed 3D mesh reconstruction from sparse or dense RGB images, let's connect! Kyoto University Computer Vision Lab (Nishino Lab)

Likes: 16 · Replies: 0 · Retweets: 3

Nicolas DUFOUR

@nico_dufour

6 months ago

Come see us at poster 186, where we are presenting "Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation"!

Likes: 13 · Replies: 0 · Retweets: 2

Junyu Xie

@junyuxiearthur

5 months ago

Movies are more than just video clips, they are stories! 🎬 We’re hosting the 1st SLoMO Workshop at #ICCV2025 to discuss Story-Level Movie Understanding & Audio Descriptions! Website: slomo-workshop.github.io Competition: huggingface.co/spaces/SLoMO-W…

Likes: 40 · Replies: 1 · Retweets: 14

Tanishq Mathew Abraham, Ph.D.

@iscienceluvr

5 months ago

Diffusion Beats Autoregressive in Data-Constrained Settings Comparison of diffusion and autoregressive language models from 7M to 2.5B params and up to 80B training tokens. Key findings: 1. Diffusion models surpass autoregressive models given sufficient compute. Across a wide

Likes: 680 · Replies: 13 · Retweets: 119