Saining Xie (@sainingxie)'s Twitter Profile
Saining Xie

@sainingxie

researcher in #deeplearning #computervision | assistant professor at @NYU_Courant @nyuniversity | previous: research scientist @metaai (FAIR) @UCSanDiego

ID: 1283081795890626560

http://www.sainingxie.com | Joined 14-07-2020 16:51:59

479 Tweets

20.2K Followers

1.1K Following

Willis (Nanye) Ma (@ma_nanye):

Come and check out our paper, Inference-Time Scaling for Diffusion Models Beyond Denoising Steps, at Poster Session 1 at #CVPR2025, slot 226, happening right now!

Georgia Gkioxari (@georgiagkioxari):

I had to cancel my trip to #CVPR2025 because I caught the stupid flu 🙃 So sad to miss everyone! But if you are in Nashville go check out Damiano's project tomorrow on improving 3D spatial reasoning from a single image with the power of LLMs 🌟 #CVPR2025 Damiano Marsili

Andrei Bursuc (@abursuc):

Visuals from slides of Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces #cvpr2025 #startikz

Deedy (@deedydas):

LLMs are far worse at competitive programming than we thought. Every one scored 0% on Hard problems. LiveCodeBench-Pro is a new benchmark with 584 always updating problems from IOI, ICPC and Codeforces. What's most interesting is the categories they perform really poorly on:

Deedy (@deedydas):

I rarely see benchmark papers with this depth. It has links to problems, solutions and LLM attempts from the PDF directly. And it goes deep into each problem category. Great read. Live leaderboard: livecodebenchpro.com Paper: arxiv.org/pdf/2506.11928

Rohan Paul (@rohanpaul_ai):

This is really BAD news of LLM's coding skill. ☹️ The best Frontier LLM models achieve 0% on hard real-life Programming Contest problems, domains where expert humans still excel. LiveCodeBench Pro, a benchmark composed of problems from Codeforces, ICPC, and IOI (“International

Mathurin Massias (@mathusmassias):

New paper on the generalization of Flow Matching arxiv.org/abs/2506.03719 🤯 Why does flow matching generalize? Did you know that the flow matching target you're trying to learn **can only generate training points**? with Quentin Bertrand, Anne Gagneux & Rémi Emonet 👇👇👇

Benjamin Feuer (@feuerbenjamin):

So excited to announce the DCVLR (Data Curation for Vision-Language Reasoning) competition at NeurIPS 2025, led by Oumi and sponsored by Lambda! 🌟open-data 🌟 🤖 open-models 🤖 💻 open-source 💻 💪anyone can compete for free 💪 dcvlr-neurips.github.io 🧵 1 / n

Saining Xie (@sainingxie):

wait, speaking of false dichotomies---during your phd, you *can* write code, dive into data and systems, collaborate with a team, and build useful things---all while enjoying complete openness and the freedom to pursue what *genuinely* excites you.

Saining Xie (@sainingxie):

guys, real geospatial data is a total goldmine for digital agents. step away from the web browser and get real. (we explored a bit in virl-platform.github.io, but building a simulation-ready pipeline like this could take things way further)

Tal Linzen (@tallinzen):

I'm hiring at least one post-doc! We're interested in creating language models that process language more like humans than mainstream LLMs do, through architectural modifications and interpretability-style steering.

Manling Li (@manlingli_):

Can VLMs build Spatial Mental Models like humans? Reasoning from limited views? Reasoning from partial observations? Reasoning about unseen objects behind furniture / beyond current view? Check out MindCube! 🌐mll-lab-nu.github.io/mind-cube/ 📰arxiv.org/pdf/2506.21458