Saurabh Saxena (@srbhsxn) 's Twitter Profile
Saurabh Saxena

@srbhsxn

Researcher at Google Deepmind

ID: 1876211365

Link: http://saurabhsaxena.org · Joined: 17-09-2013 16:59:15

129 Tweets

846 Followers

373 Following

Ben Poole (@poolio) 's Twitter Profile Photo

ReconFusion = 3D Reconstruction + Diffusion prior for novel view synthesis

reconfusion.github.io

Better NeRFs, less data.

Ruiqi Gao (@ruiqigao) 's Twitter Profile Photo

Looking for diffusion model advancements at #NeurIPS2023? Come to check our oral work "Understanding Diffusion Objectives as the ELBO with Simple Data Augmentation" w/ Durk Kingma.

New theoretical understanding, SOTA empirical results, and more! 

Arxiv: arxiv.org/abs/2303.00848
AK (@_akhaliq) 's Twitter Profile Photo

NeRFiller: Completing Scenes via Generative 3D Inpainting

paper page: huggingface.co/papers/2312.04…

propose NeRFiller, an approach that completes missing portions of a 3D capture via generative 3D inpainting using off-the-shelf 2D visual generative models. Often parts of a captured

Saurabh Saxena (@srbhsxn) 's Twitter Profile Photo

Excited to share that our work was accepted for an oral presentation at #NeurIPS2023. If you are interested in diffusion models or computer vision, please drop by our talk and poster on Thursday! nips.cc/virtual/2023/o…

Ethan Weber (@ethanjohnweber) 's Twitter Profile Photo

Excited to release NeRFiller with my amazing collaborators Aleksander Holynski, Varun Jampani, Saurabh Saxena, Noah Snavely, Abhishek Kar, and Angjoo Kanazawa! The project page is available at ethanweber.me/nerfiller/. We focus on scene completion by using a 2D inpainter.

AI Bites | YouTube Channel (@ai_bites) 's Twitter Profile Photo

DMD (Diffusion for Metric Depth) is a state-of-the-art diffusion model for monocular absolute depth estimation.

Innovations include:
👉use of log-scale depth parameterization to enable joint modeling of indoor and outdoor scenes, 
👉conditioning on the field-of-view (FOV) to
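
The log-scale depth parameterization is easy to illustrate. Below is a minimal sketch (mine, not DMD's code), assuming a hypothetical 0.5–80 m depth range so that indoor and outdoor scenes map into one shared [0, 1] target:

```python
import numpy as np

# Hypothetical depth range wide enough to cover indoor (~0.5 m) and outdoor (~80 m) scenes.
D_MIN, D_MAX = 0.5, 80.0

def depth_to_log_target(depth_m):
    """Map metric depth (meters) to a [0, 1] log-scale prediction target."""
    d = np.clip(depth_m, D_MIN, D_MAX)
    return (np.log(d) - np.log(D_MIN)) / (np.log(D_MAX) - np.log(D_MIN))

def log_target_to_depth(t):
    """Invert the parameterization back to meters."""
    return np.exp(t * (np.log(D_MAX) - np.log(D_MIN)) + np.log(D_MIN))

# Round-trip check: indoor and outdoor depths land in the same normalized range.
for d in [0.7, 3.0, 25.0, 60.0]:
    t = depth_to_log_target(d)
    print(f"{d:5.1f} m -> {t:.3f} -> {log_target_to_depth(t):5.1f} m")
```

Compared with a linear parameterization over the same range, the log scale spends a comparable share of the target interval on nearby indoor depths and on distant outdoor depths, which is the stated motivation for joint modeling.
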
Alex Carlier (@alexcarliera) 's Twitter Profile Photo

Google just revealed an ABSOLUTE depth estimation model 🤯 As opposed to recent depth models (Marigold, PatchFusion), which aim for maximum detail, DMD aims to estimate the ABSOLUTE depth (in meters) within the image. More details below ⬇️⬇️

Shek Azizi (@azizishekoofeh) 's Twitter Profile Photo

Hiring Research Scientists within Google DeepMind - Toronto to join our team & advance the next generation of medical AI, develop cutting-edge LLMs & Multi-modal models to tackle real-world healthcare challenges. Please submit your interest through: forms.gle/2cSbBotUwSfVfu…

Daniel Watson (@watson_nn) 's Twitter Profile Photo

[[THREAD]] Happy to announce 4DiM, our diffusion model for novel view synthesis of scenes! 4DiM allows camera+time control with as few as one input image. Joint work with Saurabh Saxena*, Lala Li*, Andrea Tagliasacchi 🇨🇦, and David Fleet (*equal contribution).

Sander Dieleman (@sedielem) 's Twitter Profile Photo

Diffusion is the rising tide that eventually submerges all frequencies, high and low 🌊 Diffusion is the gradual decomposition into feature scales, fine and coarse 🗼 Diffusion is just spectral autoregression 🤷🌈
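
The "spectral autoregression" framing can be sanity-checked numerically: natural images have power spectra that fall off steeply with frequency, while Gaussian diffusion noise has a flat spectrum, so as the noise level rises the high frequencies are submerged first. A rough sketch of that comparison (my own illustration, not from the tweet; the toy image stands in for a real photo):

```python
import numpy as np

def radial_power_spectrum(img):
    """Radially averaged power spectrum of a 2D array."""
    f = np.fft.fftshift(np.fft.fft2(img))
    power = np.abs(f) ** 2
    h, w = img.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h / 2, xx - w / 2).astype(int)
    return np.bincount(r.ravel(), power.ravel()) / np.bincount(r.ravel())

rng = np.random.default_rng(0)

# Toy "natural image": smooth, low-frequency content (swap in a real photo for a better demo).
x = np.linspace(0, 4 * np.pi, 256, endpoint=False)
image = np.outer(np.sin(x), np.cos(x))
noise = rng.standard_normal(image.shape)   # one unit of diffusion noise

img_spec = radial_power_spectrum(image)
noise_spec = radial_power_spectrum(noise)

# Image power sits at low frequencies; noise power is flat across frequencies.
# So as the noise level grows, the high-frequency bands are the first to be drowned out.
for radius in [2, 30, 100]:
    print(f"radius {radius:3d}:  image power={img_spec[radius]:12.3g}  noise power={noise_spec[radius]:12.3g}")
```
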

Jeff Dean (@jeffdean) 's Twitter Profile Photo

My Google colleague and longtime UC Berkeley faculty member David Patterson has a great essay out in this month's Communications of the ACM (Association for Computing Machinery): 🎉

"Life Lessons from the First Half-Century of My Career
Sharing 16 life lessons, and nine magic words."

I saw an
Michael Tschannen (@mtschannen) 's Twitter Profile Photo

Have you ever wondered how to train an autoregressive generative transformer on text and raw pixels, without a pretrained visual tokenizer (e.g. VQ-VAE)?

We have been pondering this during summer and developed a new model: JetFormer 🌊🤖

arxiv.org/abs/2411.19722

A thread 👇

1/
Saurabh Saxena (@srbhsxn) 's Twitter Profile Photo

SfM failing on dynamic videos? 😠 RoMo to the rescue! 💪 Our simple method uses epipolar cues and semantic features for robustly estimating motion masks, boosting dynamic SfM performance 🚀 Plus, a new dataset of dynamic scenes with ground truth cameras! 🤯 #computervision 🧵👇
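
The epipolar cue here is standard two-view geometry: points on static geometry must satisfy the epipolar constraint between frames, so correspondences with large epipolar (Sampson) error are likely on moving objects. A simplified sketch of that cue alone (not RoMo's actual pipeline; `pts1`/`pts2` are assumed pixel correspondences between two frames):

```python
import cv2
import numpy as np

def motion_mask_from_epipolar_error(pts1, pts2, thresh=2.0):
    """Flag correspondences that violate the epipolar constraint as dynamic.

    pts1, pts2: (N, 2) float arrays of matched pixel coordinates (N >= 8).
    Returns a boolean array, True where a point is likely moving.
    """
    # Robustly fit a fundamental matrix; RANSAC treats dynamic points as outliers.
    F, _ = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.999)

    # Sampson distance: first-order geometric error of the epipolar constraint.
    ones = np.ones((len(pts1), 1))
    x1 = np.hstack([pts1, ones])          # homogeneous coordinates, (N, 3)
    x2 = np.hstack([pts2, ones])
    Fx1 = x1 @ F.T                        # epipolar lines in image 2
    Ftx2 = x2 @ F                         # epipolar lines in image 1
    num = np.sum(x2 * Fx1, axis=1) ** 2
    den = Fx1[:, 0] ** 2 + Fx1[:, 1] ** 2 + Ftx2[:, 0] ** 2 + Ftx2[:, 1] ** 2
    sampson = num / den                   # roughly squared pixel error

    return sampson > thresh ** 2
```

A single fundamental matrix only models a rigid scene seen from two views, which is presumably why the method combines this cue with semantic features rather than relying on it alone.
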

Saumya Saxena (@saxena_saumya) 's Twitter Profile Photo

Can 3D scene graphs act as effective online memory for solving EQA tasks in⚡️real-time? Presenting GraphEQA🤖, a framework for grounding Vision Language Models using multimodal memory for real-time embodied question answering.

Andrea Tagliasacchi 🇨🇦 (@taiyasaki) 's Twitter Profile Photo

📢📢📢 Consider applying to SFU, where we have one of the largest Graphics/Vision groups in the world. There's still time! The deadline is Jan 18, 2025. The language of Vancouver is English, and despite being in 🇨🇦... it is not that cold (similar to Paris, Berlin, London). Links in 🧵

Ricardo Martin-Brualla (@rmbrualla) 's Twitter Profile Photo

Excited about 3D GenAI? There’s something super exciting brewing… If you know of researchers, or 3D / ML / infra engineers, there are positions open in Munich and London. Reach out!

Saurabh Saxena (@srbhsxn) 's Twitter Profile Photo

Our team in Google DeepMind Toronto is hiring a Student Researcher for Summer 2025 to work on projects in video generative models and 3D computer vision. If you are interested, please apply at: forms.gle/Yj1jmbvjBFQCzC…