Nithin GK (@nithingk10) 's Twitter Profile
Nithin GK

@nithingk10

PhD @JHU | Ex-intern @Google, @NVIDIA Research, @Adobe, MERL | @IITMadras alum. Working on image and video diffusion models.

ID: 1393206635972206600

Link: http://nithin-gk.github.io · Joined: 14-05-2021 14:08:50

29 Tweets

74 Followers

304 Following

Vishal Patel (@vishalm_patel) 's Twitter Profile Photo

📢 Excited to share that the JHU VIU Lab will present three papers at #NeurIPS2024 this week! 🎉 Join us at our poster sessions to explore our latest research and connect. See you there! 🚀 JHU CLSP JHU ECE JHU Computer Science Shameema Sikder, MD Yiqun Mei jay paranjape

Google DeepMind (@googledeepmind) 's Twitter Profile Photo

Today, we’re announcing Veo 2: our state-of-the-art video generation model which produces realistic, high-quality clips from text or image prompts. 🎥 We’re also releasing an improved version of our text-to-image model, Imagen 3 - available to use in ImageFX through

Jason Baldridge (@jasonbaldridge) 's Twitter Profile Photo

My daughter was annoyed that I did this only for cats and she wanted our labradoodle Ivy (left, enjoying the ride) and her friend Bailey (a bit worried) to ride a coaster too. #Veo2 (You can see the real Ivy in many DOCCI dataset images: google.github.io/docci/viz.html…)

Ming-Yu Liu (@liu_mingyu) 's Twitter Profile Photo

github.com/NVIDIA/Cosmos Cosmos is a developer-first platform designed to help physical AI builders accelerate their development. It has pre-trained world foundation models (diffusion & autoregressive) in different sizes and video tokenizers. They are open models with permissive

Simo Ryu (@cloneofsimo) 's Twitter Profile Photo

Wild numbers. If you plot the trajectory of the non-stochastic diffusion sampling, 99.8% of the latent of the entire trajectory can be explained with the first two principal components. Roughly speaking, your entire diffusion trajectory is 99.8% two dimensional.
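A toy sketch of the measurement described above, with a synthetic near-linear trajectory standing in for real diffusion latents (the shapes and scales below are made up for illustration):

```python
import numpy as np

# Fake a deterministic (ODE-like) sampling trajectory: 50 steps of a
# 4096-dim latent moving mostly along one direction, plus small noise.
rng = np.random.default_rng(0)
T, D = 50, 4096
start = rng.standard_normal(D)
direction = rng.standard_normal(D)
t = np.linspace(0.0, 1.0, T)[:, None]
traj = start + 30.0 * t * direction + 0.1 * rng.standard_normal((T, D))

# PCA via SVD of the centered trajectory; explained-variance ratios.
X = traj - traj.mean(axis=0)
_, S, _ = np.linalg.svd(X, full_matrices=False)
explained = (S**2) / (S**2).sum()
print(f"variance explained by first 2 PCs: {explained[:2].sum():.4f}")
```

For a trajectory that is genuinely close to a low-dimensional curve, the first couple of ratios should dominate; the tweet's claim is that real diffusion sampling trajectories behave this way too.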

Durk Kingma (@dpkingma) 's Twitter Profile Photo

It's already the case that people's free will gets hijacked by screens for hours a day, with lots of negative consequences. AI video can make this worse, since it's directly optimizable. AI video has positive uses, but most of it will be fast food for the mind.

vitrupo (@vitrupo) 's Twitter Profile Photo

OpenAI's Greg Brockman says the AGI future looks less like a monolith - and more like a menagerie of specialized agents. Models that call other models. “We're heading to a world where the economy is fundamentally powered by AI.” The goal is to unlock 10x more activity, output,

Sander Dieleman (@sedielem) 's Twitter Profile Photo

If you've read my latest blog post on generative modelling in latent space, this one is a great follow-up about putting things into practice. openworldlabs.ai/blog/training-…

Ming-Yu Liu (@liu_mingyu) 's Twitter Profile Photo

We post-trained a reasoning model to reason whether a video is real or generated. It might be very useful as a critic to improve video generators. Take a look. NVIDIA AI

Ming-Yu Liu (@liu_mingyu) 's Twitter Profile Photo

For people looking for a diffusion-based video generator to finetune or post-train for their downstream physical AI applications, we just released our latest one. We have 2 models: 2B and 14B. 2B for fast prototyping and 14B for better quality. The license is fully open. Give it

Ruiqi Gao (@ruiqigao) 's Twitter Profile Photo

Scalable EBM training?? Can’t believe I can see this keyword in my life 😭. This is amazing! On a second read of the algorithm, looks like it is combining EBM with a training algorithm closer to AR. But still, leveraging EBM as a way of latent thinking is such a cool idea 💗

Kevin Lu (@_kevinlu) 's Twitter Profile Photo


Why you should stop working on RL research and instead work on product //
The technology that unlocked the big scaling shift in AI is the internet, not transformers

I think it's well known that data is the most important thing in AI, and also that researchers choose not to work
Vishal Patel (@vishalm_patel) 's Twitter Profile Photo

🚀 Open Vision Reasoner (OVR) Transferring linguistic cognitive behaviors to visual reasoning via large-scale multimodal RL. SOTA on MATH500 (95.3%), MathVision, and MathVerse. 💻 Code: github.com/Open-Reasoner-… 🌐 Project: weiyana.github.io/Open-Vision-Re… #LLM yana wei Johns Hopkins Engineering

Thinking Machines (@thinkymachines) 's Twitter Profile Photo


Today Thinking Machines Lab is launching our research blog, Connectionism. Our first blog post is “Defeating Nondeterminism in LLM Inference”

We believe that science is better when shared. Connectionism will cover topics as varied as our research is: from kernel numerics to
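A common source of the nondeterminism that post's title refers to: floating-point addition is not associative, so when a GPU kernel changes its reduction order (e.g. with batch size), bitwise outputs change. A minimal demonstration:

```python
# Floating-point addition is not associative: the same three numbers
# summed in a different grouping give bitwise-different results.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)   # False
print(left, right)     # tiny discrepancy in the last bits
```

The values differ only in the final ulps, but in an LLM a greedy argmax over near-tied logits can flip on exactly such differences, producing divergent generations.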
Lucas Beyer (bl16) (@giffmana) 's Twitter Profile Photo


Did you know that when they say stuff like "The A18 uses TSMC's 3nm process" or "announced the 2nm node"

The 3nm, 2nm actually don't mean anything?! They're just like version numbers. They make it up. Literally nothing measures 2nm or 3nm.

I certainly didn't know.
Amandeep Kumar (@amandee59573123) 's Twitter Profile Photo

🧩 “It’s not AR vs diffusion… it’s AR through diffusion.” 👉 Is it possible that Visual Autoregressive (VAR) models are secretly discrete diffusion models? We show: with a Markovian attention mask, VAR becomes mathematically equivalent to discrete diffusion. Here's how 🧵👇
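One plausible reading of the "Markovian attention mask" idea, sketched as a toy mask construction. This is an illustration only: the scale sizes are made up, and the exact masking rule (here, each scale attends to itself and the immediately preceding scale) is an assumption, not the paper's definition:

```python
import numpy as np

# VAR generates tokens scale by scale, coarse to fine.
# A Markovian mask lets each scale see only the previous scale (and itself),
# rather than all earlier scales as in standard block-causal attention.
scale_sizes = [1, 4, 9]  # toy tokens-per-scale
scale_of = np.repeat(np.arange(len(scale_sizes)), scale_sizes)
q = scale_of[:, None]
k = scale_of[None, :]
markov_mask = (k == q) | (k == q - 1)  # True = attention allowed
print(markov_mask.astype(int))
```

Under such a mask, each scale's conditional depends only on the previous scale, which is the Markov structure a discrete diffusion chain has between adjacent noise levels.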

Xintao Wang (@xinntao) 's Twitter Profile Photo


🥳🥳DiT w/o VAE, but with Semantic Encoder, such as DINO!
We introduce SVG (Self-supervised representation for Visual Generation).
Paper: huggingface.co/papers/2510.15…
Code: github.com/shiml20/SVG