Nikhil Naik (@nikhil_ai)'s Twitter Profile
Nikhil Naik

@nikhil_ai

llama training @Meta (tweets personal) | Previously AI researcher: @MIT, @Harvard, @sfresearch, @Google

ID: 302288218

Link: http://mit.edu/~naik

Joined: 20-05-2011 22:54:14

397 Tweets

1.1K Followers

1.1K Following

Nikhil Naik (@nikhil_ai)

Great to see broader community adoption for Diffusion-DPO and results confirming its efficacy for state-of-the-art models!

Omar Sanseviero (@osanseviero)

What a year for open ML! Trending models on Hugging Face include models from Meta, Google (TimesFM, PaliGemma), Tencent, NVIDIA, DeepSeek, RefuelAI, TII, Salesforce, 01-ai, Apple, Fugaku, Hugging Face, Microsoft, Stability, NousResearch, Gradient, Mistral, ByteDance 🤯

Andrej Karpathy (@karpathy)

Armen Aghajanyan Two related good quotes I heard recently: "You can prove that something won't work at small scale, but not that something works at small scale" "There's way more ideas out there than compute that's willing to take a risk on it"

Hyungjin Chung (@hyungjin_chung)

(1/N) CFG requires high guidance (>5) to "work", but comes with several issues 🤦‍♂️: reduced diversity, saturation, poor invertibility. Is this inevitable? 🤔 Presenting CFG++,🚀 a simple fix enabling small guidance: better sample quality + invertibility, smooth trajectory 🤟

Tejas Kulkarni (@tejasdkulkarni)

I am currently holding my dad's cryopreserved brain tumor samples in hopes of creating a personalized vaccine for immunotherapy. However, there are some critical and time-sensitive questions in the attached post: x.com/tejasdkulkarni… This is time-sensitive so would appreciate

Ruchi Sanghvi (@rsanghvi)

Very excited to announce that Mark Zuckerberg will be joining us at South Park Commons for a talk on Aug 6th! It's a rare chance to hear from one of the great founders of our time on how he kept a -1 to 0 mindset while building Meta. Space is very limited. Apply to attend below.

Rafael Rafailov @ NeurIPS (@rm_rafailov)

Our new paper MJ-BENCH evaluating generative reward models for text-to-image generation is now out! We find that Large Vision Language Models can act as zero shot feedback providers for diffusion models! More details below 👇

Stefano Ermon (@stefanoermon)

Diffusion models are state-of-the-art for continuous data generation (images, videos, etc). Can they beat autoregressive models also on text generation? Check out our ICML paper tomorrow to find out how. Congrats to my students Aaron Lou and Chenlin Meng for the best paper award!

AI at Meta (@aiatmeta)

Introducing Meta Segment Anything Model 2 (SAM 2), the first unified model for real-time, promptable object segmentation in images & videos. SAM 2 is available today under Apache 2.0 so that anyone can use it to build their own experiences. Details ➡️ go.fb.me/p749s5

Aditya Agarwal (@adityaag)

1/ I'm thrilled to share something close to my heart: I'm co-founding Bevel with ben 🤠 and Grey. It's born from my personal journey to better health. Here's my story...

The Nobel Prize (@nobelprize)

BREAKING NEWS The Royal Swedish Academy of Sciences has decided to award the 2024 #NobelPrize in Chemistry with one half to David Baker “for computational protein design” and the other half jointly to Demis Hassabis and John M. Jumper “for protein structure prediction.”

Andrej Karpathy (@karpathy)

Moravec's paradox in LLM evals I was reacting to this new benchmark of frontier math where LLMs only solve 2%. It was introduced because LLMs are increasingly crushing existing math benchmarks. The interesting issue is that even though by many accounts (/evals), LLMs are inching

Kyunghyun Cho (@kchonyc)

congratulations, Ian Goodfellow, for the test-of-time award at NeurIPS Conference! this award reminds me of how GAN started with this one email ian sent to the Mila - Institut québécois d'IA lab mailing list in May 2014. super insightful and amazing execution!

Jeff Dean (@jeffdean)

It's been really awesome to watch the progression of improvements in AI weather prediction. 5-6 years ago, AI models started to be better than classic methods out to about 6-8 hours. Then it became 2-3 days, and now, AI methods are state of the art out to 15 days (+ way more

Volodymyr Kuleshov 🇺🇦 (@volokuleshov)

Excited to announce the first commercial-scale diffusion language model: Mercury Coder. Mercury runs at 1000 tokens/sec on Nvidia hardware while matching the performance of existing speed-optimized LLMs. Mercury introduces a new approach to language generation inspired by image