Xiang Zhou (@xiangzhou14)'s Twitter Profile
Xiang Zhou

@xiangzhou14

Engineer @GoogleDeepMind (Gemini team), Ph.D. from UNC-Chapel Hill (@unccs @uncnlp)

ID: 1008983568138190849

Link: http://owenzx.github.io · Joined: 19-06-2018 08:03:23

294 Tweets

444 Followers

667 Following

Elon Musk (@elonmusk)'s Twitter Profile Photo

Steven Mackey The reason I’m in America along with so many critical people who built SpaceX, Tesla and hundreds of other companies that made America strong is because of H1B. Take a big step back and FUCK YOURSELF in the face. I will go to war on this issue the likes of which you cannot

Niklas Stoehr (@niklas_stoehr)'s Twitter Profile Photo

I recently defended my PhD and moved from one dream team at ETH Zurich to another at DeepMind—a huge thank you to the many people who have supported me along the way! 🤖

Gokul Swamy (@g_k_swamy)'s Twitter Profile Photo

Recent work has seemed somewhat magical: how can RL with *random* rewards make LLMs reason? We pull back the curtain on these claims and find that this unexpected behavior hinges on the inclusion of certain *heuristics* in the RL algorithm. Our blog post: tinyurl.com/heuristics-con…

Quoc Le (@quocleix)'s Twitter Profile Photo

Excited to share that a scaled up version of Gemini DeepThink achieves gold-medal standard at the International Mathematical Olympiad. This result is official, and certified by the IMO organizers. Watch this space, more to come soon! deepmind.google/discover/blog/…

Owain Evans (@owainevans_uk)'s Twitter Profile Photo

New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵

Jinjie Ni @ ICLR'25 🇸🇬 (@nijinjie)'s Twitter Profile Photo

Token crisis: solved. ✅

We pre-trained diffusion language models (DLMs) vs. autoregressive (AR) models from scratch — up to 8B params, 480B tokens, 480 epochs.

Findings:
>  DLMs beat AR when tokens are limited, with >3× data potential.
>  A 1B DLM trained on just 1B tokens
jack morris (@jxmnop)'s Twitter Profile Photo

OpenAI hasn’t open-sourced a base model since GPT-2 in 2019. they recently released GPT-OSS, which is reasoning-only... or is it? turns out that underneath the surface, there is still a strong base model. so we extracted it. introducing gpt-oss-20b-base 🧵

Tom McCoy (@rtommccoy)'s Twitter Profile Photo

🤖🧠 NEW PAPER ON COGSCI & AI 🧠🤖

Recent neural networks capture properties long thought to require symbols: compositionality, productivity, rapid learning

So what role should symbols play in theories of the mind? For our answer...read on!

Paper: arxiv.org/abs/2508.05776

1/n
Dan Jurafsky (@jurafsky)'s Twitter Profile Photo

Now that school is starting for lots of folks, it's time for a new release of Speech and Language Processing! Jim and I added all sorts of material for the August 2025 release! With slides to match! Check it out here: web.stanford.edu/~jurafsky/slp3/

Jyo Pari (@jyo_pari)'s Twitter Profile Photo

For agents to improve over time, they can’t afford to forget what they’ve already mastered. We found that supervised fine-tuning forgets more than RL when training on a new task! Want to find out why? 👇

Woosuk Kwon (@woosuk_k)'s Twitter Profile Photo

At Thinking Machines, our work includes collaborating with the broader research community. Today we are excited to share that we are building a vLLM team at Thinking Machines to advance open-source vLLM and serve frontier models. If you are interested, please DM me or Barret Zoph!

Peirong Liu (@peirong26)'s Twitter Profile Photo

🇰🇷 From 9.24-9.27, I will be at #MICCAI. This is my first time attending conferences as faculty 🥳 Happy to chat and grab coffee together ☕️ I'll join Women in MICCAI (WiM) Board for 2025-2027. Join us at WiM luncheon on 9.24! On 9.27, we will host FOMO25: Foundation Model Challenge for Brain MRI. See you there!

OpenAI (@openai)'s Twitter Profile Photo

Today we’re introducing GDPval, a new evaluation that measures AI on real-world, economically valuable tasks. Evals ground progress in evidence instead of speculation and help track how AI improves at the kind of work that matters most. openai.com/index/gdpval-v0

Ming Zhong (@mingzhong_)'s Twitter Profile Photo

Vibe coding with an LLM, but the final vibe is off? 🤔 We analyze why models fail the "vibe check" and what truly matters to users. Key insight: human preference 🧑‍💻 ≈ functional correctness ✅ + instruction following 🎯 Check out our paper: arxiv.org/abs/2510.07315

Andrej Karpathy (@karpathy)'s Twitter Profile Photo

Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,

Shiyue Zhang (@byryuer)'s Twitter Profile Photo

Do you ever feel puzzled about the 𝛽 parameter in DPO loss? A bigger 𝛽 is supposed to prefer a solution that is closer to 𝜋-ref, which somehow isn't the case for DPO. Check out this insightful blog written by my colleague Ozan oirs.substack.com/p/a-fundamenta…, which will bring you a
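For context on the tweet above: the intuition about 𝛽 comes from the standard DPO objective (a sketch of the usual formulation, not the blog's own derivation), where 𝛽 scales the implicit reward given by the log-ratio against the reference policy:

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
    \left[\log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      \;-\;
      \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right)\right]
```

Here 𝛽 plays the role of the KL-penalty strength in the underlying RLHF objective, which is why a larger 𝛽 is nominally expected to keep 𝜋_𝜃 closer to 𝜋_ref; the linked blog examines why that expectation can fail in practice for DPO.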

Yi Lin Sung (on job market) (@yilin_sung)'s Twitter Profile Photo

Tough week! I also got impacted less than 3 months after joining. Ironically, I just landed some new RL infra features the day before. Life moves on. My past work spans RL, PEFT, Quantization, and Multimodal LLMs. If your team is working on these areas, I’d love to connect.

Ari Holtzman (@universeinanegg)'s Twitter Profile Photo

Hiring anybody who can help me explain this Grok output: Me: Show me the seahorse emoji Grok: Here it is: 🦐 Wait, that's a shrimp. My bad—the actual seahorse emoji is: 🦎 No, that's a lizard. Let me get this right: the seahorse is 🦈? Shark? Nope. Actually, the real seahorse

David Chiang (@davidweichiang)'s Twitter Profile Photo

I am recruiting a PhD student to work with me, Peter Cholak, Anand Pillay, and Andy J Yang on transformers and logic/model theory (or related topics). If you are interested, please email me with "FLaNN" in the subject line!