Xiang Zhou (@xiangzhou14)'s Twitter Profile
Xiang Zhou

@xiangzhou14

Engineer @GoogleDeepMind (Gemini team), Ph.D. from UNC-Chapel Hill (@unccs @uncnlp)

ID: 1008983568138190849

Link: http://owenzx.github.io · Joined: 19-06-2018 08:03:23

294 Tweets

444 Followers

667 Following

Elon Musk (@elonmusk)'s Twitter Profile Photo

Steven Mackey The reason I’m in America along with so many critical people who built SpaceX, Tesla and hundreds of other companies that made America strong is because of H1B. Take a big step back and FUCK YOURSELF in the face. I will go to war on this issue the likes of which you cannot

Niklas Stoehr (@niklas_stoehr)'s Twitter Profile Photo

I recently defended my PhD and moved from one dream team at ETH Zurich to another at DeepMind—a huge thank you to the many people who have supported me along the way! 🤖

Gokul Swamy (@g_k_swamy)'s Twitter Profile Photo

Recent work has seemed somewhat magical: how can RL with *random* rewards make LLMs reason? We pull back the curtain on these claims and find that this unexpected behavior hinges on the inclusion of certain *heuristics* in the RL algorithm. Our blog post: tinyurl.com/heuristics-con…

Quoc Le (@quocleix)'s Twitter Profile Photo

Excited to share that a scaled up version of Gemini DeepThink achieves gold-medal standard at the International Mathematical Olympiad. This result is official, and certified by the IMO organizers. Watch this space, more to come soon! deepmind.google/discover/blog/…

Owain Evans (@owainevans_uk)'s Twitter Profile Photo

New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only of 3-digit numbers can transmit a love for owls, or evil tendencies. 🧵

Jinjie Ni @ ICLR'25 🇸🇬 (@nijinjie)'s Twitter Profile Photo

Token crisis: solved. ✅

We pre-trained diffusion language models (DLMs) vs. autoregressive (AR) models from scratch — up to 8B params, 480B tokens, 480 epochs.

Findings:
>  DLMs beat AR when tokens are limited, with >3× data potential.
>  A 1B DLM trained on just 1B tokens
jack morris (@jxmnop)'s Twitter Profile Photo

OpenAI hasn’t open-sourced a base model since GPT-2 in 2019. they recently released GPT-OSS, which is reasoning-only... or is it? turns out that underneath the surface, there is still a strong base model. so we extracted it. introducing gpt-oss-20b-base 🧵

Tom McCoy (@rtommccoy)'s Twitter Profile Photo

🤖🧠 NEW PAPER ON COGSCI & AI 🧠🤖

Recent neural networks capture properties long thought to require symbols: compositionality, productivity, rapid learning

So what role should symbols play in theories of the mind? For our answer...read on!

Paper: arxiv.org/abs/2508.05776

1/n
Dan Jurafsky (@jurafsky)'s Twitter Profile Photo

Now that school is starting for lots of folks, it's time for a new release of Speech and Language Processing! Jim and I added all sorts of material for the August 2025 release! With slides to match! Check it out here: web.stanford.edu/~jurafsky/slp3/

Jyo Pari (@jyo_pari)'s Twitter Profile Photo

For agents to improve over time, they can’t afford to forget what they’ve already mastered. We found that supervised fine-tuning forgets more than RL when training on a new task! Want to find out why? 👇

Woosuk Kwon (@woosuk_k)'s Twitter Profile Photo

At Thinking Machines, our work includes collaborating with the broader research community. Today we are excited to share that we are building a vLLM team at Thinking Machines to advance open-source vLLM and serve frontier models. If you are interested, please DM me or Barret Zoph!

Peirong Liu (@peirong26)'s Twitter Profile Photo

🇰🇷 From 9.24-9.27, I will be at #MICCAI. This is my first time attending conferences as faculty 🥳 Happy to chat and grab coffee together ☕️ I'll join Women in MICCAI (WiM) Board for 2025-2027. Join us at WiM luncheon on 9.24! On 9.27, we will host FOMO25: Foundation Model Challenge for Brain MRI. See you there!

OpenAI (@openai)'s Twitter Profile Photo

Today we’re introducing GDPval, a new evaluation that measures AI on real-world, economically valuable tasks. Evals ground progress in evidence instead of speculation and help track how AI improves at the kind of work that matters most. openai.com/index/gdpval-v0

Ming Zhong (@mingzhong_)'s Twitter Profile Photo

Vibe coding with an LLM, but the final vibe is off? 🤔 We analyze why models fail the "vibe check" and what truly matters to users. Key insight: human preference 🧑‍💻 ≈ functional correctness ✅ + instruction following 🎯 Check out our paper: arxiv.org/abs/2510.07315

Andrej Karpathy (@karpathy)'s Twitter Profile Photo

Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,

Shiyue Zhang (@byryuer)'s Twitter Profile Photo

Do you ever feel puzzled about the 𝛽 parameter in DPO loss? A bigger 𝛽 is supposed to prefer a solution that is closer to 𝜋-ref, which somehow isn't the case for DPO. Check out this insightful blog written by my colleague Ozan oirs.substack.com/p/a-fundamenta…, which will bring you a
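For context on the tweet above: the intuition about 𝛽 comes from the standard DPO objective (a sketch of the usual formulation, not the blog's own derivation), where 𝛽 scales the implicit reward given by the log-ratio against the reference policy:

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
    \left[\log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      \;-\;
      \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right)\right]
```

Here 𝛽 plays the role of the KL-penalty strength in the underlying RLHF objective, which is why a larger 𝛽 is nominally expected to keep 𝜋_𝜃 closer to 𝜋_ref; the linked blog examines why that expectation can fail in practice for DPO.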

Yi Lin Sung (on job market) (@yilin_sung)'s Twitter Profile Photo

Tough week! I also got impacted less than 3 months after joining. Ironically, I just landed some new RL infra features the day before. Life moves on. My past work spans RL, PEFT, Quantization, and Multimodal LLMs. If your team is working on these areas, I’d love to connect.

Ari Holtzman (@universeinanegg)'s Twitter Profile Photo

Hiring anybody who can help me explain this Grok output: Me: Show me the seahorse emoji Grok: Here it is: 🦐 Wait, that's a shrimp. My bad—the actual seahorse emoji is: 🦎 No, that's a lizard. Let me get this right: the seahorse is 🦈? Shark? Nope. Actually, the real seahorse

David Chiang (@davidweichiang)'s Twitter Profile Photo

I am recruiting a PhD student to work with me, Peter Cholak, Anand Pillay, and Andy J Yang on transformers and logic/model theory (or related topics). If you are interested, please email me with "FLaNN" in the subject line!