Khanh Nguyen (on job market) (@khanhxuannguyen)'s Twitter Profile
Khanh Nguyen (on job market)

@khanhxuannguyen

Postdoc at CHAI Berkeley with Prof. Stuart Russell, Prev. Postdoc at Princeton NLP, PhD @umdcs, Human-AI Communication, Interactive Learning, NLP.

ID: 2829879914

Link: http://machineslearner.com · Joined: 24-09-2014 13:18:31

1.1K Tweets

1.1K Followers

468 Following

Dylan Hadfield-Menell (@dhadfieldmenell)'s Twitter Profile Photo

Gary Marcus recently posted a thread responding to my critique of his public statements on AI. Though he didn't name me, it's clear it was directed at me. I want to clarify my position and explain why this isn't about us; it's about trust in expertise and why it matters for AI…

Khanh Nguyen (on job market) (@khanhxuannguyen)'s Twitter Profile Photo

Two reasons NOT to get over-hyped about test-time compute scaling: 1. The fact that we need to train on the test distribution just shows that vanilla neural nets don't generalize well. Behaviorist approaches to training neural nets can't teach them to think. The mental search…

Christopher Manning (@chrmanning)'s Twitter Profile Photo

Re: “Every major breakthrough in AI has been American”: America does itself no favors when it overestimates its specialness. Yes, the center of the AI industry is the US (California!), but many of the breakthroughs of (neural, gradient-based) AI happened elsewhere: • LSTMs, …

Omar Khattab (@lateinteraction)'s Twitter Profile Photo

More Qwen. I'm increasingly comfortable saying these papers seem to be a discovery of some sort about Qwen models, not necessarily about reasoning.

Ben Plaut (@benplaut)'s Twitter Profile Photo

(1/5) New paper! Despite concerns about AI catastrophe, there isn’t much work on learning while provably avoiding catastrophe. In fact, nearly all of learning theory assumes all errors are reversible. Stuart Russell, Hanlin Zhu and I fill this gap: arxiv.org/pdf/2402.08062

leloy! (@leloykun)'s Twitter Profile Photo

I'm not sure if someone has already pointed this out, but Dr. GRPO still has a bias that is more pronounced the smaller the group size is.

To make it unbiased, simply multiply Dr. GRPO's A_i by the correction term N/(N-1). With this, you'll get LOOP (Leave-One-Out Proximal Policy Optimization)…
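
To read the claim concretely, here is a minimal numpy sketch (the function names and toy rewards are mine, not from the thread): Dr. GRPO centers each reward on the group mean, and scaling that advantage by N/(N-1) is algebraically identical to using a leave-one-out baseline, i.e. comparing each rollout against the mean of the other N-1 rollouts.

```python
import numpy as np

def dr_grpo_advantage(rewards: np.ndarray) -> np.ndarray:
    """Dr. GRPO-style advantage: center on the group mean (no std division)."""
    return rewards - rewards.mean()

def loo_advantage(rewards: np.ndarray) -> np.ndarray:
    """Leave-one-out baseline: compare each reward to the mean of the other N-1."""
    n = len(rewards)
    loo_mean = (rewards.sum() - rewards) / (n - 1)
    return rewards - loo_mean

rewards = np.array([1.0, 0.0, 0.0, 1.0, 1.0])  # toy group of N=5 rollouts
n = len(rewards)

# The claimed correction: scale Dr. GRPO's A_i by N/(N-1) ...
corrected = dr_grpo_advantage(rewards) * n / (n - 1)

# ... and it matches the leave-one-out advantage exactly.
assert np.allclose(corrected, loo_advantage(rewards))
```
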
Hanze Dong @ ICLR 2025 (@hendrydong)'s Twitter Profile Photo

🤖 What makes GRPO work? Rejection Sampling → Reinforce → GRPO
- RS is underrated
- Key of GRPO: implicitly remove prompts without correct answer
- Reinforce + Filtering > GRPO (better KL)
💻 github.com/RLHFlow/Minima…
📄 arxiv.org/abs/2504.11343
👀 RAFT was invited to ICLR25! Come & Chat ☕️
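
The "implicitly remove prompts without correct answer" point falls straight out of the advantage formula. A small sketch of my own (assuming binary correctness rewards and the usual group-normalized advantage; not code from the linked repo): when every rollout for a prompt gets reward 0, all advantages are zero, so that prompt contributes no gradient, exactly as if it had been filtered out.

```python
import numpy as np

def grpo_advantages(rewards: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Group-normalized advantages: (r - mean) / (std + eps)."""
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Prompt with no correct rollout: all rewards 0 -> all advantages 0,
# so it adds nothing to the policy gradient (implicit filtering).
# The same vanishing happens when every rollout is correct.
print(grpo_advantages(np.zeros(4)))                      # [0. 0. 0. 0.]

# Prompt with mixed outcomes: nonzero advantages -> it actually trains the policy.
print(grpo_advantages(np.array([1.0, 0.0, 0.0, 1.0])))   # ~[ 1. -1. -1.  1.]
```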

Khanh Nguyen (on job market) (@khanhxuannguyen)'s Twitter Profile Photo

You might have heard that "LLMs are overconfident" and "LLMs know what they know", but these claims have only been verified on a small set of models. We conduct rigorous experiments to confirm these claims on Q&A tasks for a diverse set of models. Yes, even GPT-4o is terribly…
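
For context on what "overconfident" means operationally, here is a minimal sketch of my own (not the paper's code): one standard way to quantify it on Q&A is expected calibration error, which compares a model's stated confidence with its empirical accuracy within confidence bins.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins: int = 10) -> float:
    """ECE: bin answers by stated confidence, then average |confidence - accuracy|
    over bins, weighted by the fraction of samples in each bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(confidences[mask].mean() - correct[mask].mean())
    return ece

# Toy Q&A run: the model claims ~90% confidence but is right only 60% of the time.
conf = np.array([0.90, 0.95, 0.85, 0.90, 0.92])
hits = np.array([1, 0, 1, 0, 1])
print(expected_calibration_error(conf, hits))  # large value = overconfidence
```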

Alex Zhang (@a1zhang)'s Twitter Profile Photo

Can GPT, Claude, and Gemini play video games like Zelda, Civ, and Doom II? 𝗩𝗶𝗱𝗲𝗼𝗚𝗮𝗺𝗲𝗕𝗲𝗻𝗰𝗵 evaluates VLMs on Game Boy & MS-DOS games given only raw screen input, just like how a human would play. The best model (Gemini) completes just 0.48% of the benchmark! 🧵👇
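
The protocol is easy to picture; a toy sketch of the agent loop (the Emulator and vlm_choose_key stubs below are hypothetical stand-ins, not VideoGameBench's actual API): the model receives only the raw screenshot and must emit a key press, with no access to internal game state.

```python
from dataclasses import dataclass

@dataclass
class Emulator:
    """Hypothetical stand-in for a Game Boy / MS-DOS emulator wrapper."""
    frame: int = 0

    def screenshot(self) -> bytes:
        return b"raw pixels"  # the only observation the model ever gets

    def press(self, key: str) -> None:
        self.frame += 1       # advance the game by one input

def vlm_choose_key(screen: bytes) -> str:
    """Placeholder for a GPT/Claude/Gemini call that maps pixels to a key press."""
    return "A"

emu = Emulator()
for _ in range(10):  # pixels in, key presses out, just like a human player
    emu.press(vlm_choose_key(emu.screenshot()))
```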

Chuong M. Huynh (@ryanhuynh1108)'s Twitter Profile Photo

CVPR-bound! ✈️ I'll be presenting CoLLM on Friday, 6/13 (Morning, #364) and looking for my next challenge as a full-time Scientist/Engineer. If you're hiring or just want to chat about exciting research, find me there! My work: hmchuong.github.io #CVPR2025 #JobHunt