He He (@hhexiy) 's Twitter Profile
He He

@hhexiy

NLP researcher. Assistant Professor at NYU CS & CDS.

ID: 806193181972779008

Link: http://hhexiy.github.io · Joined: 06-12-2016 17:46:47

123 Tweets

6.6K Followers

382 Following

Ofir Press (@ofirpress) 's Twitter Profile Photo

I'm on the academic job market! I develop autonomous systems for: programming, research-level question answering, finding sec vulnerabilities & other useful+challenging tasks. I do this by building frontier-pushing benchmarks and agents that do well on them. See you at NeurIPS!

Chuanyang Jin (@chuanyang_jin) 's Twitter Profile Photo

❓Most reward models are trained using binary judgments—can they effectively capture diverse preferences? 💡Short answer: No, particularly when the training samples are subjective.

Vishakh Padmakumar (@vishakh_pk) 's Twitter Profile Photo

Had a lot of fun poking holes at how LLMs capture diverse preferences with Chuanyang Jin, Hannah Rose Kirk and He He 🧐! Not all is lost though, a simple regularizing term can help prevent overfitting to binary judgments. Check out our paper SoLaR @ NeurIPS2024 to find out more 😉

Richard Pang (@yzpang_) 's Twitter Profile Photo

🚨🔔Foundational graph search task as testbed: for some distribution, transformers can learn to search (100% acc). We interpreted their algo!! But as graph size ↑, transformers struggle. Scaling up # params does not help; CoT does not help. 1.5 years of learning in 10 pages!

Hannah Rose Kirk (@hannahrosekirk) 's Twitter Profile Photo

A real honour and career dream that PRISM has won a NeurIPS Conference best paper award! 🌈 One year ago I was sat in a 13,000+ person audience at NeurIPS '23 having just finished data collection. Safe to say I've gone from feeling #stressed to very #blessed 😁

He He (@hhexiy) 's Twitter Profile Photo

Unbelievable. This quote is blatantly false and unnecessary for the argument. And she surely had expected the backlash with the patronizing NOTE. This is racism, not "cultural generalization". NeurIPS Conference

Manos Koukoumidis (@koukoumidis) 's Twitter Profile Photo

If AI isn’t truly open, it will fail us. We can’t lock our greatest invention yet inside a black box just so that a few can freely monetize it. AI needs its Linux moment, and so we started working towards it. This can only succeed if we all work together! #oumi #opensource

Jane Pan (@janepan_) 's Twitter Profile Photo

When benchmarks talk, do LLMs listen? Our new paper shows that evaluating code LLMs with interactive feedback significantly affects model performance compared to standard static benchmarks! Work w/ Ryan Shar, Jacob Pfau, Ameet Talwalkar, He He, and Valerie Chen! [1/6]

NYU Center for Data Science (@nyudatascience) 's Twitter Profile Photo

CDS is hiring a Clinical Professor of Data Science. Teach ML, programming, and specialized courses in our 60 5th Ave building. Renewable contracts with promotion opportunities. Apply by April 1, 2025. For details, see: apply.interfolio.com/155349 #MachineLearning #ML #AIjobs

Naomi Saphra hiring a lab 🧈🪰 (@nsaphra) 's Twitter Profile Photo

Life update: I'm starting as faculty at Boston University in 2026! BU has SCHEMES for LM interpretability & analysis, so I couldn't be more pumped to join a burgeoning supergroup w/ Najoung Kim 🫠 Aaron Mueller. Looking for my first students, so apply and reach out!

Yulin Chen (@yulinchen99) 's Twitter Profile Photo

Reasoning models overthink, generating multiple answers during reasoning. Is it because they can’t tell which ones are right? No! We find that while reasoning models encode strong correctness signals during chain-of-thought, they may not use them optimally. 🧵 below

Jane Pan (@janepan_) 's Twitter Profile Photo

Do reasoning models know when their answers are right?🤔 Really excited about this work led by Anqi and Yulin Chen. Check out this thread below!

Yulin Chen (@yulinchen99) 's Twitter Profile Photo

We're excited by the wide attention from the community—thank you for your support! We've released code, trained probes, and the generated CoT data👇 github.com/AngelaZZZ-611/… Labeled answer data is on its way. Stay tuned!

Jiaxin Wen @ICLR2025 (@jiaxinwen22) 's Twitter Profile Photo

I'll present this paper tomorrow (10:00-12:30 am, poster at Hall 3 #300). Let's chat about reward hacking against real humans, not just proxy rewards.

Vishakh Padmakumar (@vishakh_pk) 's Twitter Profile Photo

What does it mean for #LLM output to be novel? In work w/ John(Yueh-Han) Chen, Jane Pan, Valerie Chen, He He we argue it needs to be both original and high quality. While prompting tricks trade one for the other, better models (scaling/post-training) can shift the novelty frontier 🧵

He He (@hhexiy) 's Twitter Profile Photo

Automating AI research is bottlenecked by verification speed (running experiments takes time). Our new paper explores whether LLMs can tell which ideas will work before executing them, and they appear to have better research intuition than human researchers.

Jiaxin Wen @ICLR2025 (@jiaxinwen22) 's Twitter Profile Photo

New Anthropic research: We elicit capabilities from pretrained models using no external supervision, often competitive or better than using human supervision. Using this approach, we are able to train a Claude 3.5-based assistant that beats its human-supervised counterpart.

Percy Liang (@percyliang) 's Twitter Profile Photo

Wrapped up Stanford CS336 (Language Models from Scratch), taught with an amazing team: Tatsunori Hashimoto, Marcel Rød, Neil Band, Rohith Kuditipudi. Researchers are becoming detached from the technical details of how LMs work. In CS336, we try to fix that by having students build everything:

He He (@hhexiy) 's Twitter Profile Photo

Talking to ChatGPT isn’t like talking to a collaborator yet. It doesn’t track what you really want to do—only what you just said. Check out work led by John (Yueh-Han) Chen and @rico_angel that shows how attackers can exploit this, and a simple fix: just look at more context!