Angelica Chen (@_angie_chen) 's Twitter Profile
Angelica Chen

@_angie_chen

She/Her | PhD student @NYUDataScience - formerly at @Princeton 🐅
angie-chen at 🦋
Interested in deep learning+NLP, pastries, and running

ID: 703274475576541189

Link: http://angie-chen55.github.io/
Joined: 26-02-2016 17:44:36

109 Tweets

1.1K Followers

437 Following

Sadhika Malladi (@sadhikamalladi) 's Twitter Profile Photo

Theory + exps from our new work arxiv.org/abs/2405.19534: preference tuning algs often don't and can't teach a model to output preferred responses w/ higher prob than rejected ones! Analysis: find hard-to-learn pairs, forecast model perf w/ ideal training, study on- vs off-policy!
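
For readers skimming the thread, a tiny self-contained sketch of the quantity at stake (field and function names are illustrative, not the paper's code): ranking accuracy asks, pair by pair, whether the DPO-style implicit reward prefers the chosen response over the rejected one.

```python
# Hypothetical sketch (names are illustrative, not the paper's code):
# the implicit reward is the policy-vs-reference log-probability ratio
# (up to a beta factor, which doesn't change the comparison).

def ranking_accuracy(pairs):
    correct = 0
    for p in pairs:
        reward_chosen = p["logp_policy_chosen"] - p["logp_ref_chosen"]
        reward_rejected = p["logp_policy_rejected"] - p["logp_ref_rejected"]
        correct += reward_chosen > reward_rejected
    return correct / len(pairs)

# Made-up numbers: here the rejected response gets the higher implicit reward,
# so this single pair is ranked incorrectly and accuracy is 0.0.
print(ranking_accuracy([
    {"logp_policy_chosen": -12.0, "logp_ref_chosen": -11.0,
     "logp_policy_rejected": -10.0, "logp_ref_rejected": -9.5},
]))
```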

Naomi Saphra hiring a lab 🧈🪰 (@nsaphra) 's Twitter Profile Photo

Wild result. The most popular ranking-based human feedback systems fail, empirically and theoretically, to optimize their ranking objective. Goes beyond issues of interannotator- and self-inconsistency. The algorithm itself fails to approach consistency with ranking data!

Sadhika Malladi (@sadhikamalladi) 's Twitter Profile Photo

My new blog post explains from first principles how length normalization in preference learning objectives (e.g., SimPO) can facilitate learning from model-annotated preference data. Check it out! cs.princeton.edu/~smalladi/blog…
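
A toy illustration of the length-normalization idea (simplified; SimPO also scales the average log-probability by a beta factor and uses a target margin, which the blog post covers): scoring by the average rather than the summed token log-probability removes the built-in penalty on longer responses.

```python
# Simplified sketch, not SimPO's full objective: sum vs. average log-probability.

def sequence_score(token_logps, length_normalize=True):
    total = sum(token_logps)
    return total / len(token_logps) if length_normalize else total

short = [-1.0, -1.0]        # 2 tokens
long_ = [-1.0] * 10         # 10 tokens, same per-token quality
print(sequence_score(short, False), sequence_score(long_, False))  # -2.0 -10.0
print(sequence_score(short, True), sequence_score(long_, True))    # -1.0 -1.0
```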

Richard Pang (@yzpang_) 's Twitter Profile Photo

Self-rewarding LMs at #icml2024 ! Thru iterative DPO (w/ a small amount of seed data), LLM instruction following ↑ (AlpacaEval 2.0, human, MT-bench) & reward modeling ↑ (corr w human rankings). Jing Xu will be presenting in Vienna (Tues 7/23 11:30am); please stop by! (1/2)
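
A rough sketch of the iterative self-rewarding loop described above (stub functions only; the names are placeholders, not the paper's implementation): each round, the current model generates candidates, scores them itself as a judge, and the best/worst pair per prompt becomes DPO training data for the next round.

```python
# Illustrative stubs only -- not the paper's code.

def generate_candidates(model, prompt, n=4):
    # stub: in practice, sample n responses from the language model
    return [f"{model}|{prompt}|cand{i}" for i in range(n)]

def self_judge(model, prompt, response):
    # stub: in practice, the same model scores the response (LLM-as-a-judge)
    return hash((model, prompt, response)) % 10

def dpo_update(model, preference_pairs):
    # stub: in practice, run DPO on the collected (chosen, rejected) pairs
    return model + "+dpo"

model = "seed-sft-model"
prompts = ["write a haiku", "summarize this paper"]
for _ in range(3):
    pairs = []
    for prompt in prompts:
        ranked = sorted(generate_candidates(model, prompt),
                        key=lambda c: self_judge(model, prompt, c))
        pairs.append({"prompt": prompt, "chosen": ranked[-1], "rejected": ranked[0]})
    model = dpo_update(model, pairs)
print(model)  # seed-sft-model+dpo+dpo+dpo
```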

Sadhika Malladi (@sadhikamalladi) 's Twitter Profile Photo

Be sure to stop by Angie's oral presentation and our poster on our preference learning work (arxiv.org/abs/2405.19534) at the MHFAIA workshop at ICML! We'll also be presenting this poster at the Theoretical Foundations of Foundation Models (TF2M) workshop :)

Naomi Saphra hiring a lab 🧈🪰 (@nsaphra) 's Twitter Profile Photo

What makes some LM interpretability research “mechanistic”? In our new position paper in BlackboxNLP, Sarah Wiegreffe and I argue that the practical distinction was never technical, but a historical artifact that we should be—and are—moving past to bridge communities.

Nathan C. Frey (@nc_frey) 's Twitter Profile Photo

LLMs are highly constrained biological sequence optimizers. In new work led by Angelica Chen & Samuel Stanton, we show how to drive an active learning loop for protein design with an LLM. 1/

Samuel Stanton (@samuel_stanton_) 's Twitter Profile Photo

LLMs are clearly very general interfaces, but we weren't sure they could be made precise enough for protein design to really work. With active data collection, the right preference tuning, and test-time scaling (or just search as we used to call it) it looks like yes!
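
A rough sketch of the kind of LLM-in-the-loop active learning described here (stubs and names are assumptions, not LLOME's API): propose candidates with the LLM, score them with an oracle, keep the best as new seeds, and preference-tune the model before the next round.

```python
# Illustrative stubs only -- not the actual LLOME implementation.

def llm_propose(model, seed_sequences, n=8):
    # stub: in practice, prompt/sample the LLM for candidate sequences
    return [s + "A" * i for s in seed_sequences
            for i in range(1, n // len(seed_sequences) + 1)]

def oracle_score(seq):
    # stub: in practice, a lab assay or surrogate model scores each candidate
    return -abs(len(seq) - 12)

def preference_tune(model, ranked_candidates):
    # stub: in practice, fine-tune the LLM on (better, worse) pairs from the oracle
    return model + "+tuned"

model, pool = "base-llm", ["MKT", "MKV"]
for _ in range(3):                       # each round: propose -> score -> tune
    candidates = llm_propose(model, pool)
    ranked = sorted(candidates, key=oracle_score, reverse=True)
    pool = ranked[:2]                    # keep the best candidates as next seeds
    model = preference_tune(model, ranked)
print(model, pool)
```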

Angelica Chen (@_angie_chen) 's Twitter Profile Photo

Check out Sadhika's talk tomorrow! She'll be talking about our paper "Preference Learning Algorithms Do Not Learn Preference Rankings" (arxiv.org/abs/2405.19534) as well as some very cool follow-up work :)

Richard Pang (@yzpang_) 's Twitter Profile Photo

🚨🔔Foundational graph search task as testbed: for some distribution, transformers can learn to search (100% acc). We interpreted their algo!! But as graph size ↑, transformers struggle. Scaling up # params does not help; CoT does not help. 1.5 years of learning in 10 pages!
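
For concreteness, a minimal sketch of what a graph-search training example might look like (the serialization is an assumption, not the paper's exact format): a directed graph plus a start/goal query, with the BFS shortest path as the target.

```python
# Illustrative example construction for a graph-search testbed (format assumed).

from collections import deque

def bfs_path(edges, start, goal):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
    prev, queue = {start: None}, deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:  # reconstruct the path by walking predecessors back to start
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in adj.get(u, []):
            if v not in prev:
                prev[v] = u
                queue.append(v)
    return None

edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (2, 5), (5, 4)]
prompt = " ".join(f"{a}>{b}" for a, b in edges) + " | start 0 goal 4 ->"
target = bfs_path(edges, 0, 4)
print(prompt, target)  # ... | start 0 goal 4 -> [0, 1, 3, 4]
```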

Nathan C. Frey (@nc_frey) 's Twitter Profile Photo

Two NeurIPS Conference workshop spotlight talks from our lab this year! Amy Lu will present on all-atom protein generation from sequence-only inputs at MLSB and Angelica Chen will present on LLMs as highly-constrained biophysical sequence optimizers at AIDrugX

Angelica Chen (@_angie_chen) 's Twitter Profile Photo

I’ll be at NeurIPS this week! Presenting at the Thursday 4:30pm poster session and giving a spotlight talk at the AIDrugX workshop on Sunday. Also, I’ve finally joined 🦋. Come find me, both at NeurIPS and on 🦋! ☺️

Furong Huang (@furongh) 's Twitter Profile Photo

I saw a slide circulating on social media last night while working on a deadline. I didn’t comment immediately because I wanted to understand the full context before speaking. After learning more, I feel compelled to address what I witnessed during an invited talk at NeurIPS 2024

NYU Center for Data Science (@nyudatascience) 's Twitter Profile Photo

CDS PhD student Angelica Chen presents LLOME, using LLMs to optimize synthetic sequences with potential applications for drug design. Co-led by Samuel Stanton & Nathan C. Frey and with insights from Kyunghyun Cho, Richard Bonneau, and others at Prescient Design. nyudatascience.medium.com/language-model…

Jason Weston (@jaseweston) 's Twitter Profile Photo

🚨 Diverse Preference Optimization (DivPO) 🚨 SOTA LLMs have model collapse🫠: they can't generate diverse creative writing or synthetic data 🎨 DivPO trains for both high reward & diversity, vastly improving variety with similar quality. Paper 📝: arxiv.org/abs/2501.18101 🧵below
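
A toy sketch of one way to pick preference pairs that trade off reward and diversity (an illustrative criterion, not the paper's exact DivPO rule): choose a high-reward response that differs most from the rest of the pool, and reject a low-reward, redundant one, so training pushes toward outputs that are both good and varied.

```python
# Illustrative selection rule only -- not the DivPO paper's exact criterion.

def word_overlap(a, b):
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def diversity(candidate, pool):
    # mean dissimilarity to every other candidate in the pool
    others = [c for c in pool if c != candidate]
    return sum(1 - word_overlap(candidate, o) for o in others) / max(1, len(others))

def pick_pair(pool, rewards, reward_floor):
    good = [c for c in pool if rewards[c] >= reward_floor]
    chosen = max(good, key=lambda c: diversity(c, pool))                  # good and diverse
    rejected = min(pool, key=lambda c: (rewards[c], diversity(c, pool)))  # weak and redundant
    return chosen, rejected

pool = ["the cat sat on the mat",
        "a feline lounged upon the rug",
        "the cat sat on the mat today",
        "bad output"]
rewards = dict(zip(pool, [0.90, 0.85, 0.88, 0.20]))
print(pick_pair(pool, rewards, reward_floor=0.8))
```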

Vishakh Padmakumar (@vishakh_pk) 's Twitter Profile Photo

What does it mean for #LLM output to be novel? In work w/ John (Yueh-Han) Chen, Jane Pan, Valerie Chen, and He He, we argue it needs to be both original and high quality. While prompting tricks trade one for the other, better models (scaling/post-training) can shift the novelty frontier 🧵
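
A toy rendering of the "original and high quality" framing (the metrics below are simplistic stand-ins, not the paper's): originality as low n-gram overlap with a reference corpus, quality from some external judge, and novelty only when both clear a threshold.

```python
# Illustrative stand-in metrics only -- not the paper's definitions.

def ngrams(text, n=3):
    toks = text.split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def originality(text, corpus):
    seen = set().union(*(ngrams(doc) for doc in corpus))
    grams = ngrams(text)
    return 1 - len(grams & seen) / max(1, len(grams))

def is_novel(text, corpus, quality_score, q_min=0.7, o_min=0.5):
    # novel = high quality (per some judge) AND sufficiently original
    return quality_score >= q_min and originality(text, corpus) >= o_min

corpus = ["the quick brown fox jumps over the lazy dog"]
print(is_novel("the quick brown fox naps in the warm sun", corpus, quality_score=0.9))
```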
