Angelica Chen (@_angie_chen) 's Twitter Profile
Angelica Chen

@_angie_chen

She/Her | PhD student @NYUDataScience - formerly at @Princeton 🐅
angie-chen at 🦋
Interested in deep learning+NLP, pastries, and running

ID: 703274475576541189

Link: http://angie-chen55.github.io/
Joined: 26-02-2016 17:44:36

109 Tweets

1.1K Followers

437 Following

Sadhika Malladi (@sadhikamalladi) 's Twitter Profile Photo

Theory + exps from our new work arxiv.org/abs/2405.19534: preference tuning algs often don't and can't teach a model to output preferred responses w/ higher prob than rejected ones! Analysis: find hard-to-learn pairs, forecast model perf w/ ideal training, study on- vs off-policy!
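
For readers skimming the thread, a tiny self-contained sketch of the quantity at stake (field and function names are illustrative, not the paper's code): ranking accuracy asks, pair by pair, whether the DPO-style implicit reward prefers the chosen response over the rejected one.

```python
# Hypothetical sketch (names are illustrative, not the paper's code):
# the implicit reward is the policy-vs-reference log-probability ratio
# (up to a beta factor, which doesn't change the comparison).

def ranking_accuracy(pairs):
    correct = 0
    for p in pairs:
        reward_chosen = p["logp_policy_chosen"] - p["logp_ref_chosen"]
        reward_rejected = p["logp_policy_rejected"] - p["logp_ref_rejected"]
        correct += reward_chosen > reward_rejected
    return correct / len(pairs)

# Made-up numbers: here the rejected response gets the higher implicit reward,
# so this single pair is ranked incorrectly and accuracy is 0.0.
print(ranking_accuracy([
    {"logp_policy_chosen": -12.0, "logp_ref_chosen": -11.0,
     "logp_policy_rejected": -10.0, "logp_ref_rejected": -9.5},
]))
```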

Naomi Saphra hiring a lab 🧈🪰 (@nsaphra) 's Twitter Profile Photo

Wild result. The most popular ranking-based human feedback systems fail, empirically and theoretically, to optimize their ranking objective. Goes beyond issues of interannotator- and self-inconsistency. The algorithm itself fails to approach consistency with ranking data!

Sadhika Malladi (@sadhikamalladi) 's Twitter Profile Photo

My new blog post explains from first principles how length normalization in preference learning objectives (e.g., SimPO) can facilitate learning from model-annotated preference data. Check it out! cs.princeton.edu/~smalladi/blog…
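
A toy illustration of the length-normalization idea (simplified; SimPO also scales the average log-probability by a beta factor and uses a target margin, which the blog post covers): scoring by the average rather than the summed token log-probability removes the built-in penalty on longer responses.

```python
# Simplified sketch, not SimPO's full objective: sum vs. average log-probability.

def sequence_score(token_logps, length_normalize=True):
    total = sum(token_logps)
    return total / len(token_logps) if length_normalize else total

short = [-1.0, -1.0]        # 2 tokens
long_ = [-1.0] * 10         # 10 tokens, same per-token quality
print(sequence_score(short, False), sequence_score(long_, False))  # -2.0 -10.0
print(sequence_score(short, True), sequence_score(long_, True))    # -1.0 -1.0
```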

Richard Pang (@yzpang_) 's Twitter Profile Photo

Self-rewarding LMs at #icml2024 ! Thru iterative DPO (w/ a small amount of seed data), LLM instruction following ↑ (AlpacaEval 2.0, human, MT-bench) & reward modeling ↑ (corr w human rankings). Jing Xu will be presenting in Vienna (Tues 7/23 11:30am); please stop by! (1/2)
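
A rough sketch of the iterative self-rewarding loop described above (stub functions only; the names are placeholders, not the paper's implementation): each round, the current model generates candidates, scores them itself as a judge, and the best/worst pair per prompt becomes DPO training data for the next round.

```python
# Illustrative stubs only -- not the paper's code.

def generate_candidates(model, prompt, n=4):
    # stub: in practice, sample n responses from the language model
    return [f"{model}|{prompt}|cand{i}" for i in range(n)]

def self_judge(model, prompt, response):
    # stub: in practice, the same model scores the response (LLM-as-a-judge)
    return hash((model, prompt, response)) % 10

def dpo_update(model, preference_pairs):
    # stub: in practice, run DPO on the collected (chosen, rejected) pairs
    return model + "+dpo"

model = "seed-sft-model"
prompts = ["write a haiku", "summarize this paper"]
for _ in range(3):
    pairs = []
    for prompt in prompts:
        ranked = sorted(generate_candidates(model, prompt),
                        key=lambda c: self_judge(model, prompt, c))
        pairs.append({"prompt": prompt, "chosen": ranked[-1], "rejected": ranked[0]})
    model = dpo_update(model, pairs)
print(model)  # seed-sft-model+dpo+dpo+dpo
```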

Sadhika Malladi (@sadhikamalladi) 's Twitter Profile Photo

Be sure to stop by Angie's oral presentation and our poster on our preference learning work (arxiv.org/abs/2405.19534) at the MHFAIA workshop at ICML! We'll also be presenting this poster at the Theoretical Foundations of Foundation Models (TF2M) workshop :)

Naomi Saphra hiring a lab 🧈🪰 (@nsaphra) 's Twitter Profile Photo

What makes some LM interpretability research “mechanistic”? In our new position paper in BlackboxNLP, Sarah Wiegreffe and I argue that the practical distinction was never technical, but a historical artifact that we should be—and are—moving past to bridge communities.

Nathan C. Frey (@nc_frey) 's Twitter Profile Photo

LLMs are highly constrained biological sequence optimizers. In new work led by Angelica Chen & Samuel Stanton, we show how to drive an active learning loop for protein design with an LLM. 1/

Samuel Stanton (@samuel_stanton_) 's Twitter Profile Photo

LLMs are clearly very general interfaces, but we weren't sure they could be made precise enough for protein design to really work. With active data collection, the right preference tuning, and test-time scaling (or just search as we used to call it) it looks like yes!
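
A rough sketch of the kind of LLM-in-the-loop active learning described here (stubs and names are assumptions, not LLOME's API): propose candidates with the LLM, score them with an oracle, keep the best as new seeds, and preference-tune the model before the next round.

```python
# Illustrative stubs only -- not the actual LLOME implementation.

def llm_propose(model, seed_sequences, n=8):
    # stub: in practice, prompt/sample the LLM for candidate sequences
    return [s + "A" * i for s in seed_sequences
            for i in range(1, n // len(seed_sequences) + 1)]

def oracle_score(seq):
    # stub: in practice, a lab assay or surrogate model scores each candidate
    return -abs(len(seq) - 12)

def preference_tune(model, ranked_candidates):
    # stub: in practice, fine-tune the LLM on (better, worse) pairs from the oracle
    return model + "+tuned"

model, pool = "base-llm", ["MKT", "MKV"]
for _ in range(3):                       # each round: propose -> score -> tune
    candidates = llm_propose(model, pool)
    ranked = sorted(candidates, key=oracle_score, reverse=True)
    pool = ranked[:2]                    # keep the best candidates as next seeds
    model = preference_tune(model, ranked)
print(model, pool)
```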

Angelica Chen (@_angie_chen) 's Twitter Profile Photo

Check out Sadhika's talk tomorrow! She'll be talking about our paper "Preference Learning Algorithms Do Not Learn Preference Rankings" (arxiv.org/abs/2405.19534) as well as some very cool follow-up work :)

Richard Pang (@yzpang_) 's Twitter Profile Photo

🚨🔔Foundational graph search task as testbed: for some distribution, transformers can learn to search (100% acc). We interpreted their algo!! But as graph size ↑, transformers struggle. Scaling up # params does not help; CoT does not help. 1.5 years of learning in 10 pages!
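
For concreteness, a minimal sketch of what a graph-search training example might look like (the serialization is an assumption, not the paper's exact format): a directed graph plus a start/goal query, with the BFS shortest path as the target.

```python
# Illustrative example construction for a graph-search testbed (format assumed).

from collections import deque

def bfs_path(edges, start, goal):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
    prev, queue = {start: None}, deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:  # reconstruct the path by walking predecessors back to start
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in adj.get(u, []):
            if v not in prev:
                prev[v] = u
                queue.append(v)
    return None

edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (2, 5), (5, 4)]
prompt = " ".join(f"{a}>{b}" for a, b in edges) + " | start 0 goal 4 ->"
target = bfs_path(edges, 0, 4)
print(prompt, target)  # ... | start 0 goal 4 -> [0, 1, 3, 4]
```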

Nathan C. Frey (@nc_frey) 's Twitter Profile Photo

Two NeurIPS Conference workshop spotlight talks from our lab this year! Amy Lu will present on all-atom protein generation from sequence-only inputs at MLSB and Angelica Chen will present on LLMs as highly-constrained biophysical sequence optimizers at AIDrugX

Angelica Chen (@_angie_chen) 's Twitter Profile Photo

I’ll be at NeurIPS this week! Presenting at the Thursday 4:30pm poster session and giving a spotlight talk at the AIDrugX workshop on Sunday. Also, I’ve finally joined 🦋. Come find me, both at NeurIPS and on 🦋! ☺️

Furong Huang (@furongh) 's Twitter Profile Photo

I saw a slide circulating on social media last night while working on a deadline. I didn’t comment immediately because I wanted to understand the full context before speaking. After learning more, I feel compelled to address what I witnessed during an invited talk at NeurIPS 2024

NYU Center for Data Science (@nyudatascience) 's Twitter Profile Photo

CDS PhD student Angelica Chen presents LLOME, using LLMs to optimize synthetic sequences with potential applications for drug design. Co-led by Samuel Stanton & Nathan C. Frey and with insights from Kyunghyun Cho, Richard Bonneau, and others at Prescient Design. nyudatascience.medium.com/language-model…

Jason Weston (@jaseweston) 's Twitter Profile Photo

🚨 Diverse Preference Optimization (DivPO) 🚨 SOTA LLMs have model collapse🫠: they can't generate diverse creative writing or synthetic data 🎨 DivPO trains for both high reward & diversity, vastly improving variety with similar quality. Paper 📝: arxiv.org/abs/2501.18101 🧵below
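
A toy sketch of one way to pick preference pairs that trade off reward and diversity (an illustrative criterion, not the paper's exact DivPO rule): choose a high-reward response that differs most from the rest of the pool, and reject a low-reward, redundant one, so training pushes toward outputs that are both good and varied.

```python
# Illustrative selection rule only -- not the DivPO paper's exact criterion.

def word_overlap(a, b):
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def diversity(candidate, pool):
    # mean dissimilarity to every other candidate in the pool
    others = [c for c in pool if c != candidate]
    return sum(1 - word_overlap(candidate, o) for o in others) / max(1, len(others))

def pick_pair(pool, rewards, reward_floor):
    good = [c for c in pool if rewards[c] >= reward_floor]
    chosen = max(good, key=lambda c: diversity(c, pool))                  # good and diverse
    rejected = min(pool, key=lambda c: (rewards[c], diversity(c, pool)))  # weak and redundant
    return chosen, rejected

pool = ["the cat sat on the mat",
        "a feline lounged upon the rug",
        "the cat sat on the mat today",
        "bad output"]
rewards = dict(zip(pool, [0.90, 0.85, 0.88, 0.20]))
print(pick_pair(pool, rewards, reward_floor=0.8))
```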

Vishakh Padmakumar (@vishakh_pk) 's Twitter Profile Photo

What does it mean for #LLM output to be novel? In work w/ John (Yueh-Han) Chen, Jane Pan, Valerie Chen, and He He, we argue it needs to be both original and high quality. While prompting tricks trade one for the other, better models (scaling/post-training) can shift the novelty frontier 🧵
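
A toy rendering of the "original and high quality" framing (the metrics below are simplistic stand-ins, not the paper's): originality as low n-gram overlap with a reference corpus, quality from some external judge, and novelty only when both clear a threshold.

```python
# Illustrative stand-in metrics only -- not the paper's definitions.

def ngrams(text, n=3):
    toks = text.split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def originality(text, corpus):
    seen = set().union(*(ngrams(doc) for doc in corpus))
    grams = ngrams(text)
    return 1 - len(grams & seen) / max(1, len(grams))

def is_novel(text, corpus, quality_score, q_min=0.7, o_min=0.5):
    # novel = high quality (per some judge) AND sufficiently original
    return quality_score >= q_min and originality(text, corpus) >= o_min

corpus = ["the quick brown fox jumps over the lazy dog"]
print(is_novel("the quick brown fox naps in the warm sun", corpus, quality_score=0.9))
```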
