Ping (Iris) Yu (@ping_iris_yu)'s Twitter Profile
Ping (Iris) Yu

@ping_iris_yu

Research Scientist @MetaAI (FAIR); Previously @amazon @TencentGlobal ; Opinions are my own.

ID: 1038103241966411778

Link: http://irisyu.me

Joined: 07-09-2018 16:34:34

12 Tweets

70 Followers

95 Following

AI at Meta (@aiatmeta)

Announcing OPT-IML: a new language model from Meta AI with 175B parameters, fine-tuned on 2,000 language tasks — openly available soon under a noncommercial license for research use cases. Research paper & more details on GitHub ⬇️

Jason Weston (@jaseweston)

🚨New Paper 🚨
Self-Alignment with Instruction Backtranslation

- New method auto-labels web text with instructions & curates high quality ones for FTing

- Our model Humpback 🐋 outperforms LIMA, Claude, Guanaco, davinci-003 & Falcon-Inst

arxiv.org/abs/2308.06259
(1/4)🧵

Ping (Iris) Yu (@ping_iris_yu)

Check our Shepherd model that can critique LLM generations! Training data is available on Github: github.com/facebookresear… x.com/MetaAI/status/…

Jason Weston (@jaseweston)

🚨 Distilling System 2 into System 1🚨
- System 2 LLMs spend compute to improve responses (CoT, BSM, RaR, Sys 2 Attention, ..)
- *System 2 distillation* keeps this improvement but distills it back into the base LLM (System 1) outputs
arxiv.org/abs/2407.06023
🧵(1/5)

Furong Huang (@furongh)

I saw a slide circulating on social media last night while working on a deadline. I didn’t comment immediately because I wanted to understand the full context before speaking. After learning more, I feel compelled to address what I witnessed during an invited talk at NeurIPS 2024

Jason Weston (@jaseweston)

💀 Introducing RIP: Rejecting Instruction Preferences💀

A method to *curate* high quality data, or *create* high quality synthetic data.

Large performance gains across benchmarks (AlpacaEval2, Arena-Hard, WildBench).

Paper 📄: arxiv.org/abs/2501.18578