Ilya Sutskever (@ilyasut)'s Twitter Profile
Ilya Sutskever

@ilyasut

SSI @SSI

ID: 1720046887

Joined: 01-09-2013 19:32:15

1.1K Tweets

482,482 Followers

3 Following

OpenAI (@openai):

Sam Altman is back as CEO, Mira Murati as CTO and Greg Brockman as President. OpenAI has a new initial board. Messages from Sam Altman and board chair Bret Taylor openai.com/blog/sam-altma…

Jan Leike (@janleike):

Super excited about our new research direction for aligning smarter-than-human AI: We finetune large models to generalize from weak supervision—using small models instead of humans as weak supervisors. Check out our new paper: openai.com/research/weak-…

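In outline, the recipe is: train a small model on ground truth, have it label data for a much larger model, then finetune the large model on those imperfect labels and see whether it outperforms its teacher. Below is a minimal toy analogue using scikit-learn classifiers in place of the paper's GPT-2- and GPT-4-scale models; every model and dataset choice here is an illustrative assumption, not the paper's setup.

```python
# Toy weak-to-strong generalization loop (illustrative only).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=20_000, n_features=40,
                           n_informative=10, random_state=0)
# Give the weak supervisor only a small slice of ground truth.
X_weak, X_rest, y_weak, y_rest = train_test_split(X, y, train_size=2_000,
                                                  random_state=0)
# y_student (ground truth) is deliberately withheld from the student.
X_student, X_test, y_student, y_test = train_test_split(X_rest, y_rest,
                                                        test_size=0.5,
                                                        random_state=0)

# "Weak supervisor": a small model trained on real labels.
weak = LogisticRegression(max_iter=1_000).fit(X_weak, y_weak)

# "Strong student": a bigger model trained ONLY on the weak model's labels.
weak_labels = weak.predict(X_student)
strong = MLPClassifier(hidden_layer_sizes=(256, 256), max_iter=50,
                       random_state=0).fit(X_student, weak_labels)

print(f"weak supervisor accuracy: {weak.score(X_test, y_test):.3f}")
print(f"weak-to-strong accuracy:  {strong.score(X_test, y_test):.3f}")
```

The question the paper studies is whether the student merely imitates the weak labels, errors included, or recovers some of its own latent capability; in their GPT experiments the answer lands in between.
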
OpenAI (@openai):

We're announcing, together with Eric Schmidt: Superalignment Fast Grants. $10M in grants for technical research on aligning superhuman AI systems, including weak-to-strong generalization, interpretability, scalable oversight, and more. Apply by Feb 18! openai.com/blog/superalig…

Leopold Aschenbrenner (@leopoldasch):

RLHF works great for today's models. But aligning future superhuman models will present fundamentally new challenges. We need new approaches + scientific understanding. New researchers can make enormous contributions—and we want to fund you! Apply by Feb 18!

OpenAI (@openai):

In the future, humans will need to supervise AI systems much smarter than them. We study an analogy: small models supervising large models. Read the Superalignment team's first paper showing progress on a new approach, weak-to-strong generalization: openai.com/research/weak-…

OpenAI (@openai):

Large pretrained models have excellent raw capabilities—but can we elicit these fully with only weak supervision? GPT-4 supervised by ~GPT-2 recovers performance close to GPT-3.5 supervised by humans—generalizing to solve even hard problems where the weak supervisor failed!

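The paper condenses comparisons like this into a single metric, performance gap recovered (PGR): the fraction of the gap between the weak supervisor and a strong model trained with ground truth that weak-to-strong training closes. A quick sketch; the accuracies below are made-up placeholders, not the paper's numbers.

```python
def performance_gap_recovered(weak_acc: float, w2s_acc: float,
                              strong_ceiling: float) -> float:
    """PGR = 1.0 when the weakly supervised student matches a strong model
    trained on ground truth; 0.0 when it only matches its weak supervisor."""
    return (w2s_acc - weak_acc) / (strong_ceiling - weak_acc)

# Placeholder accuracies, for illustration only:
print(performance_gap_recovered(weak_acc=0.60, w2s_acc=0.76,
                                strong_ceiling=0.80))  # ~0.8
```
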
Pavel Izmailov (@pavel_izmailov):

Extremely excited to have this work out, the first paper from the Superalignment team! We study how large models can generalize from supervision of much weaker models. x.com/OpenAI/status/…

Boaz Barak (@boazbaraktcs):

My view is that what makes super-alignment "super" is ensuring we can safely scale the capabilities of AIs even though we can't scale their human supervisors. For this, it is imperative to study the "weak teacher strong student" setting. Paper shows great promise in this area!

Leo Gao (@nabla_theta):

new paper! one reason aligning superintelligence is hard is because it will be different from current models, so doing useful empirical research today is hard. we fix one major disanalogy of previous empirical setups. I'm excited for future work making it even more analogous.

Jan Leike (@janleike):

Kudos especially to Collin Burns for being the visionary behind this work, Pavel Izmailov for all the great scientific inquisition, Ilya Sutskever for stoking the fires, Jan Hendrik Kirchner and Leopold Aschenbrenner for moving things forward every day. Amazing ✨

Greg Brockman (@gdb):

New direction for AI alignment — weak-to-strong generalization. Promising initial results: we used outputs from a weak model (fine-tuned GPT-2) to communicate a task to a stronger model (GPT-4), resulting in intermediate (GPT-3-level) performance.
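
One reason the result is intermediate rather than ceiling-level: a student finetuned naively on weak labels learns to imitate its supervisor's mistakes. The paper's best results use an auxiliary confidence loss that lets the strong model confidently disagree with its teacher. A rough PyTorch sketch of that idea; the fixed weight alpha and the use of hardened self-predictions are simplifications here, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def confidence_aux_loss(strong_logits: torch.Tensor,
                        weak_probs: torch.Tensor,
                        alpha: float = 0.5) -> torch.Tensor:
    """Blend imitation of the weak supervisor's soft labels with
    cross-entropy against the strong model's own hardened predictions,
    so the student can stay confident where it disagrees with its teacher."""
    # (i) imitate the weak supervisor's label distribution
    ce_weak = F.cross_entropy(strong_logits, weak_probs)
    # (ii) reinforce the student's own (detached, hardened) predictions
    hard_self = strong_logits.detach().argmax(dim=-1)
    ce_self = F.cross_entropy(strong_logits, hard_self)
    return (1 - alpha) * ce_weak + alpha * ce_self
```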

Sam Altman (@sama):

i'd particularly like to recognize Collin Burns, who came to openai excited to pursue this vision and helped get the rest of the team excited about it, for today's generalization result!