Behrooz Ghorbani (@_ghorbani) 's Twitter Profile
Behrooz Ghorbani

@_ghorbani

Researcher at @OpenAI, studying large language models. Formerly @GoogleBrain and @stanford_ee. Opinions expressed are solely my own.

ID: 941178306744872970

Link: https://web.stanford.edu/~ghorbani/ · Joined: 14-12-2017 05:29:29

134 Tweets

447 Followers

491 Following

Sam Altman (@sama) 's Twitter Profile Photo

🎄🎅 starting tomorrow at 10 am pacific, we are doing 12 days of openai. each weekday, we will have a livestream with a launch or demo, some big ones and some stocking stuffers. we’ve got some great stuff to share, hope you enjoy! merry christmas.

Nat McAleese (@__nmca__) 's Twitter Profile Photo

o3 represents enormous progress in general-domain reasoning with RL — excited that we were able to announce some results today! Here’s a summary of what we shared about o3 in the livestream (1/n)
Shengjia Zhao (@shengjia_zhao) 's Twitter Profile Photo

We are also hiring top researchers/engineers to keep breaking the data wall and find out ways to pretrain both frontier models & extremely cost/performance efficient models. If you are interested in working on this, apply & drop me an email.

Sam Altman (@sama) 's Twitter Profile Photo

it is hard to overstate how much alec radford has contributed to the field, and how much of everyone's current progress traces back to his work. i believe he is a genius at the level of einstein, and also he is one of my favorite people ever--hard to imagine a nicer, warmer, or

OpenAI (@openai) 's Twitter Profile Photo

Today we’re rolling out a beta version of tasks—a new way to ask ChatGPT to do things for you at a future time. Whether it's one-time reminders or recurring actions, tell ChatGPT what you need and when, and it will automatically take care of it.

OpenAI (@openai) 's Twitter Profile Photo

OpenAI o3-mini is now available in ChatGPT and the API. Pro users will have unlimited access to o3-mini and Plus & Team users will have triple the rate limits (vs o1-mini). Free users can try o3-mini in ChatGPT by selecting the Reason button under the message composer.

Aidan Clark (@_aidan_clark_) 's Twitter Profile Photo

o3-mini's intelligence x speed combo is incredible, idk what to say other than just try it and see for yourself. This took 8 seconds, how long would it take you?

Aleksander Madry (@aleks_madry) 's Twitter Profile Photo

Do current LLMs perform simple tasks (e.g., grade school math) reliably? We know they don't (is 9.9 larger than 9.11?), but why? Turns out that, for one reason, benchmarks are too noisy to pinpoint such lingering failures. w/ Josh Vendrow Eddie Vendrow Sara Beery 1/5
Tejal Patwardhan (@tejalpatwardhan) 's Twitter Profile Photo

Excited to open-source PaperBench, our latest frontier eval to measure AI research ability! Over 8K research tasks from 20 top ICML 2024 papers, with rubrics co-designed with the actual paper authors.

Noam Brown (@polynoamial) 's Twitter Profile Photo

I'm fortunate to be able to devote my career to researching AI and building reasoning models like o3 for the world to use. If you want to join us in pushing forward the intelligence frontier, we're hiring at OpenAI.

François Chollet (@fchollet) 's Twitter Profile Photo

Key to research success: ambition in vision, but pragmatism in execution. You must be guided by a long-term, ambitious goal that addresses a fundamental problem, rather than chasing incremental gains on established benchmarks. Yet, your progress should be grounded by tractable

Behrooz Ghorbani (@_ghorbani) 's Twitter Profile Photo

Really cool paper! A valuable lesson we keep seeing in DL optimization research: poorly tuned hyperparameters frequently lead to misleading conclusions.

Andrej Karpathy (@karpathy) 's Twitter Profile Photo

Scaling up RL is all the rage right now, I had a chat with a friend about it yesterday. I'm fairly certain RL will continue to yield more intermediate gains, but I also don't expect it to be the full story. RL is basically "hey this happened to go well (/poorly), let me slightly

Jerry Tworek (@millionint) 's Twitter Profile Photo

To summarize this week:
- we released general purpose computer using agent
- got beaten by a single human in atcoder heuristics competition
- solved 5/6 new IMO problems with natural language proofs
All of those are based on the same single reinforcement learning system