Stanislas Polu (@spolu) Twitter Tweets • TwiCopy

Stanislas Polu

4 months ago

Inverse scaling laws were fundamental in the exploration of pre-training scaling laws, so it’s super exciting to see them emerge for test-time compute scaling. They eventually faded away (because scale won or because model size started being something we don’t readily have access

Mario Gabriele 🦊

@mariogabriele

3 months ago

Just dropped our latest episode featuring Stanislas Polu and Harrison Chase! The Future of AI Agents What's next for AI agents, and how will they change how we work? Stanislas Polu (CEO of Dust, formerly OpenAI research) and Harrison Chase (CEO of LangChain, one of the most

Stanislas Polu

@spolu

3 months ago

Fantastic graph of MiniF2F[0] SotA over the years by the seed prover[1] team (which saturates it!) (h/t Yann Fleureau) When we created this benchmark with Kunhao Zheng and Jesse Michael Han, getting to 100% seemed completely alien. It took 4 years to get there (though arguably DeepMind

Stanislas Polu

@spolu

3 months ago

Congrats Oleg Mürk! You’ve been targeting IOI since the very beginning, before reasoning models, before ChatGPT. In these ancient, long-forgotten times we were all dreaming of that brave future where machines could match humans at these highest marks of intelligence. The real

Stanislas Polu

@spolu

3 months ago

« The outer loop Era » It’s very hard to predict the future even a few months out in terms of model capabilities. At the end of 2024, word on the street was that pre-training was saturated (which is mostly true since model sizes have stopped increasing dramatically for years now)

Stanislas Polu

@spolu

2 months ago

I had missed a paper from ByteDance from the end of July that is, to my knowledge, the best existence proof that outer loop approaches can yield incredibly powerful results with current models, confirming the potential for this class of approaches. >> Solving Formal Math Problems by

Stanislas Polu

@spolu

2 months ago

If you're in Paris, come! I'll be talking at this one about the "outer loop era".

Stanislas Polu

@spolu

2 months ago

In case useful... (i) things I wish I had done differently at Dust early on, which had a somewhat large impact at different stages. If you're starting your project, trust me: - Drizzle not Sequelize. Sequelize is a bit too complex for its own good. Drizzle comes battery

Stanislas Polu

@spolu

a month ago

OpenAI AgentBuilder Nov 2025 vs Google PromptChainer Mar 2022 (paper, not a product). At about the same time I remember chatting with someone from Sutter Hill Ventures who had built an entire similar agent builder internally. Same ideas, new products. 3 years apart. Obviously since

Stanislas Polu

@spolu

a month ago

"Less is more" HRM[0] is very reminiscent of UT[1] which was released in 2019 (and that we spent a lot of time studying at OpenAI in the reasoning team circa 2020). Uses same dynamic halting approach (ACT). I'm a bit suspicious it does appear in their ablation studies. Exciting

Dust

@dusthq

a month ago

Today we're introducing Frames: interactive data visualization built directly into Dust agents. Your agents now create customized content your team can explore and share, without coming back to you for every variation.

Stanislas Polu

@spolu

a month ago

We are lucky to be bootstrapping Dust with Dust. Building the product we use to build a company was not completely intentional at first but turned out to be the greatest part of the job if you ask me! Dogfooding is not a way to get a better product for us, it’s a core operating

Stanislas Polu

@spolu

24 days ago

Non-poaching agreements are quietly damaging the French tech ecosystem. Non-poaching is per se illegal in the US and France will likely follow this direction soon. Beyond legal compliance, they're simply bad for employees and hence a bad idea. At Dust, we compete for talent the

Stanislas Polu

@spolu

23 days ago

OSS models catching up on frontier proprietary ones is likely the triggering event for the burst of the current tech bubble (which is likely a good kind of a bubble, the ones that help technology permeate society). After that, it’ll be a rough semester (not years, world moving

Stanislas Polu

@spolu

12 days ago

"9 months later but 9 times better" We're releasing a deep-dive agent that has access to your entire company data (structured and unstructured) and all the MCP tools connected to a Dust workspace to perform high-level tasks on longer time horizons. It has taken over a number of
