Clem Bonnet @ICLR 2025 (@clementbonnet16) 's Twitter Profile
Clem Bonnet @ICLR 2025

@clementbonnet16

AI Research @ndea

ID: 1248279744421994501

Link: https://scholar.google.com/citations?user=H6euRhAAAAAJ · Joined: 09-04-2020 16:01:05

208 Tweets

806 Followers

529 Following

Dwarkesh Patel (@dwarkesh_sp) 's Twitter Profile Photo

I still haven't heard a good answer to this question, on or off the podcast.

AI researchers often tell me, "Don't worry about it, scale solves this."

But what is the rebuttal to someone who argues that this indicates a fundamental limitation?
Taelin (@victortaelin) 's Twitter Profile Photo

Please do not over-hype this post! HOC is doing a $4m post-seed at $100m valuation to build a dataset with the shortest possible solution (in BLC length) for each ARC Prize instance, and use it to tune SupGen. Our immediate goal is to achieve 85% at <$1 per task, validating the

Machine Learning Street Talk (@mlstreettalk) 's Twitter Profile Photo

We spoke with Clem Bonnet at NeurIPS about his extremely innovative approach to the ARC Prize using a form of test-time inference where you search the latent space of a VAE before making an optimal prediction. François Chollet was so impressed, he hired Clem shortly after! 😃 -

ARC Prize (@arcprize) 's Twitter Profile Photo

AGI is reached when the capability gap between humans and computers is zero

ARC Prize Foundation measures this to inspire progress

Today we preview the unbeaten ARC-AGI-2 + open public donations to fund ARC-AGI-3

TY Schmidt Sciences (Eric Schmidt) for $50k to kick us off!
Ndea (@ndea) 's Twitter Profile Photo

Deep Learning architectures usually aren't trained to perform search at test time, leading to sample inefficiency + poor generalization. Latent Program Network (LPN) builds in test-time adaptation by learning a latent space that can be searched. Clem Bonnet Matthew Macfarlane
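The test-time latent search described above can be illustrated with a deliberately toy sketch. This is not the actual LPN architecture: the real system uses a learned neural decoder over ARC grids, whereas here the decoder, the per-dimension scaling "program", and all names are illustrative stand-ins. The idea it demonstrates is the same, though: keep the decoder frozen and search the latent by gradient descent to fit the demonstration pairs, then apply the fitted latent to a test input.

```python
import numpy as np

rng = np.random.default_rng(0)

def decode(z, x):
    # The "program" encoded by latent z: per-dimension scaling of the input.
    # In LPN proper this would be a trained neural decoder.
    return z * x

# Demonstration pairs generated by a hidden program z_true.
z_true = np.array([2.0, -1.0, 0.5])
demos = [(x, decode(z_true, x)) for x in rng.normal(size=(4, 3))]

# Test-time search: gradient descent on the latent z to fit the demos,
# with the decoder held fixed.
z = np.zeros(3)
lr = 0.1
for _ in range(200):
    grad = np.zeros(3)
    for x, y in demos:
        grad += 2 * (decode(z, x) - y) * x  # d/dz of squared error
    z -= lr * grad / len(demos)

# Apply the recovered program to an unseen test input.
x_test = np.array([1.0, 1.0, 1.0])
y_pred = decode(z, x_test)  # approximately recovers z_true * x_test
```

Because the demos are noiseless and the loss is convex in z here, the search recovers the hidden program almost exactly; the point is that all adaptation happens in latent space at inference time, with no weight updates.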

Nathan Grinsztajn (@ngrinsztajn) 's Twitter Profile Photo

So proud of this one: today we're releasing Command A, our new 111B flagship model tailored for business use cases. GPT-4o/DeepSeek-level in evals, better than Sonnet on LMSYS, has 256k context length.
Also, weights are available now on hf! huggingface.co/CohereForAI/c4…
Mohamed Osman (@mohamedosmanml) 's Twitter Profile Photo

Honored to be a guest on the infamous MLST podcast again! We discuss our test-time methods, compositionality in LLMs, limitations of VLMs, logic vs perception, efficient adaptation, and more. Machine Learning Street Talk youtu.be/3p0O28W1ZHg

ARC Prize (@arcprize) 's Twitter Profile Photo

Today we are announcing ARC-AGI-2, an unsaturated frontier AGI benchmark that challenges AI reasoning systems (same relative ease for humans).

Grand Prize: 85%, ~$0.42/task efficiency

Current Performance:
* Base LLMs: 0%
* Reasoning Systems: <4%
ARC Prize (@arcprize) 's Twitter Profile Photo

ARC Prize 2025 is Live

$1M competition to open source a solution to ARC-AGI

Your objective: Reach 85% on the private evaluation dataset

Progress needs new ideas, not just scale
Ndea (@ndea) 's Twitter Profile Photo

Quick hiring update: we've assembled an incredible founding research team. We have no open positions, for now. However — we will create roles for exceptional program synthesis researchers. If that's you: ndea.com/join Onward.

Lewis Hemens (@lewishemens) 's Twitter Profile Photo

I've just finished drafting a fairly thorough review of a lot of the research that went into ARC Prize 2024, all the paper winners, top scorers, and a few hot takes for ARC-AGI-2 in 2025. Here's a short 🧵 with some highlights!

François Chollet (@fchollet) 's Twitter Profile Photo

Compressing the timeline to get to AGI also means compressing the timeline of every single scientific breakthrough that is downstream of AGI. There is no greater leverage.

Clem Bonnet @ICLR 2025 (@clementbonnet16) 's Twitter Profile Photo

I will be at ICLR next week. Always up for chatting about ARC-AGI, program synthesis, RL, open-endedness, or related rabbit holes. DMs open! #ICLR2025

Ndea (@ndea) 's Twitter Profile Photo

Ndea is sponsoring SYNT 2025 - a workshop on synthesis of computing systems - July 22 in Zagreb, Croatia. Part of CAV 2025 (Conference on Computer Aided Verification).

Have a synthesis-related paper abstract? Submissions due by May 18.

synt2025.github.io
Levi Lelis (@levilelis) 's Twitter Profile Photo

Previous work has shown that programmatic policies—computer programs written in a domain-specific language—generalize to out-of-distribution problems more easily than neural policies.

Is this really the case? 🧵
Matthew Macfarlane (@mattvmacfarlane) 's Twitter Profile Photo

I'll be presenting two workshop papers: "Searching Latent Program Spaces" (oral at Programmatic Representations for Agent Learning) & "Instilling Parallel Reasoning into Language Models" (AI4Math).