Michael Hodel (@bayesilicon)'s Twitter Profile
Michael Hodel

@bayesilicon

writer (of programs) | AI researcher @tufalabs

ID: 1460008017521434628

Link: http://tufalabs.ai | Joined: 14-11-2021 22:13:39

97 Tweets

927 Followers

651 Following

Wenhao Li (@wenhaoli29)'s Twitter Profile Photo

We trained a Vision Transformer to solve ONE single task from François Chollet and Mike Knoop’s ARC Prize. Unexpectedly, it failed to produce the test output, even when using 1 MILLION examples! Why is this the case? 🤔

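A minimal sketch of what such a single-task setup might look like; the fixed 10×10 grid, the tiny model, and the one-token-per-cell framing are assumptions for illustration, not the thread's actual code:

```python
import torch
import torch.nn as nn

H = W = 10         # assumed fixed grid size for the single task
NUM_COLORS = 10    # ARC grids use 10 cell colors

class TinyGridTransformer(nn.Module):
    """Maps an input grid to an output grid, one token per cell."""
    def __init__(self, dim=64, heads=4, layers=4):
        super().__init__()
        self.embed = nn.Embedding(NUM_COLORS, dim)
        self.pos = nn.Parameter(torch.randn(1, H * W, dim) * 0.02)
        block = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(block, layers)
        self.head = nn.Linear(dim, NUM_COLORS)

    def forward(self, grid):                 # grid: (B, H*W) integer tokens
        x = self.embed(grid) + self.pos
        return self.head(self.encoder(x))    # per-cell color logits

model = TinyGridTransformer()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

def train_step(inp, out):                    # both (B, H*W) long tensors
    logits = model(inp)
    loss = loss_fn(logits.reshape(-1, NUM_COLORS), out.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```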
Kevin Ellis (@ellisk_kellis)'s Twitter Profile Photo


New ARC-AGI paper
ARC Prize w/ fantastic collaborators Wen-Ding Li @ ICLR'25, Keya Hu, Zenna Tavares, evanthebouncy, Basis
For few-shot learning: is it better to construct a symbolic hypothesis/program, or to have a neural net do it all, à la in-context learning?
cs.cornell.edu/~ellisk/docume…
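A minimal sketch of the symbolic-hypothesis side of that question: sample candidate programs, keep one consistent with every training pair, and apply it to the test input. `propose_programs` (an LLM call in practice) is a hypothetical stand-in, not the paper's API:

```python
from typing import Callable, List, Optional, Tuple

Grid = List[List[int]]

def induce(train: List[Tuple[Grid, Grid]],
           test_input: Grid,
           propose_programs: Callable[[], List[Callable[[Grid], Grid]]]) -> Optional[Grid]:
    for program in propose_programs():       # candidates, e.g. LLM-sampled
        try:
            # keep only hypotheses consistent with every training pair
            if all(program(x) == y for x, y in train):
                return program(test_input)
        except Exception:
            continue                         # malformed candidates just fail
    return None                              # no consistent program found
```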
Mohamed Osman (@mohamedosmanml)'s Twitter Profile Photo


We got up to 55.5% on the ARC Prize leaderboard today!
Progress towards the 60.2% milestone of median human performance reported by arxiv.org/pdf/2409.01374 is not slowing down.
Jack Cole Michael Hodel
Machine Learning Street Talk (@mlstreettalk)'s Twitter Profile Photo

I finally got to meet François Chollet in person recently to interview him about the ARC Prize, intelligence vs. memorization, human cognitive development, learning abstractions, the limits of pattern recognition, and consciousness development. These are the best bits. Full show released tomorrow.

Andreas Köpf (@neurosp1ke)'s Twitter Profile Photo

Have been working on my 2nd synthetic ARC riddle generator (agent: ideation -> prog generation). Got >1k diverse generator+solver pairs as PoC so far. Some nice examples:

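One way to picture a generator+solver pair (an illustrative pair, not one of the actual >1k): the generator samples riddle instances under a rule, the solver implements the same rule, and the pair is accepted only if the solver reproduces every generated output:

```python
import random

def generate(rng: random.Random, h=5, w=5):
    """Sample one riddle instance: (input grid, expected output grid)."""
    grid = [[rng.randrange(10) for _ in range(w)] for _ in range(h)]
    return grid, [row[::-1] for row in grid]   # rule: mirror horizontally

def solve(grid):
    """The matching solver: mirror each row."""
    return [row[::-1] for row in grid]

# A generator+solver pair is accepted only if they agree on every instance:
rng = random.Random(0)
assert all(solve(i) == o for i, o in (generate(rng) for _ in range(100)))
```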
Andreas Köpf (@neurosp1ke)'s Twitter Profile Photo


ARC Prize 2024 🥈 place paper by the ARChitects, who scored 53.5 (56.5): github.com/da-fr/arc-priz…
- Transformers/LLMs are for ARC what ConvNets were for ImageNet
- strong base model, TTT, specialized datasets (e.g. Michael Hodel's re-arc) + novel: DFS sampling with LLM critique
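A rough sketch of how "DFS sampling with LLM critique" could work, as described in that summary: depth-first search over high-probability continuations with a cumulative-probability cutoff, then a critique score to pick among complete candidates. `step_logprobs` and `critique_score` are hypothetical stand-ins, not the ARChitects' code:

```python
import math
from typing import Callable, Dict, List, Tuple

EOS = "<eos>"

def dfs_sample(prefix: List[str],
               step_logprobs: Callable[[List[str]], Dict[str, float]],
               logp: float = 0.0,
               cutoff: float = math.log(0.05),
               max_len: int = 64) -> List[Tuple[List[str], float]]:
    """Enumerate all completions whose cumulative log-prob stays above cutoff."""
    if (prefix and prefix[-1] == EOS) or len(prefix) >= max_len:
        return [(prefix, logp)]
    results = []
    for token, lp in sorted(step_logprobs(prefix).items(), key=lambda kv: -kv[1]):
        if logp + lp < cutoff:
            break                            # tokens are sorted, so prune the rest
        results += dfs_sample(prefix + [token], step_logprobs,
                              logp + lp, cutoff, max_len)
    return results

def pick_best(candidates, critique_score: Callable[[List[str]], float]):
    return max(candidates, key=lambda c: critique_score(c[0]))
```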
François Chollet (@fchollet)'s Twitter Profile Photo

Consulting my heart... Ok, looks like you haven't. But whenever you have a SotA (or close) solution built on top of the OpenAI API we're more than happy to verify it and add it to the public ARC Prize leaderboard. Anything using less than $10k worth of API calls is eligible.

Akira Yoshiyama ⁂ (@yoshiyama_akira)'s Twitter Profile Photo

Happy to announce we outperformed OpenAI o1 with a 7B model :) We released two self-improvement methods for verifiable domains in our preliminary paper -->

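One common self-improvement recipe for verifiable domains, sketched below (rejection sampling on verified solutions, STaR-style); this is a generic illustration, not necessarily either of the paper's two methods. `model.sample`, `verify`, and `finetune` are hypothetical stand-ins:

```python
def self_improve(model, problems, verify, finetune, rounds=3, k=8):
    for _ in range(rounds):
        keep = []
        for prob in problems:
            for sol in model.sample(prob, n=k):   # draw k attempts per problem
                if verify(prob, sol):             # keep only verified solutions
                    keep.append((prob, sol))
        model = finetune(model, keep)             # train on its own successes
    return model
```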
Dimitri von Rütte (@dvruette)'s Twitter Profile Photo

🚨 NEW PAPER DROP! Wouldn't it be nice if LLMs could spot and correct their own mistakes? And what if we could do so directly from pre-training, without any SFT or RL? We present a new class of discrete diffusion models, called GIDD, that are able to do just that: 🧵1/12
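The self-correction idea, sketched generically (not GIDD's actual mechanism): during iterative denoising, resample the positions the model is least confident about, so earlier mistakes can be revisited. `denoise` is a hypothetical stand-in:

```python
def self_correct(tokens, denoise, steps=10, frac=0.1):
    # denoise(tokens) -> (confidence per position, proposed token per position);
    # a hypothetical stand-in for a denoising model's forward pass.
    for _ in range(steps):
        conf, proposal = denoise(tokens)
        n = max(1, int(frac * len(tokens)))
        for i in sorted(range(len(tokens)), key=lambda i: conf[i])[:n]:
            tokens[i] = proposal[i]               # revisit the shakiest positions
    return tokens
```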

Toby Simonds (@tobyrsimonds)'s Twitter Profile Photo


📝 New research: AlphaWrite applies evolutionary algorithms to creative writing.

Inspired by AlphaEvolve, we use iterative generation + Elo ranking to systematically improve story quality through inference-time compute scaling.

Results: 72% preference over baseline generation
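The Elo-ranking step can use the standard Elo update; the tweet names Elo but not its parameters, so the K=32 factor and 1000-point start below are assumptions:

```python
def elo_update(r_winner: float, r_loser: float, k: float = 32.0):
    """Standard Elo update after one pairwise preference judgment."""
    expected_w = 1.0 / (1.0 + 10 ** ((r_loser - r_winner) / 400.0))
    return r_winner + k * (1.0 - expected_w), r_loser - k * (1.0 - expected_w)

# Rank stories by streaming judged pairs through the update:
ratings = {"story_a": 1000.0, "story_b": 1000.0}
ratings["story_a"], ratings["story_b"] = elo_update(ratings["story_a"], ratings["story_b"])
```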