Blaze(j) Manczak 🇵🇱🇱🇺🇪🇺 (@blazejmanczak)'s Twitter Profile
Blaze(j) Manczak 🇵🇱🇱🇺🇪🇺

@blazejmanczak

AI Research @ Dynamo AI (prev Qualcomm AI) 🧐 🤖 also into science of peak human performance and endurance sports

ID: 414160604

Link: https://bmanczak.github.io/about/

Joined: 16-11-2011 18:15:59

147 Tweets

68 Followers

186 Following

AK (@_akhaliq)

CodeIt

Self-Improving Language Models with Prioritized Hindsight Replay

paper page: huggingface.co/papers/2402.04…

Large language models are increasingly solving tasks that are commonly believed to require human-level reasoning ability. However, these models still perform very poorly
Dmytro Mishkin 🇺🇦 (@ducha_aiki)

CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay

Natasha Butt, Blaze(j) Manczak 🇵🇱🇱🇺🇪🇺, Auke Wiggers, Corrado Rainone, David Zhang, Michaël Defferrard, Taco Cohen

tl;dr: sample a program, try it, add to the replay pool.
New SOTA on ARC
arxiv.org/abs/2402.04858…
Sergey Levine (@svlevine)

Can we design sample-efficient off-policy RL algorithms for LLMs to master multi-turn tasks? In our new work, we introduce ArCHer, a hierarchical actor-critic algorithm that improves massively in sample efficiency over PPO for multi-turn tasks: yifeizhou02.github.io/archer.io/ Thread 👇

Qualcomm Research & Technologies (@qcomresearch)

Congratulations to the #AI Research team for having the paper "CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay" accepted at #ICML2024. Discover the future of LLMs: arxiv.org/abs/2402.04858 Blaze(j) Manczak 🇵🇱🇱🇺🇪🇺, Auke Wiggers, David Zhang, Michaël Defferrard, Taco Cohen, Natasha Butt

Magdalena (@pieinggg)

Hey Twitter! Contest! From everyone who retweets this tweet by March 1, I will draw one person and paint whatever they want for them! Anyone interested? fb.com/pieing