Marc G. Bellemare (@marcgbellemare)'s Twitter Profile
Marc G. Bellemare

@marcgbellemare

CSO & co-founder, Reliant AI. Ex RL research lead at Google Brain, DeepMind. Known for Atari 2600 RL benchmark, Distributional RL (MIT Press 2023).

ID: 289158382

Link: http://marcgbellemare.info
Joined: 28-04-2011 03:50:48

1.1K Tweets

14.14K Followers

349 Following

Marc G. Bellemare (@marcgbellemare) 's Twitter Profile Photo

Most people don't know that although most of my research work is in RL, I spent a significant portion of my PhD & early career on generative modelling (text, images, data compression). Building new RL algorithms for LLM training is a real delight - putting two passions together.

Marc G. Bellemare (@marcgbellemare) 's Twitter Profile Photo

It took us 2+ years to figure out exactly how to think about, and work with, a distributional version of the successor representation - doubly proud of this work by Jesse Farebrother and Harley Wiltzer that both lays down a mathematical foundation and improves on γ-models! Also, A+ visuals.

Marc G. Bellemare (@marcgbellemare) 's Twitter Profile Photo

On the back of our 2017 distributional RL paper Martha White and Ehsan Imani wrote a piece showing that you can do regression better with a classification loss... that seemed wild at the time, but Jesse Farebrother, Rishabh Agarwal and co pushed this further and the results are amazing!

Marc G. Bellemare (@marcgbellemare) 's Twitter Profile Photo

Amazing piece of work by Jesse Farebrother, Rishabh Agarwal, & star co-authors digging into classification losses in RL and their unreasonable effectiveness in a problem space that has mostly been dominated by regression methods. Don't miss this talk at ICML!

Marc G. Bellemare (@marcgbellemare) 's Twitter Profile Photo

Because one level of distributions isn't enough - don't miss tomorrow's ICML spotlight by Harley Wiltzer, Jesse Farebrother, Arthur Gretton, and Mark Rowland: lifting the successor representation to distributions and moving the needle on what you can do with techniques like γ-models.

Marc G. Bellemare (@marcgbellemare) 's Twitter Profile Photo

Distributional successor features: A follow-up to our distributional successor representation by my students Harley Wiltzer and Jesse Farebrother - those manim animations are quite something!

Marc G. Bellemare (@marcgbellemare) 's Twitter Profile Photo

Interested in using reinforcement learning to train LLMs for problems where there’s no room for error? Do you want to build massive data pipelines to transform how we interact with scientific knowledge? We're hiring for multiple roles at Reliant: apply.workable.com/reliant-ai

Harley Wiltzer (@harwiltz) 's Twitter Profile Photo

🚀 Extremely excited about our latest work on Distributional RL algorithms for *high-frequency control*, to be presented at #neurips2024! Incredible collaboration with the OT wizard Yash Jhaveri, Marc G. Bellemare, David Meger, Patrick Shafto. Paper: arxiv.org/pdf/2410.11022

Pablo Samuel Castro (@pcastr) 's Twitter Profile Photo

we've used Atari games as an RL benchmark for so long, but for a little while it's bugged me that it's a discrete action problem, since the original joysticks were analog... Jesse Farebrother & i fix this by introducing the Continuous ALE (CALE)! read thread for details! 1/9

Marc G. Bellemare (@marcgbellemare) 's Twitter Profile Photo

We're hiring at Reliant AI! If you or someone you know is excited to build the future of AI-powered research - take a look, share widely. Montreal and Berlin (platform eng., ML eng., and RS) and North America (commercial). apply.workable.com/reliant-ai/

Marc G. Bellemare (@marcgbellemare) 's Twitter Profile Photo

Take a look at this amazing piece of work by my student Jesse Farebrother - a new kind of world model based on successor representations that's a lot more robust than prior iterations. Incredible to see all the progress we've made in the last 5 years in RL.

Marc G. Bellemare (@marcgbellemare) 's Twitter Profile Photo

Goodbye Toronto! So many serendipitous meetings at Toronto Tech Week, incredible energy. Learned that Isaac Souweine and I like the same parties. Met too many AI founders to count, all making amazing new things. Now back to building!

Marc G. Bellemare (@marcgbellemare) 's Twitter Profile Photo

A can-of-worms type question to all of you, from an RL researcher turned NLP startup founder: assuming a benchmark with no annotator error, is 100% accuracy on question answering possible?

Jesse Farebrother (@jessefarebro) 's Twitter Profile Photo

Heading to Vancouver for #ICML2025 to present our work: Temporal Difference Flows. Make sure to check out the oral to learn how we’re now able to scale this exciting world model framework based on the successor representation! Also, feel free to reach out to discuss anything RL!

Marc G. Bellemare (@marcgbellemare) 's Twitter Profile Photo

AI isn’t about saving time – it’s about doing what you couldn’t do before. If you could read ten thousand papers a day, you would ...