Matteo Gallici (@matteogallici)'s Twitter Profile
Matteo Gallici

@matteogallici

PhD Student UPC Barcelona - Reinforcement Learning

ID: 344691651

Joined: 29-07-2011 12:55:39

34 Tweets

313 Followers

93 Following

Jonny Cook (@jonnycoook)

1/ 🚀 Presenting AGI - Artificial Generational Intelligence 🚀

We apply the concept of cultural accumulation to RL and find that agents can improve across generations, outperforming agents trained for a single lifetime with the same experience budget!

Co-led w/ Chris Lu.

🧵
Chris Lu (@_chris_lu_)

Excited to share my first work from my internship at Sakana AI! We used LLMs to design and implement new preference optimization algorithms for training LLMs, discovering cutting-edge methods! Co-led with Sam Holt and Claudio Fanconi. Details in thread 🧵 (1/N)

Silvia Sapora (@silviasapora)

1/ 🧵 Excited to introduce our #ICML2024 paper: 😈 EvIL (Evolution Strategies for Generalisable Imitation Learning), a new inverse RL (IRL) method for sample-efficient transfer of expert behaviour across environments – it's so good, it's downright EvIL!

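The title names the core ingredient: an evolution-strategies search over reward parameters. Below is a minimal, hedged sketch of a vanilla ES step in that spirit; the fitness function, names, and hyperparameters are illustrative assumptions, not EvIL's actual algorithm.

```python
# Vanilla evolution-strategies step over reward parameters theta.
# In an IRL setting, fitness(theta) would score how well an agent trained
# under reward theta matches the expert; here it is left abstract.
import numpy as np

def es_step(theta, fitness, pop=32, sigma=0.1, lr=0.02):
    eps = np.random.randn(pop, theta.size)              # population of perturbations
    scores = np.array([fitness(theta + sigma * e) for e in eps])
    scores = (scores - scores.mean()) / (scores.std() + 1e-8)  # normalise fitness
    return theta + lr / (pop * sigma) * eps.T @ scores  # fitness-weighted noise average

# Toy usage: ES drives theta toward the maximiser of a quadratic fitness.
theta = np.random.randn(5)
for _ in range(200):
    theta = es_step(theta, lambda th: -np.sum(th ** 2))
```
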
Jakob Foerster (@j_foerst)

DQN kick-started the field of deep RL 12 years ago, but Q-learning has recently taken a backseat compared to PPO and other on-policy methods. We introduce PQN, a greatly simplified version of DQN that is highly GPU-compatible and theoretically supported by convergence proofs.
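
A minimal sketch of the recipe as the tweet describes it: online Q-learning on fresh batches from parallel environments, with no replay buffer and no frozen target network. The network shape, the LayerNorm placement, and the one-step TD target below are assumptions for illustration; the paper's actual method (and its convergence analysis) may differ.

```python
# Hedged sketch: Q-learning without a replay buffer or target network.
# Each update consumes a fresh batch of transitions from N parallel envs
# and bootstraps from the online network itself. LayerNorm is included on
# the assumption that normalisation supplies the stability usually
# provided by a target network.
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden),
            nn.LayerNorm(hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs):
        return self.net(obs)

def pqn_style_update(q, opt, obs, actions, rewards, next_obs, dones, gamma=0.99):
    with torch.no_grad():
        # Bootstrap from the *online* network: no frozen target copy.
        target = rewards + gamma * (1.0 - dones) * q(next_obs).max(dim=1).values
    pred = q(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(pred, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage with a random batch standing in for parallel-env transitions.
q = QNet(obs_dim=4, n_actions=2)
opt = torch.optim.Adam(q.parameters(), lr=3e-4)
pqn_style_update(q, opt,
                 obs=torch.randn(32, 4), actions=torch.randint(0, 2, (32,)),
                 rewards=torch.randn(32), next_obs=torch.randn(32, 4),
                 dones=torch.zeros(32))
```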

Michael Matthews @ ICLR 2025 (@mitrma)

Really impactful work, most prominently in finally figuring out how to bring the speed increases of JAX to off-policy value-based algorithms! Another building block falls into place in the JAX RL ecosystem.

Pablo Samuel Castro (@pcastr)

This paper looks super cool. I've started reading it and am really enjoying the clarity of exposition, the theoretical investigation, and the simplicity of the resulting algorithm.

Alex Goldie (@alexdgoldie)

1/ 🤖 Learned optimization offers huge potential to automate machine learning! So why doesn't it work well in RL (and how did we fix it)?! I'm excited to share OPEN, our AutoRL Workshop spotlight paper exploring this question! 🧵

Jakob Foerster (@j_foerst)

1/🚀 Foerster Lab for AI Research is coming to #icml2024 in Vienna 🎉 (I am literally posting from the train) and we are very excited to share our work with you! You can find us here ⬇️✨ see below 🔗 for clickable links

Michael Matthews @ ICLR 2025 (@mitrma)

We are very excited to announce Kinetix: an open-ended universe of physics-based tasks for RL! We use Kinetix to train a general agent on millions of randomly generated physics problems and show that this agent generalises to unseen handmade environments. 1/🧵

Michael Beukman (@mcbeukman)

🏋️‍♂️Go from creating an environment to having a trained expert agent within minutes! As part of Kinetix, we are releasing an editor that can create custom physics-based RL environments, and import them seamlessly into an RL training loop. 1/

Costa Huang (@vwxyzjn)

Roger Creus Castanyer just implemented a CleanRL Parallel Q-Networks (PQN) implementation!

🚀 PQN is DQN without a replay buffer or target network. You can run PQN on GPU environments or vectorized environments. E.g., in envpool, PQN reaches DQN's score in 1/10th the time.
Jacob E. Kooi (@jacobekooi)

📢 New paper on arXiv: Hadamax Encoding: Elevating Performance in Model-Free Atari (arxiv.org/abs/2505.15345)

Our Hadamax (Hadamard max-pooling) encoder architecture improves the recent PQN algorithm’s Atari performance by 80%, allowing it to significantly surpass Rainbow-DQN!
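
The name points at two ingredients: Hadamard products and max-pooling. Here is a hedged sketch of how those ingredients could combine in one encoder block; the actual Hadamax architecture is specified in the paper, so treat this purely as an illustration, with all layer sizes and placements assumed.

```python
# Illustrative only: two parallel convolutions merged by an element-wise
# (Hadamard) product, then downsampled with max-pooling instead of strides.
# The real Hadamax encoder layout is defined in the paper.
import torch
import torch.nn as nn

class HadamardPoolBlock(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        self.branch_a = nn.Conv2d(c_in, c_out, kernel_size=3, padding=1)
        self.branch_b = nn.Conv2d(c_in, c_out, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2)

    def forward(self, x):
        gated = self.branch_a(x) * self.branch_b(x)  # Hadamard product of branches
        return self.pool(gated)                      # max-pooling downsample

# A stack of such blocks over a 4-frame 84x84 Atari observation:
encoder = nn.Sequential(
    HadamardPoolBlock(4, 32),
    HadamardPoolBlock(32, 64),
    HadamardPoolBlock(64, 64),
    nn.Flatten(),
)
print(encoder(torch.zeros(1, 4, 84, 84)).shape)  # torch.Size([1, 6400])
```
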
Pablo Samuel Castro (@pcastr)

The Impact of On-Policy Parallelized Data Collection on Deep Reinforcement Learning Networks

Thrilled to share our #ICML2025 paper, led by Walter Mayor-Toro & Johan S. Obando 👍🏽, with Aaron Courville, where we explore how data collection affects agents in parallelized setups. 1/
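
For context, a minimal sketch of the setting the paper studies: on-policy data collection across parallel environments, where each gradient step consumes one fresh rollout batch. The environment, rollout sizes, and random-action policy below are placeholders, not the paper's experimental setup.

```python
# Generic on-policy parallel data collection with gymnasium's vector API.
# A real agent would sample actions from its current policy; random actions
# stand in so the sketch is self-contained.
import gymnasium as gym

n_envs, rollout_len = 8, 128
envs = gym.vector.SyncVectorEnv(
    [lambda: gym.make("CartPole-v1") for _ in range(n_envs)]
)
obs, _ = envs.reset(seed=0)

batch = []
for _ in range(rollout_len):
    actions = envs.action_space.sample()             # placeholder for pi(obs)
    next_obs, rewards, terms, truncs, _ = envs.step(actions)
    batch.append((obs, actions, rewards, terms | truncs))
    obs = next_obs                                   # finished sub-envs auto-reset

# One on-policy update would consume `batch` and then discard it; scaling
# n_envs changes how "wide" each fresh batch is per gradient step.
print(len(batch), "steps x", n_envs, "envs =", len(batch) * n_envs, "transitions")
```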