Matteo Gallici (@matteogallici)'s Twitter Profile
Matteo Gallici

@matteogallici

PhD Student UPC Barcelona - Reinforcement Learning

ID: 344691651

Joined: 29-07-2011 12:55:39

34 Tweets

313 Followers

93 Following

Jonny Cook (@jonnycoook)

1/ 🚀 Presenting AGI - Artificial Generational Intelligence 🚀

We apply the concept of cultural accumulation to RL and find that agents can improve across generations, outperforming agents trained for a single lifetime with the same experience budget!

Co-led w/ Chris Lu.

🧵
Chris Lu (@_chris_lu_)

Excited to share my first work from my internship at Sakana AI! We used LLMs to design and implement new preference optimization algorithms for training LLMs, discovering cutting-edge methods! Co-led with Sam Holt and Claudio Fanconi. Details in thread 🧵 (1/N)

Silvia Sapora (@silviasapora)

1/ 🧵 Excited to introduce our #ICML2024 paper: 😈 EvIL (Evolution Strategies for Generalisable Imitation Learning), a new inverse RL (IRL) method for sample-efficient transfer of expert behaviour across environments – it's so good, it's downright EvIL!

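The title names the core ingredient: an evolution-strategies search over reward parameters. Below is a minimal, hedged sketch of a vanilla ES step in that spirit; the fitness function, names, and hyperparameters are illustrative assumptions, not EvIL's actual algorithm.

```python
# Vanilla evolution-strategies step over reward parameters theta.
# In an IRL setting, fitness(theta) would score how well an agent trained
# under reward theta matches the expert; here it is left abstract.
import numpy as np

def es_step(theta, fitness, pop=32, sigma=0.1, lr=0.02):
    eps = np.random.randn(pop, theta.size)              # population of perturbations
    scores = np.array([fitness(theta + sigma * e) for e in eps])
    scores = (scores - scores.mean()) / (scores.std() + 1e-8)  # normalise fitness
    return theta + lr / (pop * sigma) * eps.T @ scores  # fitness-weighted noise average

# Toy usage: ES drives theta toward the maximiser of a quadratic fitness.
theta = np.random.randn(5)
for _ in range(200):
    theta = es_step(theta, lambda th: -np.sum(th ** 2))
```
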
Jakob Foerster (@j_foerst)

DQN kick-started the field of deep RL 12 years ago, but Q-learning has recently taken a backseat compared to PPO and other on-policy methods. We introduce PQN, a greatly simplified version of DQN that is highly GPU-compatible and theoretically supported by convergence proofs.
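
A minimal sketch of the recipe as the tweet describes it: online Q-learning on fresh batches from parallel environments, with no replay buffer and no frozen target network. The network shape, the LayerNorm placement, and the one-step TD target below are assumptions for illustration; the paper's actual method (and its convergence analysis) may differ.

```python
# Hedged sketch: Q-learning without a replay buffer or target network.
# Each update consumes a fresh batch of transitions from N parallel envs
# and bootstraps from the online network itself. LayerNorm is included on
# the assumption that normalisation supplies the stability usually
# provided by a target network.
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden),
            nn.LayerNorm(hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs):
        return self.net(obs)

def pqn_style_update(q, opt, obs, actions, rewards, next_obs, dones, gamma=0.99):
    with torch.no_grad():
        # Bootstrap from the *online* network: no frozen target copy.
        target = rewards + gamma * (1.0 - dones) * q(next_obs).max(dim=1).values
    pred = q(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(pred, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage with a random batch standing in for parallel-env transitions.
q = QNet(obs_dim=4, n_actions=2)
opt = torch.optim.Adam(q.parameters(), lr=3e-4)
pqn_style_update(q, opt,
                 obs=torch.randn(32, 4), actions=torch.randint(0, 2, (32,)),
                 rewards=torch.randn(32), next_obs=torch.randn(32, 4),
                 dones=torch.zeros(32))
```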

Michael Matthews @ ICLR 2025 (@mitrma)

Really impactful work, most prominently in finally figuring out how to bring the speed increases of JAX to off-policy value-based algorithms! Another building block falls into place in the JAX RL ecosystem.

Pablo Samuel Castro (@pcastr)

This paper looks super cool. I've started reading it and am really enjoying the clarity of exposition, the theoretical investigation, and the simplicity of the resulting algorithm.

Alex Goldie (@alexdgoldie)

1/ 🤖 Learned optimization offers huge potential to automate machine learning! So why doesn't it work well in RL (and how did we fix it)?! I'm excited to share OPEN, our AutoRL Workshop spotlight paper exploring this question! 🧵

Jakob Foerster (@j_foerst)

1/🚀 Foerster Lab for AI Research is coming to #icml2024 in Vienna 🎉 (I am literally posting from the train) and we are very excited to share our work with you! You can find us here ⬇️✨ see below 🔗 for clickable links

Michael Matthews @ ICLR 2025 (@mitrma)

We are very excited to announce Kinetix: an open-ended universe of physics-based tasks for RL! We use Kinetix to train a general agent on millions of randomly generated physics problems and show that this agent generalises to unseen handmade environments. 1/🧵

Michael Beukman (@mcbeukman)

🏋️‍♂️Go from creating an environment to having a trained expert agent within minutes! As part of Kinetix, we are releasing an editor that can create custom physics-based RL environments, and import them seamlessly into an RL training loop. 1/

Costa Huang (@vwxyzjn)

Roger Creus Castanyer just implemented a CleanRL Parallel Q-Networks (PQN) implementation!

🚀 PQN is DQN without a replay buffer or target network. You can run PQN on GPU environments or vectorized environments. E.g., in envpool, PQN reaches DQN's score in 1/10th the time.
Jacob E. Kooi (@jacobekooi)

📢 New paper on arXiv: Hadamax Encoding: Elevating Performance in Model-Free Atari (arxiv.org/abs/2505.15345)

Our Hadamax (Hadamard max-pooling) encoder architecture improves the recent PQN algorithm’s Atari performance by 80%, allowing it to significantly surpass Rainbow-DQN!
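
The name points at two ingredients: Hadamard products and max-pooling. Here is a hedged sketch of how those ingredients could combine in one encoder block; the actual Hadamax architecture is specified in the paper, so treat this purely as an illustration, with all layer sizes and placements assumed.

```python
# Illustrative only: two parallel convolutions merged by an element-wise
# (Hadamard) product, then downsampled with max-pooling instead of strides.
# The real Hadamax encoder layout is defined in the paper.
import torch
import torch.nn as nn

class HadamardPoolBlock(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        self.branch_a = nn.Conv2d(c_in, c_out, kernel_size=3, padding=1)
        self.branch_b = nn.Conv2d(c_in, c_out, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2)

    def forward(self, x):
        gated = self.branch_a(x) * self.branch_b(x)  # Hadamard product of branches
        return self.pool(gated)                      # max-pooling downsample

# A stack of such blocks over a 4-frame 84x84 Atari observation:
encoder = nn.Sequential(
    HadamardPoolBlock(4, 32),
    HadamardPoolBlock(32, 64),
    HadamardPoolBlock(64, 64),
    nn.Flatten(),
)
print(encoder(torch.zeros(1, 4, 84, 84)).shape)  # torch.Size([1, 6400])
```
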
Pablo Samuel Castro (@pcastr)

The Impact of On-Policy Parallelized Data Collection on Deep Reinforcement Learning Networks

Thrilled to share our #ICML2025 paper, led by Walter Mayor-Toro & Johan S. Obando 👍🏽, with Aaron Courville, where we explore how data collection affects agents in parallelized setups. 1/
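
For context, a minimal sketch of the setting the paper studies: on-policy data collection across parallel environments, where each gradient step consumes one fresh rollout batch. The environment, rollout sizes, and random-action policy below are placeholders, not the paper's experimental setup.

```python
# Generic on-policy parallel data collection with gymnasium's vector API.
# A real agent would sample actions from its current policy; random actions
# stand in so the sketch is self-contained.
import gymnasium as gym

n_envs, rollout_len = 8, 128
envs = gym.vector.SyncVectorEnv(
    [lambda: gym.make("CartPole-v1") for _ in range(n_envs)]
)
obs, _ = envs.reset(seed=0)

batch = []
for _ in range(rollout_len):
    actions = envs.action_space.sample()             # placeholder for pi(obs)
    next_obs, rewards, terms, truncs, _ = envs.step(actions)
    batch.append((obs, actions, rewards, terms | truncs))
    obs = next_obs                                   # finished sub-envs auto-reset

# One on-policy update would consume `batch` and then discard it; scaling
# n_envs changes how "wide" each fresh batch is per gradient step.
print(len(batch), "steps x", n_envs, "envs =", len(batch) * n_envs, "transitions")
```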