Roger Creus Castanyer (@creus_roger)'s Twitter Profile
Roger Creus Castanyer

@creus_roger

Maximizing the unexpected return.

PhD student @Mila_Quebec | Prev: @UbisoftLaForge @la_UPC @HP

ID: 1384077639992766472

Link: https://roger-creus.github.io/ | Joined: 19-04-2021 09:33:29

204 Tweets

548 Followers

734 Following

Glen Berseth (@glenberseth)'s Twitter Profile Photo

How can we make behavioural cloning (BC) achieve better combinatorial generalization on out-of-distribution goals? We propose BYOL-γ: an auxiliary self-predictive loss to improve generalization for goal-conditioned BC. 🧵1/6
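The exact BYOL-γ formulation is in the paper; as a generic illustration of the idea, a self-predictive auxiliary loss added on top of a behavioral-cloning objective might look like the toy sketch below (all networks are stand-in linear maps; the weighting `aux_weight` and the MSE losses are illustrative assumptions, not the paper's method):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "networks" standing in for the encoder, predictor, and policy head.
W_enc = rng.normal(size=(8, 4))   # online encoder: obs -> latent
W_tgt = W_enc.copy()              # target encoder (in practice an EMA copy, no gradients)
W_pred = rng.normal(size=(4, 4))  # predictor: latent_t -> predicted latent_{t+1}
W_pi = rng.normal(size=(8, 2))    # policy head: obs -> action

def bc_loss(obs, expert_action):
    """Behavioral cloning: regress onto the expert's action."""
    pred_action = obs @ W_pi
    return np.mean((pred_action - expert_action) ** 2)

def self_predictive_loss(obs_t, obs_tp1):
    """BYOL-style auxiliary loss: predict the target encoder's embedding of
    the next observation from the online embedding of the current one."""
    z_t = obs_t @ W_enc
    z_tp1 = obs_tp1 @ W_tgt
    pred = z_t @ W_pred
    return np.mean((pred - z_tp1) ** 2)

def total_loss(obs_t, obs_tp1, expert_action, aux_weight=0.5):
    """Combined objective: imitation plus the self-predictive auxiliary term."""
    return bc_loss(obs_t, expert_action) + aux_weight * self_predictive_loss(obs_t, obs_tp1)
```

The intuition is that the auxiliary term shapes the representation around environment dynamics, which is what is hoped to transfer to out-of-distribution goals.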

Chuang Gan (@gan_chuang)'s Twitter Profile Photo

🤖Can world models quickly adapt to new environments with just a few interactions? Introducing AdaWorld 🌍 — a new approach to learning world models conditioned on continuous latent actions extracted from videos via self-supervision! It enables rapid adaptation, efficient

Jianren Wang (@wang_jianren)'s Twitter Profile Photo

(1/n) Since its publication in 2017, PPO has essentially become synonymous with RL. Today, we are excited to provide you with a better alternative: EPO.
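For context on the baseline being challenged here: PPO's core is the clipped surrogate objective from Schulman et al. (2017), which can be written in a few lines (NumPy sketch; `clip_eps=0.2` is the commonly used default):

```python
import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped surrogate objective from PPO (Schulman et al., 2017).

    ratio = pi_new(a|s) / pi_old(a|s), computed in log space for stability.
    Clipping the ratio keeps each update close to the behavior policy.
    """
    ratio = np.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # PPO maximizes the elementwise minimum of the two terms; negate for a loss.
    return -np.mean(np.minimum(unclipped, clipped))
```

When the new and old policies coincide, the ratio is 1 everywhere and the loss reduces to the negated mean advantage.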

Pablo Samuel Castro (@pcastr)'s Twitter Profile Photo

thrilled that we'll be presenting this paper as a spotlight at #ICML2025 . come by our poster in vancouver to chat with us about the use of LLMs for advancing neuroscience! here's the camera-ready version: openreview.net/forum?id=dhRXG…

Harshit Sikchi (@harshit_sikchi)'s Twitter Profile Photo

Behavioral Foundation Models (BFMs) trained with RL are secretly more powerful than we think. BFMs directly output a policy believed to be near-optimal given any reward function. Our new work shows that they can actually do much better:

Pablo Samuel Castro (@pcastr)'s Twitter Profile Photo

really excited about this new work we just put out, led by my students Roger Creus Castanyer & Johan S. Obando 👍🏽 , where we examine the challenges of gradient propagation when scaling deep RL networks. roger & johan put a lot of work and care into this project, check out more details in 🧵👇🏾 !

Glen Berseth (@glenberseth)'s Twitter Profile Photo

Being unable to scale #DeepRL to solve diverse, complex tasks with large distribution changes has been holding back the #RL community. In this work, we demonstrate that with the right architecture and optimization adjustments, agents can maintain plasticity for large networks.

Pablo Samuel Castro (@pcastr)'s Twitter Profile Photo

proud to share a survey of state representation learning in RL that my student ayoub echchahed and i prepared, just published in TMLR! this was the bulk of ayoub's masters thesis and he put a lot of work and care into it! a few details in thread below... 1/

Martin Klissarov (@martinklissarov)'s Twitter Profile Photo

As AI agents face increasingly long and complex tasks, decomposing them into subtasks becomes increasingly appealing. But how do we discover such temporal structure? Hierarchical RL provides a natural formalism, yet many questions remain open. Here's our overview of the field🧵
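The standard formalism for this kind of temporal decomposition is the options framework (Sutton, Precup & Singh, 1999): each temporally extended behavior bundles an initiation set, an intra-option policy, and a termination condition. A minimal sketch (the chain-MDP example in the test is an illustrative assumption):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Option:
    """An option: initiation set, intra-option policy, and termination
    condition, per the options framework for hierarchical RL."""
    can_initiate: Callable[[int], bool]   # which states the option may start in
    policy: Callable[[int], int]          # primitive action chosen in each state
    terminates: Callable[[int], bool]     # whether the option stops in this state

def run_option(option, state, step, max_steps=100):
    """Execute one option until its termination condition fires (or a step
    budget is exhausted), returning the state where control is handed back."""
    for _ in range(max_steps):
        action = option.policy(state)
        state = step(state, action)
        if option.terminates(state):
            break
    return state
```

A high-level policy then chooses among options rather than primitive actions, which is what makes long tasks tractable to decompose.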

Akhil Bagaria (@akhil_bagaria)'s Twitter Profile Photo

New paper: skill discovery is a hallmark of intelligence: identify interesting questions about the world, and learn how to answer them via policies. Hierarchical RL studies this in AI, and we have written a survey of the field. Please take a look & give us your feedback!