Gopeshh Subbaraj (@gopeshh1) 's Twitter Profile
Gopeshh Subbaraj

@gopeshh1

PhD Student @Mila_Quebec/UdeM Interested in RL and CL! Prev. developing software @MathWorks. Robotics Grad @WPI. Alum @ReachNITT Views my own!

ID: 601898354

Link: https://www.linkedin.com/in/gopeshhraajsubbaraj/ · Joined: 07-06-2012 14:04:42

93 Tweets

419 Followers

484 Following

Sarath Chandar (@apsarathchandar) 's Twitter Profile Photo

In my lab, we have not one but four open postdoc positions! These positions cover developing foundation models for text, proteins, small molecules, genomic data, time series data, and astrophysics data! If you have strong research expertise and a PhD in LLMs and Foundation
Johan S. Obando 👍🏽 (@johanobandoc) 's Twitter Profile Photo

🚨 Very pleased to share our recent work, in which we achieve up to 50x more efficient LLM post-training using off-policy reinforcement learning with replay buffers. Paper: arxiv.org/abs/2503.18929. 🧵See below for a summary of key results by Brian Bartoldson !

Johan S. Obando 👍🏽 (@johanobandoc) 's Twitter Profile Photo

🚨 Excited to share our #ICML2025 paper: The Impact of On-Policy Parallelized Data Collection on Deep RL Networks. Big congrats to Walter Mayor-Toro for the amazing work! 🎉 Read the paper here: arxiv.org/abs/2506.03404, and more details in the thread below ⬇️

Roger Creus Castanyer (@creus_roger) 's Twitter Profile Photo

🚨 Excited to share our new work: "Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning"! 📈 We propose gradient interventions that enable stable, scalable learning, achieving significant performance gains across agents and environments! Details below 👇
Mila - Institut québécois d'IA (@mila_quebec) 's Twitter Profile Photo

Chef robots need to act fast or omelets burn! This Mila blog tackles real-time reinforcement learning challenges and introduces solutions for minimizing both inaction and delay regret. mila.quebec/en/article/rea…
Johan S. Obando 👍🏽 (@johanobandoc) 's Twitter Profile Photo

🚨 Excited to share our #ICML2025 paper: "The Courage to Stop: Overcoming Sunk Cost Fallacy in Deep RL" We train RL agents to know when to quit, cutting wasted effort and improving efficiency with our method LEAST. 📄Paper: arxiv.org/pdf/2506.13672 🧵Check the thread below👇🏾
Andrei Mircea (@mirandrom) 's Twitter Profile Photo

Interested in LLM training dynamics and scaling laws? Come to our #ACL2025 oral tomorrow! ⏰ Tuesday 2:55pm 📍 Hall C (Language Modeling 1) 🌐 mirandrom.github.io/zsl/ If you're in Vienna and want to chat, let me know! Mila - Institut québécois d'IA