Raphael Avalos (@raphael_avalos)'s Twitter Profile
Raphael Avalos

@raphael_avalos

Intern @cohere |
PhD Student @aibrussels | @FWOVlaanderen

ID: 1127834279201914880

Link: http://avalos.fr
Joined: 13-05-2019 07:13:41

53 Tweets

182 Followers

353 Following

Alizée Pace (@alizeepace)

Presenting work on synthetic preference generation at two #ICLR2024 workshops today: DPFM & GenAI4DM (@genai4dm). Come say hi to find out how to improve your reward model without collecting additional human feedback!

Willem Röpke (@willem_ropke)

Okay people, I need some help. We’re working on a project and have been stuck for a while. My final guess for what the issue may be is that gradients are not flowing as we would want them to. Does anyone have an intuitive visualisation/debugging tool for gradient flows in JAX?

Florent Delgrange (@f_delgrange)

Two weeks ago, I publicly defended my PhD thesis, entitled « Activating Formal Verification of Deep Reinforcement Learning Policies by Model Checking Bisimilar Latent Space Models ». 📚 The full dissertation is available here: tinyurl.com/formarl (1/n)

Raphael Avalos (@raphael_avalos)

Starting my internship at cohere today to work on LLMs! I'll be in Paris a couple of days a week, so if anyone wants to meet up, let me know!

Willem Röpke (@willem_ropke)

Exciting news! My paper on multi-objective reinforcement learning was accepted at AAMAS 2025! We introduce IPRO (Iterated Pareto Referent Optimisation)—a principled approach to solving multi-objective problems. 🔗 Paper: arxiv.org/abs/2402.07182 💻 Code: github.com/wilrop/ipro

Raphael Avalos (@raphael_avalos)

Excited to share the technical report on Command R7B (7B) and Command A (111B), our flagship model! These models are the result of incredible teamwork at cohere, and it was an honor to be part of it. Report: cohere.com/research/paper…

Raphael Avalos (@raphael_avalos)

🚀 Excited to share the 3rd outcome of my internship at @CohereAI: a new RL algo for agentic LLMs that combines policy learning and world modeling, letting agents verify actions before executing them. Check out the 🧵 and 📄! Big thanks to my co-authors and Cohere’s RL team 🙏