Antoine Moulin (@antoine_mln)'s Twitter Profile
Antoine Moulin

@antoine_mln

doing a PhD in RL/online learning on questions related to exploration and adaptivity

ID: 1294220396468854784

Link: https://antoine-moulin.github.io/ · Joined: 14-08-2020 10:32:54

198 Tweets

1.1K Followers

433 Following

Yannis Flet-Berliac (@yfletberliac):

Would you like to use Q-learning for LLM fine-tuning? Check out our new preprint where we interpret Q-functions as logits of the LLM: arxiv.org/abs/2505.11081 ✨ Work done with my based colleagues at Cohere.
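
To make the "Q-functions as logits" idea concrete, here is a minimal toy sketch of the general recipe, assuming a standard soft Q-learning target; the linear stand-in "LM head", the single transition, and all hyperparameters are invented for illustration, and this is not the preprint's actual algorithm.

```python
# Toy sketch: treat a language model head's logits as Q-values and fine-tune
# by regressing one logit onto a soft Q-learning target. Illustration only;
# NOT the algorithm of arxiv.org/abs/2505.11081.
import torch
import torch.nn.functional as F

vocab_size, hidden = 16, 8
lm_head = torch.nn.Linear(hidden, vocab_size)  # stand-in for an LLM's head
state = torch.randn(1, hidden)                 # stand-in for a hidden state

q_values = lm_head(state)                # logits == Q(s, .) under this view
policy = F.softmax(q_values, dim=-1)     # the usual softmax over next tokens

# One soft Q-learning regression step on an invented (s, a, r, s') transition,
# with soft value V(s') = tau * logsumexp(Q(s', .) / tau).
action, reward, gamma, tau = 3, 1.0, 0.99, 1.0
next_state = torch.randn(1, hidden)
with torch.no_grad():
    next_q = lm_head(next_state)
    target = reward + gamma * tau * torch.logsumexp(next_q / tau, dim=-1)

loss = F.mse_loss(q_values[0, action], target.squeeze())
loss.backward()  # gradients flow into the head exactly as in fine-tuning
```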

Rahma (@rahma_chaa):

It's been an incredible experience working on Gemini Diffusion. So much pride in what we've accomplished bringing this from a small research project to an I/O launch

Ivana Balazevic (@ibalazevic):

🚀 Meet Gemini Diffusion, our first diffusion-based and super fast language model, just announced at Google I/O! 🚀 Very excited to be able to share what I've been working on for the past little while with our amazing small team at Google DeepMind.

Edouard Leurent (@eleurent):

Excited to share what I've been up to: Gemini Diffusion is FAST! I'm convinced this will revolutionise iterative workflows: refine, get instant feedback, repeat! So proud of what our small team achieved here!

Blanca Huergo (@blancahuergo):

Very excited to share what I have been working on. Having been part of the Gemini Diffusion team since day one, it is amazing to see our model demoed at Google I/O :) sign up below to try it out!

Brendan O'Donoghue (@bodonoghue85):

Excited to share what my team has been working on lately - Gemini Diffusion! We bring diffusion to language modeling, yielding more power and blazing speeds! 🚀🚀🚀 Gemini Diffusion is especially strong at coding. In this example the model generates at 2000 tokens/sec,
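
For intuition on where those speeds come from: a diffusion language model can predict every position in parallel and refine the sequence over a handful of denoising steps, rather than decoding one token at a time. The loop below is a generic, made-up sketch of that masked-refinement pattern (random stand-in model, invented sizes); Gemini Diffusion's real sampler and architecture aren't described in the tweet.

```python
# Generic sketch of parallel iterative denoising for text (not Gemini
# Diffusion's actual method): start fully masked, then repeatedly predict
# all positions at once and commit the most confident ones.
import torch

torch.manual_seed(0)
vocab, seq_len, mask_id, steps = 100, 12, 0, 4

def denoiser(tokens):
    """Stand-in for the learned model: logits for every position in one
    forward pass -- this parallelism is the source of the speed."""
    return torch.randn(tokens.shape[0], vocab)

tokens = torch.full((seq_len,), mask_id)       # start from an all-mask canvas
for step in range(steps):
    logits = denoiser(tokens)                  # one parallel forward pass
    conf = logits.max(dim=-1)
    k = seq_len * (step + 1) // steps          # unmask more positions each round
    keep = torch.topk(conf.values, k).indices
    tokens[keep] = conf.indices[keep]          # commit confident predictions
print(tokens)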

Gergely Neu (@neu_rips):

New work on computing distances between stochastic processes **based on sample paths only**! We can now:
- learn distances between Markov chains
- extract "encoder-decoder" pairs for representation learning
- with sample- and computational-complexity guarantees

Details below 1/n
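
As a point of contrast for the Markov-chain case, here is the naive plug-in baseline one would write first: estimate each transition kernel from its sample path, then compare the estimates. States, matrices, smoothing, and the final metric below are all invented for illustration; the thread's actual distance and its guarantees are not reproduced here.

```python
# Naive plug-in baseline for comparing two finite Markov chains from sample
# paths only. Illustration of the problem setup, NOT the thread's method.
import numpy as np

rng = np.random.default_rng(0)

def sample_path(P, length, rng):
    """Sample a state path from transition matrix P, uniform start."""
    n = P.shape[0]
    path = [rng.integers(n)]
    for _ in range(length - 1):
        path.append(rng.choice(n, p=P[path[-1]]))
    return path

def estimate_kernel(path, n):
    """Empirical transition matrix with add-one (Laplace) smoothing."""
    counts = np.ones((n, n))
    for s, t in zip(path[:-1], path[1:]):
        counts[s, t] += 1
    return counts / counts.sum(axis=1, keepdims=True)

n = 3
P = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]])
Q = np.array([[0.4, 0.3, 0.3], [0.3, 0.4, 0.3], [0.3, 0.3, 0.4]])

P_hat = estimate_kernel(sample_path(P, 5000, rng), n)
Q_hat = estimate_kernel(sample_path(Q, 5000, rng), n)

# Worst-case (over states) total variation between the estimated kernels.
dist = 0.5 * np.abs(P_hat - Q_hat).sum(axis=1).max()
print(f"plug-in kernel distance: {dist:.3f}")
```

The Laplace smoothing only keeps rows with no observed transitions well-defined; the interesting part of the announced work is precisely doing better than this kind of plug-in estimate, with complexity guarantees.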

Dimitri Meunier (@dimitrimeunier1):

🚨 New paper accepted at SIMODS! 🚨 "Nonlinear Meta-learning Can Guarantee Faster Rates" arxiv.org/abs/2307.10870 When does meta-learning work? Spoiler: generalise to new tasks by overfitting on your training tasks! Here is why: 🧵👇
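
A minimal sketch of the two-stage pattern behind that spoiler, assuming the usual shared-representation setup (the architecture, task distribution, and training loop are all invented; this is not the paper's estimator): fit a shared nonlinear feature map until the training tasks are essentially interpolated, then learn only a linear head on a new task.

```python
# Toy two-stage meta-learning sketch: overfit a shared representation on
# training tasks, then fit a linear head on a new task. Illustration only;
# NOT the estimator of arxiv.org/abs/2307.10870.
import torch

torch.manual_seed(0)
d, r, n_tasks, n = 10, 3, 20, 50

def true_features(x):
    """Ground-truth shared nonlinear feature map (tasks differ only in heads)."""
    return torch.tanh(x[:, :r])

tasks = []
for _ in range(n_tasks):
    w = torch.randn(r)
    x = torch.randn(n, d)
    y = true_features(x) @ w + 0.1 * torch.randn(n)
    tasks.append((x, y, torch.nn.Linear(r, 1, bias=False)))

# Stage 1: jointly train the shared representation and per-task heads,
# driving the training-task error toward zero ("overfitting" the tasks).
phi = torch.nn.Sequential(torch.nn.Linear(d, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, r))
params = list(phi.parameters()) + [p for *_, h in tasks for p in h.parameters()]
opt = torch.optim.Adam(params, lr=1e-2)
for _ in range(500):
    loss = sum(((h(phi(x)).squeeze() - y) ** 2).mean() for x, y, h in tasks)
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: on a new task, freeze phi and fit only a linear head.
x_new = torch.randn(n, d)
y_new = true_features(x_new) @ torch.randn(r) + 0.1 * torch.randn(n)
with torch.no_grad():
    feats = phi(x_new)
w_new = torch.linalg.lstsq(feats, y_new.unsqueeze(1)).solution
# w_new is the new task's head; generalisation to it now hinges on phi.
```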

Dylan Foster 🐢 (@canondetortugas):

Dhruv Rohatgi will be giving a lecture on our recent work on comp-stat tradeoffs in next-token prediction at the RL Theory virtual seminar series (RL Theory Virtual Seminars) tomorrow at 2pm EST! Should be a fun talk---come check it out!!

Luca Viano (@lucaviano4):

Our new preprint is online! Structural assumptions on the MDP help in imitation learning, even offline :) Joint work with Gergely Neu and Antoine Moulin 😎

RL Theory Virtual Seminars (@rltheory):

Join us tomorrow for Dave's talk! He will present his recent work on randomised exploration, which received an outstanding paper award at ALT 2025 earlier this year.

Kevin Han Huang (@kevinhanhuang1):

Meanwhile, excited to be in #Lyon for #COLT2025, with a co-first author paper (arxiv.org/abs/2502.15752) with the amazing team -- Matthew Mallory and our advisor Morgane Austern! Keywords: Gaussian universality, dependent data, convex Gaussian min-max theorem, data augmentation!