Antoine Moulin (@antoine_mln)'s Twitter Profile
Antoine Moulin

@antoine_mln

doing a PhD in RL/online learning on questions related to exploration and adaptivity

ID: 1294220396468854784

Link: https://antoine-moulin.github.io/ · Joined: 14-08-2020 10:32:54

198 Tweets

1.1K Followers

433 Following

Yannis Flet-Berliac (@yfletberliac):

Would you like to use Q-learning for LLM fine-tuning? Check out our new preprint where we interpret Q-functions as logits of the LLM: arxiv.org/abs/2505.11081 ✨ Work done with my based colleagues at Cohere.
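
To make the "Q-functions as logits" idea concrete, here is a minimal toy sketch of the general recipe, assuming a standard soft Q-learning target; the linear stand-in "LM head", the single transition, and all hyperparameters are invented for illustration, and this is not the preprint's actual algorithm.

```python
# Toy sketch: treat a language model head's logits as Q-values and fine-tune
# by regressing one logit onto a soft Q-learning target. Illustration only;
# NOT the algorithm of arxiv.org/abs/2505.11081.
import torch
import torch.nn.functional as F

vocab_size, hidden = 16, 8
lm_head = torch.nn.Linear(hidden, vocab_size)  # stand-in for an LLM's head
state = torch.randn(1, hidden)                 # stand-in for a hidden state

q_values = lm_head(state)                # logits == Q(s, .) under this view
policy = F.softmax(q_values, dim=-1)     # the usual softmax over next tokens

# One soft Q-learning regression step on an invented (s, a, r, s') transition,
# with soft value V(s') = tau * logsumexp(Q(s', .) / tau).
action, reward, gamma, tau = 3, 1.0, 0.99, 1.0
next_state = torch.randn(1, hidden)
with torch.no_grad():
    next_q = lm_head(next_state)
    target = reward + gamma * tau * torch.logsumexp(next_q / tau, dim=-1)

loss = F.mse_loss(q_values[0, action], target.squeeze())
loss.backward()  # gradients flow into the head exactly as in fine-tuning
```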

Rahma (@rahma_chaa):

It's been an incredible experience working on Gemini Diffusion. So much pride in what we've accomplished bringing this from a small research project to an I/O launch

Ivana Balazevic (@ibalazevic):

🚀 Meet Gemini Diffusion, our first diffusion-based and super fast language model, just announced at Google I/O! 🚀 Very excited to be able to share what I've been working on for the past little while with our amazing small team at Google DeepMind.

Edouard Leurent (@eleurent):

Excited to share what I've been up to: Gemini Diffusion is FAST! I'm convinced this will revolutionise iterative workflows: refine, get instant feedback, repeat! So proud of what our small team achieved here!

Blanca Huergo (@blancahuergo):

Very excited to share what I have been working on. Having been part of the Gemini Diffusion team since day one, it is amazing to see our model demoed at Google I/O :) sign up below to try it out!

Brendan O'Donoghue (@bodonoghue85):

Excited to share what my team has been working on lately - Gemini Diffusion! We bring diffusion to language modeling, yielding more power and blazing speeds! 🚀🚀🚀 Gemini Diffusion is especially strong at coding. In this example the model generates at 2000 tokens/sec,
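
For intuition on where those speeds come from: a diffusion language model can predict every position in parallel and refine the sequence over a handful of denoising steps, rather than decoding one token at a time. The loop below is a generic, made-up sketch of that masked-refinement pattern (random stand-in model, invented sizes); Gemini Diffusion's real sampler and architecture aren't described in the tweet.

```python
# Generic sketch of parallel iterative denoising for text (not Gemini
# Diffusion's actual method): start fully masked, then repeatedly predict
# all positions at once and commit the most confident ones.
import torch

torch.manual_seed(0)
vocab, seq_len, mask_id, steps = 100, 12, 0, 4

def denoiser(tokens):
    """Stand-in for the learned model: logits for every position in one
    forward pass -- this parallelism is the source of the speed."""
    return torch.randn(tokens.shape[0], vocab)

tokens = torch.full((seq_len,), mask_id)       # start from an all-mask canvas
for step in range(steps):
    logits = denoiser(tokens)                  # one parallel forward pass
    conf = logits.max(dim=-1)
    k = seq_len * (step + 1) // steps          # unmask more positions each round
    keep = torch.topk(conf.values, k).indices
    tokens[keep] = conf.indices[keep]          # commit confident predictions
print(tokens)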

Gergely Neu (@neu_rips):

New work on computing distances between stochastic processes **based on sample paths only**! We can now:
- learn distances between Markov chains
- extract "encoder-decoder" pairs for representation learning
- with sample- and computational-complexity guarantees

Details below 1/n
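
As a point of contrast for the Markov-chain case, here is the naive plug-in baseline one would write first: estimate each transition kernel from its sample path, then compare the estimates. States, matrices, smoothing, and the final metric below are all invented for illustration; the thread's actual distance and its guarantees are not reproduced here.

```python
# Naive plug-in baseline for comparing two finite Markov chains from sample
# paths only. Illustration of the problem setup, NOT the thread's method.
import numpy as np

rng = np.random.default_rng(0)

def sample_path(P, length, rng):
    """Sample a state path from transition matrix P, uniform start."""
    n = P.shape[0]
    path = [rng.integers(n)]
    for _ in range(length - 1):
        path.append(rng.choice(n, p=P[path[-1]]))
    return path

def estimate_kernel(path, n):
    """Empirical transition matrix with add-one (Laplace) smoothing."""
    counts = np.ones((n, n))
    for s, t in zip(path[:-1], path[1:]):
        counts[s, t] += 1
    return counts / counts.sum(axis=1, keepdims=True)

n = 3
P = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]])
Q = np.array([[0.4, 0.3, 0.3], [0.3, 0.4, 0.3], [0.3, 0.3, 0.4]])

P_hat = estimate_kernel(sample_path(P, 5000, rng), n)
Q_hat = estimate_kernel(sample_path(Q, 5000, rng), n)

# Worst-case (over states) total variation between the estimated kernels.
dist = 0.5 * np.abs(P_hat - Q_hat).sum(axis=1).max()
print(f"plug-in kernel distance: {dist:.3f}")
```

The Laplace smoothing only keeps rows with no observed transitions well-defined; the interesting part of the announced work is precisely doing better than this kind of plug-in estimate, with complexity guarantees.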

Dimitri Meunier (@dimitrimeunier1):

🚨 New paper accepted at SIMODS! 🚨 "Nonlinear Meta-learning Can Guarantee Faster Rates" arxiv.org/abs/2307.10870 When does meta-learning work? Spoiler: generalise to new tasks by overfitting on your training tasks! Here is why: 🧵👇
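
A minimal sketch of the two-stage pattern behind that spoiler, assuming the usual shared-representation setup (the architecture, task distribution, and training loop are all invented; this is not the paper's estimator): fit a shared nonlinear feature map until the training tasks are essentially interpolated, then learn only a linear head on a new task.

```python
# Toy two-stage meta-learning sketch: overfit a shared representation on
# training tasks, then fit a linear head on a new task. Illustration only;
# NOT the estimator of arxiv.org/abs/2307.10870.
import torch

torch.manual_seed(0)
d, r, n_tasks, n = 10, 3, 20, 50

def true_features(x):
    """Ground-truth shared nonlinear feature map (tasks differ only in heads)."""
    return torch.tanh(x[:, :r])

tasks = []
for _ in range(n_tasks):
    w = torch.randn(r)
    x = torch.randn(n, d)
    y = true_features(x) @ w + 0.1 * torch.randn(n)
    tasks.append((x, y, torch.nn.Linear(r, 1, bias=False)))

# Stage 1: jointly train the shared representation and per-task heads,
# driving the training-task error toward zero ("overfitting" the tasks).
phi = torch.nn.Sequential(torch.nn.Linear(d, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, r))
params = list(phi.parameters()) + [p for *_, h in tasks for p in h.parameters()]
opt = torch.optim.Adam(params, lr=1e-2)
for _ in range(500):
    loss = sum(((h(phi(x)).squeeze() - y) ** 2).mean() for x, y, h in tasks)
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: on a new task, freeze phi and fit only a linear head.
x_new = torch.randn(n, d)
y_new = true_features(x_new) @ torch.randn(r) + 0.1 * torch.randn(n)
with torch.no_grad():
    feats = phi(x_new)
w_new = torch.linalg.lstsq(feats, y_new.unsqueeze(1)).solution
# w_new is the new task's head; generalisation to it now hinges on phi.
```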

Dylan Foster 🐢 (@canondetortugas):

Dhruv Rohatgi will be giving a lecture on our recent work on comp-stat tradeoffs in next-token prediction at the RL Theory virtual seminar series (RL Theory Virtual Seminars) tomorrow at 2pm EST! Should be a fun talk---come check it out!!

Luca Viano (@lucaviano4):

Our new preprint is online! Structural assumptions on the MDP help in imitation learning, even offline :) Joint work with Gergely Neu and Antoine Moulin 😎

RL Theory Virtual Seminars (@rltheory):

Join us tomorrow for Dave's talk! He will present his recent work on randomised exploration, which received an outstanding paper award at ALT 2025 earlier this year.

Kevin Han Huang (@kevinhanhuang1):

Meanwhile, excited to be in #Lyon for #COLT2025, with a co-first author paper (arxiv.org/abs/2502.15752) with the amazing team -- Matthew Mallory and our advisor Morgane Austern! Keywords: Gaussian universality, dependent data, convex Gaussian min-max theorem, data augmentation!