Justin Deschenaux (@jdeschena) 's Twitter Profile
Justin Deschenaux

@jdeschena

PhD student @EPFL working with Caglar Gulcehre. Casting the forces of gradient descent 🧙‍♂️ Working on diffusion language models ⚡️

ID: 1462676204

https://jdeschena.github.io · Joined: 27-05-2013 17:16:58

322 Tweets

371 Followers

490 Following

Subham Sahoo (@ssahoo_) 's Twitter Profile Photo

🚨 [New paper alert] Esoteric Language Models (Eso-LMs)

First Diffusion LM to support KV caching w/o compromising parallel generation.

🔥 Sets new SOTA on the sampling speed–quality Pareto frontier 🔥
🚀 65× faster than MDLM
⚡ 4× faster than Block Diffusion

📜 Paper:
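
For readers less familiar with the mechanism the tweet highlights, here is a minimal, generic sketch of KV caching in attention decoding, written in plain PyTorch with toy dimensions and a single head. It illustrates only the standard cache (each new token attends against stored keys and values instead of re-encoding the whole prefix); it is not the Eso-LM scheme itself, whose contribution is making such caching compatible with parallel diffusion-style generation.

```python
import torch

# Generic KV-cache sketch (assumptions: single head, toy width d=64,
# random projection matrices; illustrative only, not Eso-LMs).
d = 64
W_q, W_k, W_v = (torch.randn(d, d) for _ in range(3))
k_cache, v_cache = [], []

def decode_step(h_new):
    """h_new: (1, d) hidden state of the newly generated position."""
    q = h_new @ W_q
    k_cache.append(h_new @ W_k)          # store this position's key...
    v_cache.append(h_new @ W_v)          # ...and value, once, in the cache
    K = torch.cat(k_cache, dim=0)        # (t, d) keys of all positions so far
    V = torch.cat(v_cache, dim=0)
    attn = torch.softmax(q @ K.T / d**0.5, dim=-1)
    return attn @ V                      # (1, d) attended output

for _ in range(5):                       # each step reuses, rather than recomputes, the prefix
    out = decode_step(torch.randn(1, d))
```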
Chris Wendler (@wendlerch) 's Twitter Profile Photo

How do diffusion models create images, and can we control that process? We are excited to release an update to our SDXL Turbo sparse autoencoder paper. New title: One Step is Enough: Sparse Autoencoders for Text-to-Image Diffusion Models. Spoiler: We have FLUX SAEs now :)
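
For context on what an "SDXL Turbo / FLUX SAE" is: a sparse autoencoder reconstructs a model's internal activations through a wide, sparsely activated bottleneck, so that individual latents tend to correspond to interpretable features one can inspect or steer. The sketch below is a generic TopK-style SAE in PyTorch; the widths, the TopK sparsity mechanism, and the plain MSE objective are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    """Generic TopK sparse autoencoder over cached activations (illustrative)."""
    def __init__(self, d_model=1280, n_latents=16384, k=32):  # toy sizes (assumptions)
        super().__init__()
        self.enc = nn.Linear(d_model, n_latents)
        self.dec = nn.Linear(n_latents, d_model)
        self.k = k

    def forward(self, h):
        z = F.relu(self.enc(h))                     # non-negative feature activations
        top = torch.topk(z, self.k, dim=-1)         # keep only the k largest features
        z_sparse = torch.zeros_like(z).scatter_(-1, top.indices, top.values)
        return self.dec(z_sparse), z_sparse         # reconstruction + sparse code

sae = SparseAutoencoder()
h = torch.randn(8, 1280)                            # stand-in for diffusion-model activations
h_hat, z = sae(h)
loss = F.mse_loss(h_hat, h)                         # reconstruction objective (assumption)
```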

Mikhail Terekhov (@miterekhov) 's Twitter Profile Photo

AI Control is a promising approach for mitigating misalignment risks, but will it be widely adopted? The answer depends on cost. Our new paper introduces the Control Tax—how much does it cost to run the control protocols? (1/8) 🧵

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr) 's Twitter Profile Photo

The Diffusion Duality "The arg max operation transforms Gaussian diffusion into Uniform-state diffusion" Adapts consistency distillation to diffusion language models, unlocking few-step generation by accelerating sampling by two orders of magnitude. Introduces a curriculum

The Diffusion Duality 

"The arg max operation transforms Gaussian diffusion into Uniform-state diffusion"

Adapts consistency distillation to diffusion language models, unlocking few-step generation and accelerating sampling by two orders of magnitude.

Introduces a curriculum
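
The quoted claim - that the arg max turns Gaussian diffusion into uniform-state discrete diffusion - can be checked numerically. The toy snippet below is my own illustration, not code from the paper: the vocabulary size, noise level, and one-hot embedding are assumptions, and it only shows the marginal behaviour (the clean token survives with some probability, otherwise the arg max lands roughly uniformly on the rest of the vocabulary), not the consistency-distillation curriculum the paper builds on top of it.

```python
import torch

V = 8                        # toy vocabulary size (assumption)
n_samples = 200_000
alpha, sigma = 0.7, 1.0      # Gaussian diffusion signal / noise levels (assumption)

x = torch.zeros(V)
x[3] = 1.0                   # clean token = index 3, as a one-hot embedding

noise = torch.randn(n_samples, V)
w_t = alpha * x + sigma * noise       # Gaussian forward diffusion of the one-hot
tokens = w_t.argmax(dim=-1)           # arg max collapses the latent back to a token

probs = torch.bincount(tokens, minlength=V).float() / n_samples
print(probs)   # mass concentrated on index 3; the rest is ~uniform over the other tokens
```

As sigma grows relative to alpha, the clean token's survival probability shrinks and the arg max distribution approaches uniform, which is the relationship the paper exploits to transfer Gaussian-side tools such as consistency distillation to the discrete side.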
Subham Sahoo (@ssahoo_) 's Twitter Profile Photo

🚨 “The Diffusion Duality” is out at the International Conference on Machine Learning (ICML)! ⚡️

Few-step generation in discrete diffusion language models by exploiting the underlying Gaussian diffusion.

🦾 Beats AR on 3/7 zero-shot likelihood benchmarks.

📄 Paper: arxiv.org/abs/2506.10892
💻 Code: github.com/s-sahoo/duo 🧠

Aaron Gokaslan (@skyli0n) 's Twitter Profile Photo

Check out our recent paper on the "duality" between discrete and Gaussian diffusion. We show how you can exploit that relationship to massively speed up discrete diffusion by two orders of magnitude.

AK (@_akhaliq) 's Twitter Profile Photo

The Diffusion Duality: unlocks few-step generation in discrete diffusion language models via the underlying Gaussian diffusion.

Sander Dieleman (@sedielem) 's Twitter Profile Photo

This work uncovers a profound connection between continuous and discrete (non-absorbing) diffusion models, allowing transfer of advanced techniques such as consistency distillation to the discrete setting! Also: amazing title, no notes! 🧑‍🍳😙🤌

Jonathan Whitaker (@johnowhitaker) 's Twitter Profile Photo

I did another video on the paper 'The Diffusion Duality', continuing the series of me trying to understand diffusion applied to language models :) Link: youtube.com/watch?v=o_ISAl… I shied away from some of the scarier math - hope my hand-waving is still vaguely useful + correct!

Xiuying Wei@Neurips (Wed11am East #2010) (@xiuyingwei966) 's Twitter Profile Photo

Curious about making Transformers faster on long sequences without compromising accuracy? ⚡️🧠 Meet RAT—an intermediate design between RNN and softmax attention. The results? Faster and lighter like RNNs, with strong performance like Attention! 🐭✨
