Cheng Lu (@clu_cheng)'s Twitter Profile
Cheng Lu

@clu_cheng

Member of technical staff @OpenAI. PhD @Tsinghua_Uni. Interested in scalable generative models.

ID: 1235901818808352768

Link: https://luchengthu.github.io

Joined: 06-03-2020 12:15:36

111 Tweets

5.5K Followers

179 Following

Chongxuan Li (@lichongxuan)

🚀【Large Language Diffusion Models】#DiffusionModels #LLM #LLaDA We built LLaDA-8B, the FIRST non-autoregressive model rivaling LLaMA3! CRUSHES Llama2-7B on ~20 tasks while unlocking ICL/instruction-following/multi-turn chat

šŸš€ć€Large Language Diffusion Models怑#DiffusionModels #LLM #LLaDA
We built LLaDA-8B—the FIRST non-autoregressive model rivaling LLaMA3! CRUSHES Llama2-7B on ~20 tasks while unlocking ICL/instruction-following/multi-turn chat
Cheng Lu (@clu_cheng)

Since the behaviors of consistency models are quite different in pixel and latent spaces, I wonder if using these new AEs can further improve the training of consistency models

Cheng Lu (@clu_cheng)

Still think consistency models are bad at scale? In fact, sCM can be stably scaled to modern text-to-image diffusion models and greatly improve the generation speed and 1-step generation quality!

Kenji Hata (@kenjihata)

if you want to see someone truly passionate about image generation, look no further than Gabriel Goh. he lives and breathes making image generation wonderful.

Cheng Lu (@clu_cheng)

Congrats to everyone who worked on the GPT-4o image generation team! It’s really impressive and I’m so proud of us! It was also quite enjoyable working with such a group of talented people!

Richard Sutton (@richardssutton)

I’ve changed so little. From my 1978 Bachelor’s thesis: “The adult human mind is very complex, but the question remains open whether the learning processes that constructed it in interaction with the environment are similarly complex. Much evidence and many peoples’ intuitions

Cheng Lu (@clu_cheng)

A very promising direction for real-time video generation! arxiv.org/abs/2506.01380 nextframed.github.io 1. You can always use DPM-Solver++ to accelerate your flow matching model. 2. sCM can even scale to video diffusion models and boost the sample quality a lot!

Boaz Barak (@boazbaraktcs)

I didn't want to post on Grok safety since I work at a competitor, but it's not about competition. I appreciate the scientists and engineers at xAI but the way safety was handled is completely irresponsible. Thread below.

Cheng Lu (@clu_cheng)

Congrats! This is an incredible milestone and I was truly shocked by it. “Thinking for hours” means 10x or even 100x of current test-time compute, and I can’t wait to see the model think for days, months, years, centuries to solve the science challenges!