Shuo Chen (@an_epsilon0)'s Twitter Profile
Shuo Chen

@an_epsilon0

Prev. Undergraduate in Math & Physics @Tsinghua_Uni | vision & generation

ID: 1504319918585622530

Link: https://chenshuo20.github.io | Joined: 17-03-2022 04:53:28

51 Tweets

116 Followers

243 Following

Jason Zada (@jasonzada)

Let's chat about VEO2 and The Heist. This is a multi-thread post that will discuss prompts and techniques. As an overview, this test was mainly to see if you could tell and cut together a short story (albeit simple) via text-to-video. Also note that VEO2 is in beta.

Yen-Chen Lin (@yen_chen_lin)

Video generation models exploded onto the scene in 2024, sparked by the release of Sora from OpenAI. I wrote a blog post on key techniques that are used in building large video generation models: yenchenlin.me/blog/2025/01/0…

Luma AI (@lumalabsai)

Introducing Ray2, a new frontier in video generative models. Scaled to 10x compute, #Ray2 creates realistic videos with natural and coherent motion, unlocking new freedoms of creative expression and visual storytelling. Available now. Learn more lumalabs.ai/ray.

Boyuan Chen (@boyuanchen0)

Announcing Diffusion Forcing Transformer (DFoT), our new video diffusion algorithm that generates ultra-long videos of 800+ frames. DFoT enables History Guidance, a simple add-on to any existing video diffusion model for a quality boost. Website: boyuan.space/history-guidan… (1/7)
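The tweet doesn't spell out how History Guidance works, but as a guidance-style "add-on" it plausibly follows the classifier-free-guidance pattern: extrapolate the history-conditioned prediction away from the history-free one at sampling time. A minimal sketch of that pattern, assuming this reading (the function name, scale value, and scalar predictions are all my own illustration, not the DFoT code):

```python
# Hypothetical sketch of a guidance-style add-on, assuming History Guidance
# follows the classifier-free-guidance recipe: the prediction conditioned on
# history frames is extrapolated away from the history-free prediction,
# strengthening consistency with past frames at sampling time.

def history_guidance(pred_with_history, pred_without_history, scale=2.0):
    """Extrapolate toward the history-conditioned prediction.

    scale=1.0 recovers the conditioned prediction; scale>1.0 pushes
    further in the direction history conditioning suggests.
    """
    return pred_without_history + scale * (pred_with_history - pred_without_history)

# Toy scalars standing in for denoiser outputs (real use would be tensors).
guided = history_guidance(1.0, 0.5, scale=2.0)
```

With identical conditioned and unconditioned predictions the guidance is a no-op, which is the sanity check one would expect from any CFG-style rule.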

Luma AI (@lumalabsai)

Today, we release Inductive Moment Matching (IMM): a new pre-training paradigm breaking the algorithmic ceiling of diffusion models. Higher sample quality. 10x more efficient. Single-stage, single network, stable training. Read more: lumalabs.ai/news/imm

Fangfu Liu (@fangfu0830)

🚀🚀🚀Introducing VideoScene (CVPR'25) - a turbo upgrade of ReconX! Our one-step video diffusion model bridges the gap from video to 3D, outpacing slow multi-step pipelines. Paper: arxiv.org/abs/2504.01956 Project Page: hanyang-21.github.io/VideoScene Code: github.com/hanyang-21/Vid…

Jonathan Jacobi (@j0nathanj)

Introducing Multiverse: the first AI-generated multiplayer game. Multiplayer was the missing piece in AI-generated worlds — now it’s here. Players can interact and shape a shared AI-simulated world, in real-time. Training and research cost < $1.5K. Run it on your own PC. We

Xun Huang (@xunhuang1995)

A video generator must satisfy 3 criteria to be a world model: 1️⃣ Causality: Past affects future, not vice versa. 2️⃣ Persistence: The world shouldn't change because you looked away. 3️⃣ Constant Speed: Simulation shouldn't slow down over time. We believe SSMs are a natural fit:
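Why SSMs fit these criteria comes down to their recurrence: a fixed-size hidden state carries the past forward, so each step is causal and costs the same no matter how long the rollout. A toy sketch of that property (my own illustration with made-up scalar coefficients, not any particular SSM architecture):

```python
# Toy linear state-space recurrence, h_t = A*h_{t-1} + B*x_t.
# The state has fixed size, so every step is O(1) regardless of rollout
# length ("constant speed"), and the state only moves forward in time
# ("causality"). Information from past inputs persists in the state
# even after the input stops ("persistence").

def ssm_step(h, x, A=0.9, B=0.1):
    """One recurrent update: new state from old state and current input."""
    return A * h + B * x

def rollout(inputs):
    """Run the recurrence over a sequence; per-step cost never grows."""
    h = 0.0
    states = []
    for x in inputs:
        h = ssm_step(h, x)
        states.append(h)
    return states

# An impulse at t=0 decays geometrically but never needs recomputing:
# the state remembers it at constant cost.
states = rollout([1.0, 0.0, 0.0])
```

Contrast with full attention, where step t attends over all t previous tokens, so later steps get slower — exactly the "constant speed" failure the tweet calls out.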

Tianyuan Zhang (@tianyuanzhang99)

Bored of linear recurrent memories (e.g., linear attention) and want a scalable, nonlinear alternative? Our new paper “Test-Time Training Done Right” proposes LaCT (Large Chunk Test-Time Training) — a highly efficient, massively scalable nonlinear memory with: 💡 Pure PyTorch
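The tweet only names the idea, but test-time training in general treats memory as fast weights updated by gradient descent on a reconstruction loss, and "large chunk" suggests one update per big chunk rather than per token. A minimal caricature under that assumption (scalar memory, made-up loss and learning rate — not the LaCT implementation):

```python
# Caricature of chunked test-time training, assuming the standard
# fast-weights framing: memory is a weight w updated by gradient steps
# on the loss sum((w*k - v)^2) over (key, value) pairs. Updating once
# per large chunk amortizes the cost versus per-token updates.

def chunk_update(w, chunk, lr=0.1):
    """One gradient step of sum((w*k - v)^2) averaged over a chunk."""
    grad = sum(2 * (w * k - v) * k for k, v in chunk)
    return w - lr * grad / len(chunk)

def chunked_ttt_memory(pairs, chunk_size=4):
    """Update the fast-weight memory once per chunk of (key, value) pairs."""
    w = 0.0
    for i in range(0, len(pairs), chunk_size):
        w = chunk_update(w, pairs[i:i + chunk_size])
    return w

# With v = 2*k throughout, the memory should move toward w ≈ 2.
pairs = [(k, 2.0 * k) for k in [1.0, 2.0, 1.5, 0.5] * 5]
w = chunked_ttt_memory(pairs)
```

The real method is nonlinear (the memory is a small network, not a scalar), but the chunked update schedule is the part this sketch is meant to show.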

Xun Huang (@xunhuang1995)

Real-time video generation is finally real — without sacrificing quality. Introducing Self-Forcing, a new paradigm for training autoregressive diffusion models. The key to high quality? Simulate the inference process during training by unrolling transformers with KV caching.
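The stated key idea — simulate inference during training by unrolling with KV caching — can be sketched abstractly: generate frames one at a time, with each frame attending over cached keys/values from all earlier frames instead of recomputing them. A toy illustration of that loop (the attention rule and update are invented scalars, not the Self-Forcing code):

```python
# Toy sketch of autoregressive rollout with a KV cache, mirroring the
# tweet's description: training unrolls the model exactly as at inference,
# and each new frame attends over cached (key, value) entries from all
# previously generated frames rather than recomputing them.

def attend(query, kv_cache):
    """Toy attention: average cached values, weighted by key similarity."""
    if not kv_cache:
        return 0.0
    weights = [1.0 / (1.0 + abs(query - k)) for k, _ in kv_cache]
    total = sum(weights)
    return sum(w * v for w, (_, v) in zip(weights, kv_cache)) / total

def unroll(num_frames):
    """Autoregressive rollout: generation during training matches inference."""
    kv_cache, frames = [], []
    frame = 1.0  # initial frame
    for _ in range(num_frames):
        context = attend(frame, kv_cache)    # reuse cached past frames
        frame = 0.5 * frame + 0.5 * context  # toy "denoise" of next frame
        kv_cache.append((frame, frame))      # cache this frame's key/value
        frames.append(frame)
    return frames

frames = unroll(4)
```

The point of unrolling like this during training is that the model learns on its own generated history (as at inference) rather than only on ground-truth frames, which is what "simulate the inference process during training" describes.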

Albert Gu (@_albertgu)

I converted one of my favorite talks I've given over the past year into a blog post.

"On the Tradeoffs of SSMs and Transformers"
(or: tokens are bullshit)

In a few days, we'll release what I believe is the next major advance for architectures.
Xun Huang (@xunhuang1995)

What exactly is a "world model"? And what limits existing video generation models from being true world models? In my new blog post, I argue that a true video world model must be causal, interactive, persistent, real-time, and physically accurate. xunhuang.me/blogs/world_mo…

Google DeepMind (@googledeepmind)

What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵

Jiwen Yu (@yujiwenhk)

🚀 My first tweet! (1/n) Thrilled to share our new work: Context-as-Memory (CaM) — tackling the memory problem in Video World Model! Our idea: context=memory. By leveraging context, CaM preserves consistency across generations (like Genie 3). 🎥 Check out our demo video below!

KREA AI (@krea_ai)

today, we're taking another step towards the future. introducing our first Real-time Video generation model. join the beta 👇