Xian Liu (@alvinliu27)'s Twitter Profile
Xian Liu

@alvinliu27

Research Scientist @ NVIDIA DIR Group, Ph.D. @ CUHK MMLab; Previous Intern @ NVIDIA Research, Snap Research, Tencent AI Lab, Shanghai AI Lab, SenseTime

ID: 1732061817286152192

Website: https://alvinliu0.github.io/ · Joined: 05-12-2023 15:39:00

27 Tweets

148 Followers

94 Following

Xian Liu (@alvinliu27)'s Twitter Profile Photo

#ICLR24 Welcome to check out our 2D human foundation GenAI work! Though super disappointed that I cannot attend in person due to a visa issue, please feel free to drop any questions or schedule a discussion :) Project Page: snap-research.github.io/HyperHuman/

Xian Liu (@alvinliu27)'s Twitter Profile Photo

#CVPR24 I will present HumanGaussian at Arch 4A-E #180, Jun 19 (Wed) 17:00–18:30, our highlight paper on Text-Driven 3D Human Generation with Gaussian Splatting. - Project: alvinliu0.github.io/projects/Human… - Code: github.com/alvinliu0/Huma… - Video: youtube.com/watch?v=S3djzH…

Xihui Liu (@xihuiliu)'s Twitter Profile Photo

AR-based visual generative models have attracted great attention, but their inference is much slower than diffusion-based models. We propose Speculative Jacobi Decoding (SJD) to accelerate auto-regressive text-to-image generation. huggingface.co/papers/2410.01… arxiv.org/pdf/2410.01699

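The fixed-point (Jacobi) view of autoregressive decoding that SJD builds on can be illustrated with a toy greedy example. This is only a sketch of the plain Jacobi iteration, not the paper's speculative variant, and `toy_next_token` is a hypothetical stand-in for a real model: all draft tokens are refined in parallel each step, and decoding stops once the draft is a fixed point, which matches sequential greedy decoding.

```python
def toy_next_token(prefix):
    # Stand-in "model": next token is a deterministic function of the prefix.
    return (sum(prefix) + 1) % 7

def jacobi_decode(prompt, n_new, max_iters=50):
    draft = list(prompt) + [0] * n_new  # arbitrary initial guesses for new tokens
    for _ in range(max_iters):
        # Jacobi step: recompute every new position from the *current* draft,
        # all in parallel (no position waits for the one before it).
        updated = [toy_next_token(draft[:len(prompt) + i]) for i in range(n_new)]
        if updated == draft[len(prompt):]:  # fixed point = greedy output
            break
        draft[len(prompt):] = updated
    return draft

print(jacobi_decode([1, 2], 4))  # converges to the sequential greedy result
```

In practice far fewer iterations than tokens are needed because a correct prefix stabilizes early and later positions then lock in quickly; SJD adds speculative acceptance on top of this loop.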
Eric Jang (@ericjang11)'s Twitter Profile Photo

1X NVIDIA Robotics First, 100h of EVE videos encoded with Nvidia’s new Cosmos tokenizer, which has good spatio-temporal compression ratios (1 token=8 temporal, 8x8 spatial). This means you can pack a longer, higher-quality video into fewer tokens. It’s free VRAM real estate! github.com/NVIDIA/Cosmos-…
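Under the stated ratio (1 token covering 8 frames temporally and an 8×8 pixel patch spatially), the token budget for a clip is simple arithmetic. A minimal sketch — the function name and the ceiling behavior are assumptions, and exact counts depend on the tokenizer variant (e.g. causal handling of the first frame):

```python
import math

def approx_token_count(frames, height, width, t=8, s=8):
    """Approximate token count for a video under a t x s x s
    compression ratio (here 8 temporal, 8x8 spatial)."""
    return math.ceil(frames / t) * math.ceil(height / s) * math.ceil(width / s)

# e.g. a 64-frame 256x256 clip: 8 * 32 * 32 = 8192 tokens
print(approx_token_count(64, 256, 256))
```

The point of the tweet follows directly: higher compression per token means the same token budget covers a longer or higher-resolution clip.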

Xian Liu (@alvinliu27)'s Twitter Profile Photo

Kindly check out our team's newly built SOTA visual tokenizer suite (for both images and videos)! All model weights and technical reports are available!

Xiao Fu (@lemonaddie0909)'s Twitter Profile Photo

Upon the release of Sora, I'm glad to share our 3DTrajMaster with the Kling AI team. It can control multiple entity motions in 3D space with entity-specific 3D trajectories for video generation. [1/3] 👉 Website: fuxiao0719.github.io/projects/3dtra… 👉 Code:

MrNeRF (@janusch_patas)'s Twitter Profile Photo

3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation Contributions: 1) We are the first to customize 6 degrees of freedom (DoF) multi-entity motion in 3D space for controllable video generation, establishing a new benchmark for fine-grained motion

Zhou Xian (@zhou_xian_)'s Twitter Profile Photo

Everything you love about generative models — now powered by real physics! Announcing the Genesis project — after a 24-month large-scale research collaboration involving over 20 research labs — a generative physics engine able to generate 4D dynamical worlds powered by a physics

NVIDIA AI (@nvidiaai)'s Twitter Profile Photo

📣 Announcing: #NVIDIACosmos The world foundation model development platform for advancing physical #AI systems such as autonomous vehicles and robots. Learn more 👉 nvda.ws/4fGKiDX

Jim Fan (@drjimfan)'s Twitter Profile Photo

Introducing NVIDIA Cosmos, an open-source, open-weight Video World Model. It's trained on 20M hours of videos, with model sizes from 4B to 14B parameters. Cosmos offers two flavors: diffusion (continuous tokens) and autoregressive (discrete tokens); and two generation modes: text->video and

Andrew Carr (e/🤸) (@andrew_n_carr)'s Twitter Profile Photo

The Cosmos world model from Nvidia (available on huggingface) is very impressive. I asked for a first person view of a robot walking through a forest. Nicely done

juju (@juxuan_27)'s Twitter Profile Photo

🥳Excited to share our work #FullDiT: Multi-Task Video Generative Foundation Model with Full Attention huggingface.co/papers/2503.19…, a unified foundation model for video generation that seamlessly integrates multiple conditions via unified full-attention mechanisms.