chen zhuoming (@chenzhuoming911)'s Twitter Profile
chen zhuoming

@chenzhuoming911

Ph.D. @SCSATCMU; undergraduate @Tsinghua_Uni

ID: 1571117216543932417

Joined: 17-09-2022 12:42:03

56 Tweets

318 Followers

82 Following

Beidi Chen (@beidichen) 's Twitter Profile Photo

🤯🥳 Thrilled to see our MagicPIG (lsh-ai.com) inspiring DeepSeek's Native Sparse Attention design! We believe #sparsity is the key to scaling next-gen intelligence — from model parameters and contextual memory to lightning-fast inference. Instead of brute-force…

Beidi Chen (@beidichen) 's Twitter Profile Photo

🚀 So excited to see industry releases of long-context solutions!! Curious to see whether attention alternatives like DeepSeek NSA, Kimi.ai MoBA, and Qwen-Max 1M can truly reason over million-token contexts while capturing sparse relationships in noisy data. Our…

CMU School of Computer Science (@scsatcmu) 's Twitter Profile Photo

Huge thank you to NVIDIA Data Center for gifting a brand new #NVIDIADGX B200 to CMU’s Catalyst Research Group! This AI supercomputing system will afford Catalyst the ability to run and test their work on a world-class unified AI platform.

chen zhuoming (@chenzhuoming911) 's Twitter Profile Photo

🚨 Thrilled to present our Spotlight at #ICLR2025: "MagicPIG: LSH Sampling for Efficient LLM Generation" by Ranajoy Sadhukhan 🎉 💡 MagicPIG enables KV compression for long-context LLMs — where top-k falls short, sampling shines. ⚙️ Introduces CPU-GPU heterogeneous serving to boost…

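The core idea behind LSH-based KV sampling can be sketched in a few lines. This is only an illustration of the general technique, not MagicPIG's actual algorithm or estimator (see the paper for those); the data, hash size, and collision threshold below are all made up for the toy:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, n_bits = 64, 1000, 8

# Toy KV cache and query (hypothetical random data).
keys = rng.standard_normal((n, d))
values = rng.standard_normal((n, d))
query = rng.standard_normal(d)

# SimHash: random hyperplanes assign each vector an n_bits signature.
planes = rng.standard_normal((n_bits, d))
key_sig = keys @ planes.T > 0          # (n, n_bits)
q_sig = planes @ query > 0             # (n_bits,)

# Sample keys whose signatures mostly collide with the query's;
# keys with high similarity to the query collide more often.
matches = (key_sig == q_sig).sum(axis=1)
sampled = np.where(matches >= n_bits - 2)[0]

# Approximate attention over only the sampled entries,
# instead of a full pass over all n keys.
scores = keys[sampled] @ query / np.sqrt(d)
w = np.exp(scores - scores.max())
w /= w.sum()
out = w @ values[sampled]
print(sampled.size, out.shape)
```

The contrast with top-k is that sampling keeps a randomized, similarity-weighted subset rather than a hard cutoff, which is what the tweet's "where top-k falls short, sampling shines" refers to.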
chen zhuoming (@chenzhuoming911) 's Twitter Profile Photo

🚀 Thrilled to present MagicDec: Breaking the Latency–Throughput Tradeoff for Long Context Generation with Speculative Decoding at #ICLR2025! We break the long-standing inefficiency of speculative decoding — enabling ⚡ lower latency 📈 higher throughput 🔥 bigger speedups at…
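For readers unfamiliar with speculative decoding, here is a toy greedy version of the generic loop: a cheap draft model proposes a few tokens, and the expensive target model verifies them, accepting the longest matching prefix. The two "models" below are hypothetical stand-in functions, not MagicDec itself (MagicDec additionally drafts with a sparse-KV model so speedups hold at large batch sizes and long contexts):

```python
def draft_model(ctx):
    # cheap model: next token = (last + 1) % 10
    return (ctx[-1] + 1) % 10

def target_model(ctx):
    # expensive model: same rule, except after token 5 it emits 0
    return 0 if ctx[-1] == 5 else (ctx[-1] + 1) % 10

def speculative_decode(ctx, n_new, k=4):
    out = list(ctx)
    while len(out) < len(ctx) + n_new:
        # 1) draft k tokens cheaply
        draft, cur = [], list(out)
        for _ in range(k):
            t = draft_model(cur)
            draft.append(t)
            cur.append(t)
        # 2) verify: in a real system the target scores all k
        #    positions in one batched forward pass; it is sequential
        #    here only because the toy models are scalar functions
        cur = list(out)
        for t in draft:
            if target_model(cur) == t:
                out.append(t)
                cur.append(t)
            else:
                # first mismatch: take the target's token and stop
                out.append(target_model(cur))
                break
    return out[:len(ctx) + n_new]

print(speculative_decode([0], 8))  # → [0, 1, 2, 3, 4, 5, 0, 1, 2]
```

When the draft agrees with the target, several tokens are committed per expensive verification pass, which is where the latency savings come from.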

Xinle Cheng (@xinlec295) 's Twitter Profile Photo

⚡ 30+ FPS video generation is HERE! 💡 Our Next-Frame Diffusion (NFD) achieves 30+ FPS autoregressive video generation on an A100 GPU with SOTA quality! Try the demo: nextframed.github.io. Also huge thanks to Tianyu He…

Infini-AI-Lab (@infiniailab) 's Twitter Profile Photo

🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. 🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46% 🌐 Website: multiverse4fm.github.io 🧵 1/n

Infini-AI-Lab (@infiniailab) 's Twitter Profile Photo

🚀 Excited to introduce our latest work GRESO: a method that identifies and skips millions of uninformative prompts before rollout and achieves up to 2.0x wall-clock time speedup in training. More rollouts lead to better model performance, but they're also a major bottleneck in…

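One intuition for why skipping prompts before rollout can be lossless: in GRPO-style RL training, a prompt whose rollouts all receive the same reward yields zero advantage and hence no gradient. GRESO's actual selection rule is in the paper; the sketch below is only a hypothetical illustration of that filtering idea, skipping prompts whose recent reward history shows no variance (all names and thresholds are made up):

```python
from collections import defaultdict, deque

# prompt id -> mean rewards from the last few epochs (window of 3)
history = defaultdict(lambda: deque(maxlen=3))

def informative(prompt_id):
    h = history[prompt_id]
    if len(h) < h.maxlen:
        return True  # not enough history yet: keep the prompt
    # all-solved or all-failed recently -> likely zero advantage, skip
    return not (all(r == 1.0 for r in h) or all(r == 0.0 for r in h))

def record(prompt_id, mean_reward):
    history[prompt_id].append(mean_reward)

# usage: an easy prompt gets filtered after three perfect epochs
for _ in range(3):
    record("easy", 1.0)
record("hard", 0.4)
print(informative("easy"), informative("hard"))  # → False True
```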
chen zhuoming (@chenzhuoming911) 's Twitter Profile Photo

Actually, a very useful functionality. When I evaluated AIME for the first time, it took me two to three days to find a repository that gives correct results. Some evaluations take Hugging Face several days to run, and SGLang/vLLM is just too complicated (though faster)…

chen zhuoming (@chenzhuoming911) 's Twitter Profile Photo

This is my very first time winning a real paper award! Thanks to On-Device Learning for Foundation Models @ICML 25! I hold the belief that sparsity will enable real AGI, accessible to everyone, before it becomes a circus confined to 1M+ GPU clusters.