Dinghuai Zhang 张鼎怀 (@zdhnarsil) 's Twitter Profile
Dinghuai Zhang 张鼎怀

@zdhnarsil

Researcher at @MSFTResearch. Prev: PhD at @Mila_Quebec, intern at @Apple MLR and FAIR Labs @MetaAI, math undergraduate at @PKU1898.

ID: 2489747113

Link: http://zdhnarsil.github.io · Joined: 11-05-2014 11:55:14

652 Tweets

3.3K Followers

1.1K Following

Leo Dianbo Liu (@dianboliu) 's Twitter Profile Photo

We're pleased to have Prof. Yoshua Bengio (Professor of Computer Science, Université de Montréal) as a distinguished speaker at the National University of Singapore (NUS) 120 Distinguished Speaker Series!

Registration page: lnkd.in/gcPfwZ3T
Yuanqi Du (@yuanqid) 's Twitter Profile Photo

Scientific Knowledge Emerges in LLMs and YOU CAN Access It (via sampling)! 

🔥🔥🔥New blog to summarize what we have learned from evaluating LLMs for several optimization, decision-making, and planning problems in science with truly impressive performances!
Jiatao Gu (@thoma_gu) 's Twitter Profile Photo

I will be attending #ICLR2025 in person during Apr 24-28, and presenting our research: DART: Denoising Autoregressive Transformer
📌 Fri 25 Apr, 3:00-5:30 p.m. (UTC+8)

This is my first time visiting Singapore, and I am looking forward to chatting with old and new friends!

YCY (@yoyolicoris) 's Twitter Profile Photo

github.com/pytorch/audio/…
Torchaudio just announced it will be pure Python again. That means dropping efficient kernels like filtering, RNN-Transducer, etc. It's an unwise and disruptive decision tbh. 

If your work will be affected by this, please leave a comment there... 
<a href="/PyTorch/">PyTorch</a>
Carles Domingo-Enrich (@cdomingoenrich) 's Twitter Profile Photo

🚀Excited to open-source the code for Adjoint Matching, as part of a new repo centered around reward fine-tuning via stochastic optimal control! github.com/microsoft/soc-…
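For context, reward fine-tuning via stochastic optimal control is typically posed in the following textbook form (a generic sketch, not necessarily the exact objective used in this repo):

```latex
\min_u \; \mathbb{E}\!\left[ \int_0^1 \tfrac{1}{2}\,\lVert u(X_t, t)\rVert^2 \, dt \;-\; \lambda\, r(X_1) \right]
\quad \text{s.t.} \quad
dX_t = \bigl(b(X_t, t) + \sigma(t)\, u(X_t, t)\bigr)\, dt + \sigma(t)\, dW_t ,
```

where $b$ is the pretrained (base) drift, $u$ is the learned control, and $r$ is the reward on terminal samples.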

Qwen (@alibaba_qwen) 's Twitter Profile Photo

Introducing Qwen3! 

We release Qwen3, our latest large language models, with open weights, including 2 MoE models and 6 dense models ranging from 0.6B to 235B. Our flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general
Zhihong Shao (@zhs05232838) 's Twitter Profile Photo

We just released DeepSeek-Prover V2.
- Solves nearly 90% of miniF2F problems
- Significantly improves the SoTA performance on the PutnamBench
- Achieves a non-trivial pass rate on AIME 24 & 25 problems in their formal version

GitHub: github.com/deepseek-ai/De…
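As a toy illustration of what "formal version" means here: benchmark problems are stated and proved in a proof assistant such as Lean. A hypothetical miniF2F-style statement (assuming Lean 4 with Mathlib; this is not an actual benchmark problem) might look like:

```lean
-- A toy Lean 4 theorem in the style of formalized competition problems.
theorem toy_example (x : ℝ) (h : 2 * x + 3 = 7) : x = 2 := by
  linarith
```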
Yinuo Ren (@yinuo_ren) 's Twitter Profile Photo

We establish a fundamental link between the time-reversal of Markov processes and the generalized Doob’s h-transform. This connection enables the design of denoising generative models with an arbitrary generator.

Check our new paper: arxiv.org/abs/2504.01938
(1/4)
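For reference, the generalized Doob h-transform of a Markov generator $\mathcal{L}$ by a positive function $h$ takes the standard textbook form (a sketch of the usual definition, not the paper's exact statement):

```latex
\mathcal{L}^{h} f \;=\; \frac{1}{h}\Bigl[ \mathcal{L}(h f) \;-\; f\, \mathcal{L} h \Bigr],
```

and time reversal arises as a special case in which $h$ is taken to be the marginal density of the forward process.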
Sam Rodriques (@sgrodriques) 's Twitter Profile Photo

Chenghao Liu will work with FutureHouse and Nobel laureate <a href="/francesarnold/">Frances Arnold</a> at Caltech to develop closed-loop generative machine learning workflows for de novo enzyme discovery. Chenghao was co-founder of Dreamfold and known for combining physical organic chemistry
机器之心 JIQIZHIXIN (@synced_global) 's Twitter Profile Photo

Self-Evolving Curriculum for LLM Reasoning

This paper, from Mila – Quebec AI Institute and ServiceNow Research, tackles a key challenge in reinforcement learning (RL)-based fine-tuning of LLMs: how to choose which problems to train on, and in what order, for best learning and
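The problem-selection question is often framed as a multi-armed bandit over problem categories. A generic sketch of that framing (not this paper's exact algorithm; the class name and the learning signal are illustrative assumptions):

```python
import random

class BanditCurriculum:
    """Epsilon-greedy bandit over problem categories: track a moving-average
    estimate of how much learning each category yields, and mostly pick the
    current best while occasionally exploring."""

    def __init__(self, categories, eps=0.1, alpha=0.3):
        self.q = {c: 0.0 for c in categories}  # per-category value estimates
        self.eps, self.alpha = eps, alpha

    def select(self):
        # Explore with probability eps, otherwise exploit the best estimate.
        if random.random() < self.eps:
            return random.choice(list(self.q))
        return max(self.q, key=self.q.get)

    def update(self, category, learning_signal):
        # learning_signal could be, e.g., a change in pass rate after training
        # on this category; move the estimate toward it.
        self.q[category] += self.alpha * (learning_signal - self.q[category])
```

The curriculum then emerges online: categories that currently produce the largest learning signal get sampled more often, and the estimates decay as those categories are mastered.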
Chongxuan Li (@lichongxuan) 's Twitter Profile Photo

🚀 Excited to share our latest work: "Scaling Diffusion Transformers Efficiently via μP"! Diffusion Transformers are essential in visual generative models, but hyperparameter tuning for scaling remains challenging. We adapt μP, proving it also applies to diffusion Transformers!
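For intuition, the most common μP rule for matrix-like (hidden) parameters under Adam scales the learning rate inversely with width relative to a tuned base model. A simplified sketch of that heuristic (the function name is illustrative, and this is the generic μP rule rather than the paper's full parametrization):

```python
def mup_hidden_lr(base_lr: float, base_width: int, width: int) -> float:
    """muP heuristic for Adam: the hidden-layer learning rate scales like
    1/width, so a rate tuned at base_width transfers to a wider model."""
    return base_lr * base_width / width
```

In practice this is what makes hyperparameters "transfer": you tune once at a small width and reuse the scaled values at the target width.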

Zhengyang Geng (@zhengyanggeng) 's Twitter Profile Photo

Excited to share our work with my amazing collaborators, <a href="/Goodeat258/">Goodeat</a>, <a href="/SimulatedAnneal/">Xingjian Bai</a>, <a href="/zicokolter/">Zico Kolter</a>, and Kaiming.

In a word, we show an “identity learning” approach for generative modeling, by relating the instantaneous/average velocity in an identity. The resulting model,
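The relation between average and instantaneous velocity can be sketched as follows (my reconstruction of the standard relation, not a quotation of the paper). Define the average velocity over $[r, t]$ as

```latex
u(z_t, r, t) \;=\; \frac{1}{t - r} \int_r^t v(z_\tau, \tau)\, d\tau ;
```

differentiating $(t - r)\,u$ with respect to $t$ then gives the identity

```latex
v(z_t, t) \;=\; u(z_t, r, t) \;+\; (t - r)\, \frac{d}{dt}\, u(z_t, r, t),
```

which relates the two velocities without requiring an integral at training time.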
宝玉 (@dotey) 's Twitter Profile Photo

Wow, Rick Rubin's "The Timeless Art of Vibe Coding" blew my mind: it explains vibe coding through the Dao De Jing, and it was written by a Westerner no less! The piece draws an analogy between the Dao and code: the "Dao" is the nameless, "code" is the formed. The Dao De Jing opens: "The Dao that can be told is not the eternal Dao. The name that can be named is not the eternal name. The nameless is the origin of heaven and earth; the named is the mother of all things." Rubin adapts this as: "The code that

Weijie Su (@weijie444) 's Twitter Profile Photo

Happy to share that our paper "The ICML 2023 Ranking Experiment: Examining Author Self-Assessment in ML/AI Peer Review" will appear in JASA as a Discussion Paper: arxiv.org/abs/2408.13430 It's a privilege to work with such a wonderful team: Buxin, Jiayao, Natalie Collina,

Harry Zhao (@theharryzhao) 's Twitter Profile Photo

Our paper on rejecting hallucinated planning targets is now accepted at <a href="/icmlconf/">ICML Conference</a> 2025!
📜: arxiv.org/abs/2410.07096
💿: github.com/mila-iqia/delu…

"Rejecting Hallucinated State Targets during Planning"
- Authors: <a href="/TheHarryZhao/">Harry Zhao</a>, <a href="/TiSU32/">Tristan</a>, <a href="/LarocheRomain/">Romain Laroche</a>, Doina Precup, <a href="/Yoshua_Bengio/">Yoshua Bengio</a>
Tianyuan Zhang (@tianyuanzhang99) 's Twitter Profile Photo

Bored of linear recurrent memories (e.g., linear attention) and want a scalable, nonlinear alternative? Our new paper "Test-Time Training Done Right" proposes LaCT (Large Chunk Test-Time Training), a highly efficient, massively scalable nonlinear memory with: 💡 Pure PyTorch
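For intuition, test-time training treats the memory as fast weights updated by gradient descent on each incoming chunk. A conceptual numpy sketch of one large-chunk update on a *linear* memory (the paper's memory is nonlinear and its optimizer differs; the function name and loss are illustrative assumptions):

```python
import numpy as np

def large_chunk_ttt_step(W, K, V, lr=0.1):
    """One chunk-level test-time-training update.

    W: (d_k, d_v) fast-weight memory, K: (chunk, d_k) keys,
    V: (chunk, d_v) values. Take a single gradient step on the chunk's
    mean squared reconstruction error, then read the updated memory
    back out for the whole chunk.
    """
    grad = K.T @ (K @ W - V) / len(K)  # d/dW of 0.5 * mean squared error
    W = W - lr * grad                  # update on the entire chunk at once
    return W, K @ W                    # new memory, chunk readout
```

Updating on a large chunk at once, rather than token by token, turns the update into dense matmuls, which is what makes the approach hardware-friendly.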

Jiatao Gu (@thoma_gu) 's Twitter Profile Photo

I will be attending #CVPR2025 and presenting our latest research at Apple MLR! Specifically, I will present our highlight poster, world-consistent video diffusion (cvpr.thecvf.com/virtual/2025/p…), and give three invited workshop talks, including our recent preprint ★STARFlow★! (0/n)

Yichen Li (@antheayli) 's Twitter Profile Photo

How can we equip robots with superhuman sensory capabilities? Come join us at the RSS 2025 workshop on Multimodal Robotics with Multisensory Capabilities, June 21, to learn more. Featuring speakers: Jitendra Malik, Katherine J. Kuchenbecker, Kristen Grauman, Yunzhu Li, Boyi Li