Yushun Zhang (@ericzhang0410)'s Twitter Profile
Yushun Zhang

@ericzhang0410

PhD student at The Chinese University of Hong Kong, Shenzhen, China.

Working on optimization and LLMs. zyushun.github.io

ID: 1239780017040580610

17-03-2020 05:06:12

326 Tweets

279 Followers

357 Following

Kyunghyun Cho (@kchonyc)

on my way back to NYC, i met wise Leon Bottou in the airport. we talked. then i told him "you should tweet that!"

and, he delivered much more than a tweet: a blog post with thoughts and insights on AI research only he can deliver this clearly and succinctly.
arXiv math.OC Optimization and Control (@mathocb)

Henry Shugart, Jason M. Altschuler: Negative Stepsizes Make Gradient-Descent-Ascent Converge arxiv.org/abs/2505.01423 arxiv.org/pdf/2505.01423 arxiv.org/html/2505.01423

Yushun Zhang (@ericzhang0410)

Check out this excellent work led by Dmitry Rybin! We discovered a new algorithm to compute the matrix product XX^t with 5% fewer multiplications

Yacine Mahdid (@yacinelearning)

man, scientists working on optimizing matrix multiplications have oppenheimer level of aura

- use a RL agent to spit out heckload of bilinear products
- slap two MILP to combine and filter those
- iterate on top of a Large Neighborhood Search flow until it’s fast fast

what the
Yushun Zhang (@ericzhang0410)

Dear Professors who are running the ICML board election, single column please, and we will definitely vote for you 😀 #icml2025 #ICML

Kimi.ai (@kimi_moonshot)

🚀 Hello, Kimi K2!  Open-Source Agentic Model!
🔹 1T total / 32B active MoE model
🔹 SOTA on SWE Bench Verified, Tau2 & AceBench among open models
🔹 Strong in coding and agentic tasks
🐤 Multimodal & thought-mode not supported for now

With Kimi K2, advanced agentic intelligence
Yuchen Jin (@yuchenj_uw)

Holy shit.

Kimi K2 was pre-trained on 15.5T tokens using MuonClip with zero training spike.

Muon has officially scaled to the 1-trillion-parameter LLM level. Many doubted it could scale, but here we are.

So proud of the Muon team: Keller Jordan, Vlado Boza, You Jiacheng,