Chenhui Zhang (@danielz2333)'s Twitter Profile
Chenhui Zhang

@danielz2333

Learning about our planet Earth from a Bird's-eye view @mitidss | Prev. @IllinoisCS @IllinoisStat 23' | Remote Sensing & Climate Change | Occasionally LLM

ID: 1328152428

Link: https://danielz.ch/ | Date: 05-04-2013 01:41:45

4,4K Tweet

433 Followers

4,4K Following

Simone Scardapane (@s_scardapane) 's Twitter Profile Photo

*Model Merging in Pre-training of LLMs* by Yunshui Li et al. They investigate model merging in the pre-training phase (ie, averaging multiple checkpoints), showing performance comparable to learning rate annealing. arxiv.org/abs/2505.12082
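The checkpoint averaging mentioned in the post above can be sketched as follows. This is a minimal illustration only: it assumes a uniform average over checkpoints stored as plain dictionaries of float lists, and the function name `average_checkpoints` and the toy values are hypothetical. The paper may use a different weighting or selection scheme.

```python
def average_checkpoints(checkpoints):
    """Uniformly average parameter dicts that share the same keys and lengths.

    Assumption: each checkpoint is a dict mapping a parameter name to a
    flat list of floats. Real checkpoints would hold tensors instead.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    merged = {}
    for key in checkpoints[0]:
        # Group the i-th value of this parameter across all checkpoints.
        columns = zip(*(ckpt[key] for ckpt in checkpoints))
        merged[key] = [sum(col) / len(checkpoints) for col in columns]
    return merged


# Toy checkpoints (hypothetical values) taken at three training steps.
ckpts = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
    {"w": [5.0, 6.0], "b": [2.0]},
]
merged = average_checkpoints(ckpts)
print(merged)
```

Each merged parameter is the element-wise mean of the corresponding parameters in the input checkpoints.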

TTIC (@ttic_connect) 's Twitter Profile Photo

TTIC has named Professor Avrim Blum, current Chief Academic Officer and established leader in theoretical #CS and #ML, as Interim President, effective September 2025. Read the full announcement here: buff.ly/DvyV2GD

Pushmeet Kohli (@pushmeet) 's Twitter Profile Photo

Imagine trying to listen for a whisper in the middle of a rock concert. This is similar to what the LIGO Gravitational wave observatory has to do every day. Today in Science Magazine, our team Google DeepMind shows how AI can help & give astronomers a deeper view of universe.

Justin Johnson (@jcjohnss) 's Twitter Profile Photo

10 years ago, deep learning was in its infancy. PyTorch didn't exist. Language models were recurrent, and not large. But it felt important: a new technology that would change everything. That's why Fei-Fei Li, Andrej Karpathy, and I started CS231N Staff back in 2015 - to teach the world's

Pushmeet Kohli (@pushmeet) 's Twitter Profile Photo

Excited to share an important advance in AI and math. Together with mathematicians from Brown University, New York University and Stanford University, we developed a new AI-powered method that has discovered an entirely new family of solutions to several complex equations in fluid dynamics.

Simran Arora (@simran_s_arora) 's Twitter Profile Photo

Very excited to share that I've finished my phd @stanford and will be joining @caltech’s cms department as an assistant professor. Looking forward to working with students and colleagues on ml systems! Grateful to my amazing advisor and labmates @hazyresearch for the best time

Surya Ganguli (@suryaganguli) 's Twitter Profile Photo

Teaching a new course Stanford University this quarter on explainable AI, motivated by neuroscience. I have curated a paper list 4 pages long (link in comment). What are your favorite papers on explainable AI/mechanistic interpretability that I am missing? Please comment or DM. thanks!

Stanford HAI (@stanfordhai) 's Twitter Profile Photo

“When only a few have the resources to build and benefit from AI, we leave the rest of the world waiting at the door,” said Stanford HAI Senior Fellow Yejin Choi during her address to the United Nations Security Council. Read her full speech here: hai.stanford.edu/policy/yejin-c…

Yilun Du (@du_yilun) 's Twitter Profile Photo

Excited to share Equilibrium Matching (EqM)! EqM simplifies and outperforms flow matching, enabling strong generative performance of FID 1.96 on ImageNet 256x256. EqM learns a single static EBM landscape for generation, enabling a simple gradient-based generation procedure.

Raluca Ada Popa (@ralucaadapopa) 's Twitter Profile Photo

I am proud to share the announcement about our CodeMender project at Google DeepMind, an agent that can automatically fix a range of code security vulnerabilities. From only a modest-compute run, our agent submitted 72 high-quality fixes to vulnerable code in popular codebases,

Pushmeet Kohli (@pushmeet) 's Twitter Profile Photo

Following up on the AlphaEvolve code opt. agent, I am happy to share how our team at Google DeepMind has developed the CodeMender agent to design/apply patches to fix security vulnerabilities in large scale open source projects. #AI4code Read more at: deepmind.google/discover/blog/…

Kevin Patrick Murphy (@sirbayes) 's Twitter Profile Photo

I am pleased to announce our new paper, which provides an extremely sample-efficient way to create an agent that can perform well in multi-agent, partially-observed, symbolic environments. The key idea is to use LLM-powered code synthesis to learn a code world model (in the form

Demis Hassabis (@demishassabis) 's Twitter Profile Photo

We processed over 1.3 Quadrillion tokens last month - that's 1,300,000,000,000,000 tokens! or to put it another way that's 500M tokens a second or 1.8 Trillion tokens an hour... 🤯
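The throughput figures in the post above can be checked with a quick calculation. This sketch assumes a 30-day month, which the post does not state; the variable names are illustrative.

```python
TOKENS_PER_MONTH = 1.3e15  # 1.3 quadrillion tokens, as quoted in the post

# Assumption: a 30-day month (the post does not say which month length it used).
SECONDS_PER_MONTH = 30 * 24 * 60 * 60

tokens_per_second = TOKENS_PER_MONTH / SECONDS_PER_MONTH
tokens_per_hour = tokens_per_second * 3600

print(f"{tokens_per_second:.3e} tokens per second")
print(f"{tokens_per_hour:.3e} tokens per hour")
```

Under that assumption the results land close to the quoted 500M tokens a second and 1.8 trillion tokens an hour.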
