Michael Zhang (@mzhangio) Twitter Tweets • TwiCopy

Michael Zhang

@mzhangio

CS PhD Student @hazyresearch, @StanfordAILab. Foundations of foundation models. Making them more reliable and efficient. Also do new things.

ID: 701189916387049472

Link: http://michaelzhang.xyz · Joined: 20-02-2016 23:41:18

268 Tweets

1.1K Followers

503 Following

new thoughts from the advisor, reflecting on building foundation models for X*
- we got bitter-lesson / llm-pilled in our own way
- many greats like math + rigor, but sometimes stupid just works better
- this “clarity” might challenge how we should think about LLMs
- our lab

Likes: 20 · Replies: 0 · Retweets: 0

Michael Zhang

@mzhangio

7 months ago

yay more self-improving systems
- use LLM to write kernels, make test-time compute cheaper
- put those kernels back into the LLMs, so they can do more test-time compute + come up w even better kernels
- repeat
- ???

Likes: 29 · Replies: 2 · Retweets: 3

Neel Guha

@neelguha

7 months ago

What's (1) a "drink of fresh fruit pureed with milk, yogurt, or ice cream" and (2) an unsupervised algorithm for test-time LLM routing? Our #NeurIPS2024 paper, Smoothie! 🥤 arxiv.org/abs/2412.04692 1/9

Likes: 70 · Replies: 2 · Retweets: 23

Michael Zhang

@mzhangio

7 months ago

new AI-made music video (2024 wrapped OST) happy holidays from my (LLM-focused) GenAI family to yours ^_^

Likes: 17 · Replies: 0 · Retweets: 2

Piero Molino

@w4nderlus7

6 months ago

Today, I’m excited to unveil a project that’s incredibly close to my heart. As a lifelong gamer, I’ve always dreamed of pushing the boundaries of what’s possible in video games using my expertise in AI. Over the past year, my team at Studio Atelico and I have been blending the

Likes: 138 · Replies: 11 · Retweets: 26

Michael Zhang

@mzhangio

5 months ago

good read! Jared Dunnmon
yes china used llama for military*
but when the nation-states start post-training their own LLMs (for military or otherwise; not everyone wants to share their prompt data), seems like it'd be nice if they built on US AI infra?
*reuters.com/technology/art…

Likes: 1 · Replies: 0 · Retweets: 0

Benjamin F Spector

@bfspector

4 months ago

(1/7) Inspired by DeepSeek's FlashMLA, we're releasing ThunderMLA—a fused megakernel optimized for variable-prompt decoding! ⚡️🐱ThunderMLA is up to 35% faster than FlashMLA and just 400 LoC. Blog: bit.ly/4kubAAK With Aaryan Singhal, Dan Fu, and @hazyresearch!

Likes: 370 · Replies: 7 · Retweets: 70

Michael Zhang

@mzhangio

4 months ago

new thoughts from the adviser!
Don't agree w all the aesthetic delivery, but I do believe better to have world's AI be on familiar tech.
I do wonder how US AI wins on consumer, and how we should probs do stuff here. China 🇨🇳 cares less about privacy + has the super-apps (++data)

Likes: 8 · Replies: 0 · Retweets: 0