Jordan Dotzel (@amishacorns)'s Twitter Profile
Jordan Dotzel

@amishacorns

AGI is hard;
Cornell ECE PhD '24; CS BA '17,
LLM efficiency at Google;
Ex?-Halo Pro;
Eternal Halo Combine Kid

ID: 242528918

Link: https://www.jordandotzel.com/ · Joined: 25-01-2011 00:23:57

239 Tweets

1.1K Followers

425 Following

Mohamed Abdelfattah (@mohsaied)'s Twitter Profile Photo

Find me or my students (Jordan Dotzel and @akhauriyash123) at ICML Conference next week to discuss our work on LLM quantization and NAS. DM me to get coffee or to hang out and talk about ML efficiency.

Papers:
- arxiv.org/abs/2405.03103
- arxiv.org/abs/2403.02484

Yash Akhauri (@yashakha)'s Twitter Profile Photo

Excited to see Jason Weston highlighting the importance of contextual behavior in transformers! In our #EMNLP2024 paper 'ShadowLLM', we show that a tiny neural network can contextually predict which heads and neurons to prune.

arxiv.org/abs/2406.16635
(1/4)
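The idea in the tweet can be sketched in a few lines: a tiny predictor looks at a summary of the current input and decides which attention heads to keep. This is a minimal, invented illustration of contextual pruning, not ShadowLLM's actual architecture; the predictor, sizes, and scoring rule here are all stand-ins.

```python
import numpy as np

# Illustrative sketch of contextual pruning: a tiny network scores each
# attention head for the current context, and only the top-scoring heads
# run for this input. All weights and sizes below are made up.

rng = np.random.default_rng(0)

n_heads = 12          # heads in one transformer layer
d_ctx = 16            # dimensionality of the context summary

# "Tiny neural network": one linear layer + sigmoid over heads.
W = rng.normal(size=(n_heads, d_ctx)) * 0.1
b = np.zeros(n_heads)

def predict_head_scores(ctx):
    """Score each head's predicted usefulness for this context."""
    return 1.0 / (1.0 + np.exp(-(W @ ctx + b)))

def prune_mask(ctx, keep_ratio=0.5):
    """Keep only the top-scoring fraction of heads for this input."""
    scores = predict_head_scores(ctx)
    k = max(1, int(round(keep_ratio * n_heads)))
    keep = np.argsort(scores)[-k:]            # indices of heads to keep
    mask = np.zeros(n_heads, dtype=bool)
    mask[keep] = True
    return mask

ctx = rng.normal(size=d_ctx)                  # stand-in context summary
mask = prune_mask(ctx, keep_ratio=0.5)
print(mask.sum())  # 6 of 12 heads kept for this context
```

Because the mask depends on `ctx`, different inputs prune different heads, which is the "contextual" part the tweet emphasizes.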
Mohamed Abdelfattah (@mohsaied)'s Twitter Profile Photo

Thanks to Dr. Masahiro Tanaka from Microsoft DeepSpeed for an impressive guest lecture in my Deep Learning Efficiency course. youtube.com/watch?v=9FogHL…

Yash Akhauri (@yashakha)'s Twitter Profile Photo

Why Many Token When Few Do Trick?

Meet Attamba – Attamba replaces Key-Value projections with SSMs, unlocking multi-token compression before attention, improving perplexity by 24% over transformers of similar memory footprint!

Paper: arxiv.org/abs/2411.17685

🧵
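The compression step the tweet describes can be illustrated with a toy recurrence: a simple state-space-style scan folds each chunk of tokens into one state, so attention sees far fewer key/value entries than tokens. This is a hedged sketch of the general idea, not Attamba's actual design; the diagonal parameters `A`, `B` and the chunk size are invented.

```python
import numpy as np

# Toy multi-token KV compression: instead of one key/value per token, a
# diagonal SSM-style recurrence scans each chunk of tokens into a single
# state, so attention operates over seq_len / chunk compressed entries.

rng = np.random.default_rng(0)

seq_len, d, chunk = 8, 4, 4
x = rng.normal(size=(seq_len, d))   # token representations

# Diagonal SSM parameters (illustrative, not learned).
A = np.full(d, 0.9)                 # state decay
B = np.full(d, 0.5)                 # input scale

def ssm_compress(tokens):
    """Scan a chunk of tokens into one state: s_t = A*s_{t-1} + B*x_t."""
    s = np.zeros(d)
    for t in tokens:
        s = A * s + B * t
    return s

# Compress each chunk into a single KV entry before attention.
kv = np.stack([ssm_compress(x[i:i + chunk])
               for i in range(0, seq_len, chunk)])
print(kv.shape)  # (2, 4): 8 tokens compressed to 2 KV states
```

The memory saving is the point: the KV cache here holds `seq_len / chunk` states instead of `seq_len`, which is how a model with a similar footprint can afford other capacity.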
TuringPost (@theturingpost)'s Twitter Profile Photo

What if a small model could solve most of a task and call on a bigger, more powerful one only when it hits a really hard part?

That's exactly the idea behind SplitReason, a new method from Cornell University researchers.

To make SplitReason efficient on hardware, they also introduced a new
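The control flow described above can be sketched as a simple escalation loop: a small model attempts each step and hands off to a large model only when it is not confident. This is an illustrative stand-in, not SplitReason's actual mechanism; the stub models, confidence scores, and threshold are all invented for the sketch.

```python
# Toy escalation loop: a cheap model handles each reasoning step and
# offloads to an expensive model only when its confidence is low.

def small_model(step):
    """Pretend small model: returns (answer, confidence)."""
    hard = "integral" in step            # stand-in for a hard sub-step
    return (f"small-answer({step})", 0.3 if hard else 0.9)

def large_model(step):
    """Pretend large, expensive model: assumed reliable."""
    return f"large-answer({step})"

def solve(steps, threshold=0.5):
    answers, escalations = [], 0
    for step in steps:
        ans, conf = small_model(step)
        if conf < threshold:             # hard part: hand off upward
            ans = large_model(step)
            escalations += 1
        answers.append(ans)
    return answers, escalations

steps = ["parse problem", "evaluate integral", "check units"]
answers, escalations = solve(steps)
print(escalations)  # 1: only the hard step went to the large model
```

Most steps stay on the cheap model, so average cost approaches the small model's while accuracy on hard parts tracks the large one.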
Google DeepMind (@googledeepmind)'s Twitter Profile Photo

Video, meet audio. 🎥🤝🔊 With Veo 3, our new state-of-the-art generative video model, you can add soundtracks to clips you make. Create talking characters, include sound effects, and more while developing videos in a range of cinematic styles. 🧵

Taelin (@victortaelin)'s Twitter Profile Photo

Not consulting AI models in 2025 is medical malpractice. No other way to put it. I follow the field closely to know how to use it, and that saved my life. Not every patient does. People are dying today from misdiagnoses that o3 would get right 10 out of 10 times.