Jordan Dotzel (@amishacorns)'s Twitter Profile
Jordan Dotzel

@amishacorns

AGI is hard;
Cornell ECE PhD '24; CS BA '17,
LLM efficiency at Google;
Ex?-Halo Pro;
Eternal Halo Combine Kid

ID: 242528918

Link: https://www.jordandotzel.com/ · Joined: 25-01-2011 00:23:57

239 Tweets

1.1K Followers

425 Following

Mohamed Abdelfattah (@mohsaied)'s Twitter Profile Photo

Find me or my students (Jordan Dotzel and @akhauriyash123) at ICML Conference next week to discuss our work on LLM quantization and NAS. DM me to get coffee or to hang out and talk about ML efficiency.

Papers:
- arxiv.org/abs/2405.03103
- arxiv.org/abs/2403.02484

Yash Akhauri (@yashakha)'s Twitter Profile Photo

Excited to see Jason Weston highlighting the importance of contextual behavior in transformers! In our #EMNLP2024 paper 'ShadowLLM', we show that a tiny neural network can contextually predict which heads and neurons to prune.

arxiv.org/abs/2406.16635
(1/4)
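The idea in the tweet can be sketched in a few lines: a tiny predictor looks at a summary of the current input and decides which attention heads to keep. This is a minimal, invented illustration of contextual pruning, not ShadowLLM's actual architecture; the predictor, sizes, and scoring rule here are all stand-ins.

```python
import numpy as np

# Illustrative sketch of contextual pruning: a tiny network scores each
# attention head for the current context, and only the top-scoring heads
# run for this input. All weights and sizes below are made up.

rng = np.random.default_rng(0)

n_heads = 12          # heads in one transformer layer
d_ctx = 16            # dimensionality of the context summary

# "Tiny neural network": one linear layer + sigmoid over heads.
W = rng.normal(size=(n_heads, d_ctx)) * 0.1
b = np.zeros(n_heads)

def predict_head_scores(ctx):
    """Score each head's predicted usefulness for this context."""
    return 1.0 / (1.0 + np.exp(-(W @ ctx + b)))

def prune_mask(ctx, keep_ratio=0.5):
    """Keep only the top-scoring fraction of heads for this input."""
    scores = predict_head_scores(ctx)
    k = max(1, int(round(keep_ratio * n_heads)))
    keep = np.argsort(scores)[-k:]            # indices of heads to keep
    mask = np.zeros(n_heads, dtype=bool)
    mask[keep] = True
    return mask

ctx = rng.normal(size=d_ctx)                  # stand-in context summary
mask = prune_mask(ctx, keep_ratio=0.5)
print(mask.sum())  # 6 of 12 heads kept for this context
```

Because the mask depends on `ctx`, different inputs prune different heads, which is the "contextual" part the tweet emphasizes.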
Mohamed Abdelfattah (@mohsaied)'s Twitter Profile Photo

Thanks to Dr. Masahiro Tanaka from Microsoft DeepSpeed for an impressive guest lecture in my Deep Learning Efficiency course. youtube.com/watch?v=9FogHL…

Yash Akhauri (@yashakha)'s Twitter Profile Photo

Why Many Token When Few Do Trick?

Meet Attamba – Attamba replaces Key-Value projections with SSMs, unlocking multi-token compression before attention, improving perplexity by 24% over transformers of similar memory footprint!

Paper: arxiv.org/abs/2411.17685

🧵
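The compression step the tweet describes can be illustrated with a toy recurrence: a simple state-space-style scan folds each chunk of tokens into one state, so attention sees far fewer key/value entries than tokens. This is a hedged sketch of the general idea, not Attamba's actual design; the diagonal parameters `A`, `B` and the chunk size are invented.

```python
import numpy as np

# Toy multi-token KV compression: instead of one key/value per token, a
# diagonal SSM-style recurrence scans each chunk of tokens into a single
# state, so attention operates over seq_len / chunk compressed entries.

rng = np.random.default_rng(0)

seq_len, d, chunk = 8, 4, 4
x = rng.normal(size=(seq_len, d))   # token representations

# Diagonal SSM parameters (illustrative, not learned).
A = np.full(d, 0.9)                 # state decay
B = np.full(d, 0.5)                 # input scale

def ssm_compress(tokens):
    """Scan a chunk of tokens into one state: s_t = A*s_{t-1} + B*x_t."""
    s = np.zeros(d)
    for t in tokens:
        s = A * s + B * t
    return s

# Compress each chunk into a single KV entry before attention.
kv = np.stack([ssm_compress(x[i:i + chunk])
               for i in range(0, seq_len, chunk)])
print(kv.shape)  # (2, 4): 8 tokens compressed to 2 KV states
```

The memory saving is the point: the KV cache here holds `seq_len / chunk` states instead of `seq_len`, which is how a model with a similar footprint can afford other capacity.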
TuringPost (@theturingpost)'s Twitter Profile Photo

What if a small model could solve most of a task and call on a bigger, more powerful one only when it hits a really hard part?

That's exactly the idea behind SplitReason, a new method from Cornell University researchers.

To make SplitReason efficient on hardware, they also introduced a new
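The control flow described above can be sketched as a simple escalation loop: a small model attempts each step and hands off to a large model only when it is not confident. This is an illustrative stand-in, not SplitReason's actual mechanism; the stub models, confidence scores, and threshold are all invented for the sketch.

```python
# Toy escalation loop: a cheap model handles each reasoning step and
# offloads to an expensive model only when its confidence is low.

def small_model(step):
    """Pretend small model: returns (answer, confidence)."""
    hard = "integral" in step            # stand-in for a hard sub-step
    return (f"small-answer({step})", 0.3 if hard else 0.9)

def large_model(step):
    """Pretend large, expensive model: assumed reliable."""
    return f"large-answer({step})"

def solve(steps, threshold=0.5):
    answers, escalations = [], 0
    for step in steps:
        ans, conf = small_model(step)
        if conf < threshold:             # hard part: hand off upward
            ans = large_model(step)
            escalations += 1
        answers.append(ans)
    return answers, escalations

steps = ["parse problem", "evaluate integral", "check units"]
answers, escalations = solve(steps)
print(escalations)  # 1: only the hard step went to the large model
```

Most steps stay on the cheap model, so average cost approaches the small model's while accuracy on hard parts tracks the large one.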
Google DeepMind (@googledeepmind)'s Twitter Profile Photo

Video, meet audio. 🎥🤝🔊 With Veo 3, our new state-of-the-art generative video model, you can add soundtracks to clips you make. Create talking characters, include sound effects, and more while developing videos in a range of cinematic styles. 🧵

Taelin (@victortaelin)'s Twitter Profile Photo

Not consulting AI models in 2025 is medical malpractice. No other way to put it. I follow the field closely to know how to use it, and that saved my life. Not every patient does. People are dying today from misdiagnoses that o3 would get right 10 out of 10 times.