
Michael Goin
@mgoin_
Engineering Lead @neuralmagic @redhat | Committer @vllm_project | Compressing LLMs and making fast software
ID: 1516059175910092803
https://github.com/mgoin 18-04-2022 14:21:01
576 Tweets
733 Followers
278 Following










The recording of Erwan Gallen's and my PyTorch Day France 2025 and GOSIM Foundation talk, "Scaling LLM Inference with vLLM," is now available on PyTorch’s YouTube channel. youtube.com/watch?v=XYh6Xf…




Happy 4th of July! Speed is the moat, and Anush Elangovan & his team keep running faster & faster. There are still lots of areas where ROCm has gaps, but many are already closing.

We genuinely want to solve this problem. As many (Tan TJian, samsja, Daniel Han, Eldar Kurtić, and more!) chimed in, the reasons include attention kernels, matmul reduction order, precision in various operators, and more!




