Eldar Kurtic (@_eldarkurtic)'s Twitter Profile
Eldar Kurtic

@_eldarkurtic

Efficient inference @RedHat_AI & @ISTAustria

ID: 1017492704970801152

Joined: 12-07-2018 19:35:39

207 Tweets

614 Followers

590 Following

Eldar Kurtic (@_eldarkurtic)

Today at 15:00 CEST, I’ll give a talk at OpenSource@Siemens on efficient inference with LLMs. 📺 The talk will be live-streamed at opensource.siemens.com, followed by a live Q&A. Feel free to tune in and bring your questions! It’s a tutorial-style session covering the basics

Eldar Kurtic (@_eldarkurtic)

Want to quickly get a feeling for how fast an LLM runs under different workloads (and in different engines)? Look no further, Charles 🎉 Frye and Modal built a really cool app for it. Pro tip: don't skip the "Executive Summary" and "How to Benchmark", well worth the read!

Andrej Jovanović (@itsmaddox_j)

Join me to hear about decentralised training, why it works and what opportunities it can unlock 🚀. Many thanks to harsha for the invitation!

Eldar Kurtic (@_eldarkurtic)

The recording of Erwan Gallen's and my PyTorch Day France 2025 and GOSIM Foundation talk, "Scaling LLM Inference with vLLM," is now available on PyTorch’s YouTube channel. youtube.com/watch?v=XYh6Xf…

Eldar Kurtic (@_eldarkurtic)

Want to learn more about GuideLLM, the tool used by Charles 🎉 Frye and Modal's LLM Engine Advisor to easily benchmark an LLM inference stack? Join the next vLLM office hours with Saša, Michael Goin, Jenny Yi, and Mark Kurtz. More details in the thread below 👇

Eldar Kurtic (@_eldarkurtic)

The Hugging Face folks deserve far more credit for being a pillar of open source and still managing to push out SOTA results across the board, along with a full write-up of the model’s entire lifecycle.

Eldar Kurtic (@_eldarkurtic)

FP4 models and inference kernels ready for Blackwell GPUs! GPTQ and Hadamard for accuracy, and fused Hadamard for runtime. Check out more details about our work in the thread below 👇

Red Hat AI (@redhat_ai)

vLLM office hours return next week! Alongside project updates from Michael Goin, vLLM committers and HPC experts Robert Shaw + Tyler Michael Smith will share how to scale MoE models with llm-d and lessons from real-world multi-node deployments. Register: red.ht/office-hours
