
brian stevens
@addvin
CEO, Neural Magic. Ex VP, CTO of Google Cloud and EVP, CTO of Red Hat, RPI and UNH alum, marathoner, ironman, ADK MT 46er.
ID: 18744691
07-01-2009 23:43:02
738 Tweets
4.4K Followers
159 Following

"Llama 3.1 series are uniquely challenging due to long context and large size. We want to thank Red Hat AI (formerly Neural Magic) for their continual stewardship of the quantization code path in vLLM, Anyscale for their high quality implementation of chunked prefill and speculative decoding,

At Red Hat, we believe the future of AI is open. That's why I'm incredibly excited about our acquisition of Red Hat AI (formerly Neural Magic). Together, we're furthering our commitment to our customers and the open source community to deliver on the future of AI, and that starts today.

Really excited to see the emergence of llm-d, brian stevens! Inference is the biggest workload in human history, and the open source tools need to keep evolving to serve it.
