black_samorez (@black_samorez)'s Twitter Profile
black_samorez

@black_samorez

ML PhD @ISTAustria and @EllisForEurope

ID: 1304537032367239169

Joined: 11-09-2020 21:48:57

24 Tweets

111 Followers

103 Following

Dan Alistarh (@dalistarh):

Announcing AQLM v1.1! Featuring:
1. New model collection with SOTA accuracy huggingface.co/collections/IS…
2. Gemma-2B support, running within 1.5GB;
3. LoRA integration for training Mixtral-8x7B on Colab;
4. Faster generation (3x) via CUDA graphs.
Check it out: github.com/Vahe1994/AQLM
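As a rough sketch (not from the tweet): loading one of these AQLM checkpoints through transformers looks roughly like the following; the repo id is an assumed example from the linked collection, and the aqlm package is required.

```python
# Sketch: loading an AQLM-quantized model via transformers.
# The repo id is an assumed example from the ISTA-DASLab collection linked above.
# Requires: pip install aqlm[gpu] transformers accelerate
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ISTA-DASLab/gemma-2b-AQLM-2Bit-1x16-hf"  # assumed example repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # AQLM kernels dequantize the 2-bit codes on the fly
    device_map="auto",    # the small footprint lets the model fit on one GPU
)

inputs = tokenizer("AQLM compresses LLMs by", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```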

black_samorez (@black_samorez):

Tomorrow I will be presenting AQLM at the #ICML2024 13:30-15:00 poster session, stand 608.
If you're interested in how we compressed LLMs down to 2 bits per weight and how you can run Llama-3-70B on an RTX 4090 with #vLLM, pay us a visit!
Conference link: icml.cc/virtual/2024/p…
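For illustration only (not part of the tweet): serving an AQLM 2-bit Llama-3-70B checkpoint with vLLM would look roughly like this; the repo id is a placeholder assumption, not a confirmed checkpoint name.

```python
# Sketch: running a 2-bit AQLM Llama-3-70B with vLLM on a single 24 GB GPU.
from vllm import LLM, SamplingParams

# At ~2 bits/weight, 70e9 weights take roughly 70e9 * 2 / 8 bytes ≈ 17.5 GB,
# which is why the model can fit on one RTX 4090.
llm = LLM(model="ISTA-DASLab/Meta-Llama-3-70B-AQLM-2Bit-1x16")  # assumed repo id

outputs = llm.generate(["The capital of Austria is"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```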
Dan Alistarh (@dalistarh):

Introducing Panza V2, our personalized LLM writing assistant, running entirely on-device! Now faster and easier to use: 
* Local serving via GMail extension
* Cloud training via Lightning AI ⚡️ Studio
* More models, including AI at Meta Llama-3.2
* Inference w/ ollama!
Details:
harsha (@sree_harsha_n):

Excited to host black_samorez, PhD student @ IST, who will present 'Pushing the limits of LLM quantization via the linearity theorem' on Jan 10 @ 1800 CET at Cohere For AI. Really cool results; looking forward to the talk. Join the community: tinyurl.com/C4AICommunityA…
harsha (@sree_harsha_n):

We will have black_samorez presenting his work on low-bit pre-training at Cohere For AI next week (stable training at 1-bit weights + activations) -- continuing our theme of low-bit training. Looking forward :)

To join in, fill the form at: tinyurl.com/C4AICommunityA…
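The talk covers the authors' own recipe; purely as a generic illustration of how 1-bit weights are usually trained stably, here is a straight-through-estimator sketch that keeps full-precision master weights for the optimizer update (all names below are made up, not from the talk).

```python
# Generic illustration (NOT the method from the talk): binarized weights
# trained with a straight-through estimator (STE) over FP32 master weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x):
        scale = self.weight.abs().mean()            # preserve average magnitude
        w_bin = torch.sign(self.weight) * scale     # 1-bit weights in {-s, +s}
        # STE: forward uses w_bin, backward passes gradients straight to self.weight.
        w = self.weight + (w_bin - self.weight).detach()
        return F.linear(x, w)

layer = BinaryLinear(64, 32)
opt = torch.optim.Adam(layer.parameters(), lr=1e-3)
x, target = torch.randn(8, 64), torch.randn(8, 32)
loss = F.mse_loss(layer(x), target)
loss.backward()   # updates flow into the full-precision master weights
opt.step()
```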
black_samorez (@black_samorez):

We'll be presenting this on April 27th in Singapore. For now, you can check out this recording of the Cohere For AI efficiency seminar on this topic: youtube.com/watch?v=e3ClKT…

Dan Alistarh (@dalistarh):

We are introducing Quartet, a fully FP4-native training method for Large Language Models, achieving optimal accuracy-efficiency trade-offs on NVIDIA Blackwell GPUs! Quartet can be used to train billion-scale models in FP4 faster than FP8 or FP16, at matching accuracy.
[1/4]
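The Quartet recipe itself is in the paper; purely to illustrate the FP4 (e2m1) format it targets, here is a sketch of fake-quantizing a tensor to the e2m1 grid with per-group scales (function name and group size are assumptions, not Quartet's).

```python
# Illustration of the FP4 e2m1 number format (NOT the Quartet algorithm):
# fake-quantize a tensor to the e2m1 grid with one scale per group of values.
import torch

# Representable e2m1 magnitudes; the sign bit is handled separately.
E2M1_GRID = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def fake_quantize_fp4(x: torch.Tensor, group_size: int = 32) -> torch.Tensor:
    g = x.reshape(-1, group_size)
    scale = g.abs().amax(dim=1, keepdim=True) / E2M1_GRID.max()  # map max |x| to 6.0
    scaled = g / scale.clamp(min=1e-12)
    # Round each magnitude to the nearest grid point, reattach the sign.
    idx = (scaled.abs().unsqueeze(-1) - E2M1_GRID).abs().argmin(dim=-1)
    q = torch.sign(scaled) * E2M1_GRID[idx]
    return (q * scale).reshape(x.shape)

x = torch.randn(4, 64)
print((x - fake_quantize_fp4(x)).abs().mean())  # average quantization error
```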
harsha (@sree_harsha_n):

We're excited to welcome back one of our most frequent speakers, @black_samorez, to the ml-efficiency group and @cohere_labs! Join us on July 2 at 9 AM PST to hear about Quartet: Native FP4 Training of LLMs.
black_samorez (@black_samorez):

ChatGPT agent mode apparently has enough RAM to load and run a full-fledged 8B LLM in the browser. Kudos to Vladimir Malinovskii (@galqiwi) for implementing AQLM 2-bit quantization in WebAssembly at galqiwi.github.io/aqlm-rs
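Back-of-the-envelope (my arithmetic, not from the tweet): at ~2 bits per weight, an 8B-parameter model's weights take only about 2 GB, which is what makes an in-browser run plausible.

```python
# Rough memory footprint of an 8B-parameter model at 2 bits per weight.
params = 8e9
bits_per_weight = 2
weight_bytes = params * bits_per_weight / 8
print(f"{weight_bytes / 2**30:.2f} GiB")  # ≈ 1.86 GiB of weights
```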
Egor Zverev @ICLR 2025 (@egor_zverev_ai):

🎉 Excited to announce the Workshop on Foundations of LLM Security at #EurIPS2025!
🇩🇰 Dec 6–7, Copenhagen!
📢 Call for contributed talks is now open! See details at llmsec-eurips.github.io

<a href="/KathrinGrosse/">Kathrin Grosse</a> <a href="/iliaishacked/">Ilia Shumailov🦔</a> <a href="/verena_rieser/">Verena Rieser</a> <a href="/sahar/">sahar selim taher</a>
@thegrue <a href="/mariojfritz/">Mario Fritz</a>  <a href="/EurIPSConf/">EurIPS Conference</a>