Georgi Gerganov (@ggerganov)'s Twitter Profile
Georgi Gerganov

@ggerganov

24th at the Electrica puzzle challenge | github.com/ggml-org

ID: 3300401027

Link: https://github.com/ggerganov | Date: 27-05-2015 12:56:54

1,1K Tweet

50,50K Followers

279 Following

Qwen (@alibaba_qwen)

We will release the quantized models of Qwen3 to you in the following days. Today we release the AWQ and GGUFs of Qwen3-14B and Qwen3-32B, which enables using the models with limited GPU memory.

Qwen3-32B-AWQ: huggingface.co/Qwen/Qwen3-32B…
Qwen3-32B-GGUF: huggingface.co/Qwen/Qwen3-32B…
Simon Willison (@simonw)

llama.cpp shipped new support for vision models this morning, including macOS binaries (albeit quarantined so you have to take extra steps to run them) that let you run vision models in a terminal or as a localhost web UI
Georgi Gerganov (@ggerganov)

Son has been doing an outstanding job at maintaining the llama-server implementation and now bringing full-blown vision input support to llama.cpp! Massive kudos and thanks for your valuable contributions to the project!

Julien Chaumond (@julien_c)

llama.cpp is now fully compatible with VLMs 💥

HUGE kudos to Xuan-Son Nguyen from HF and to the ggml team 💟

Here are a selection of pre-quantized models, ready to be used, from:
- Google DeepMind Gemma
- Mistral AI Pixtral
- Qwen VL
- Hugging Face SmolVLM

Give them a
Georgi Gerganov (@ggerganov)

PSA for applications that use local AI models - here is how to do it right:

More and more applications are adding support for local AI models, which is great. But I notice that they are doing it the wrong way (see the screenshots below).

The right way to do it is to add a
Olivier Chafik (@ochafik)

llama.cpp streaming support for tool calling & thoughts was just merged: please test & report any issues 😅 github.com/ggml-org/llama… #llamacpp

Simon Willison (@simonw)

llm-llama-server now supports tools, which means this local Gemma demo should work (if you have 3.2GB free):

brew install llama.cpp
llama-server --jinja -hf unsloth/gemma-3-4b-it-GGUF:Q4_K_XL
uvx --with llm-llama-server llm -m llama-server-tools -T llm_time 'what time is it?'
PlayAI (@playaiofficial)

🎙️ After serving millions of users through our text-to-speech platform, one need kept coming up: fine-grained AI speech editing - the ability to modify existing speech. Today, we’re open-sourcing PlayDiffusion, a diffusion-based inpainting model built for that exact purpose.

Yavor Ivanov (@yavorgi)

Diffusion in-painting and TTS model. Give it a try! We are expecting you to build some great things with it. Let me know if you need any help.

Vaibhav (VB) Srivastav (@reach_vb)

Massive QoL update: You can now filter the models on the hub based on their size 🔥

Find the model that fits YOUR needs faster ⚡️