Haihao Shen (@haihaoshen)'s Twitter Profile
Haihao Shen

@haihaoshen

Creator of #intel Neural Compressor and AutoRound for LLMs; HF Optimum-Intel Maintainer; OPEA Founding member; Opinions my own

ID: 1438706609400651777

Link: https://github.com/intel/auto-round · Joined: 17-09-2021 03:29:57

544 Tweets

3.3K Followers

2.2K Following

Haihao Shen (@haihaoshen):

👇DeepSeek Janus Pro (a unified multimodal understanding and generation model) is well optimized on Intel Gaudi. Check it out and give it a try! 🎯Brief blog in English: news.futunn.com/en/flash/18384… 🔥Detailed blog in Chinese: finance.sina.com.cn/roll/2025-02-0…

Haihao Shen (@haihaoshen):

💡INC v3.3 released with enhanced FP8 and INT4 quantization performance on Intel platforms. Validated on the most recent LLMs, e.g., DeepSeek-V3/R1, Falcon, Phi, etc. github.com/intel/neural-c…🌟
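The INT4 recipes in tools like INC typically use group-wise quantization: each block of weights shares one scale, trading a little metadata for much better accuracy than a single per-tensor scale. A minimal pure-Python sketch of symmetric group-wise INT4 quantize/dequantize (function names are illustrative, not INC's actual API):

```python
def quantize_int4_groupwise(weights, group_size=128):
    """Symmetric group-wise INT4 quantization.

    Each group of `group_size` weights shares one floating-point scale;
    values map to signed 4-bit integers in [-8, 7].
    """
    qweights, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        # Scale so the largest magnitude in the group maps to the INT4 range.
        scale = max(abs(w) for w in group) / 7 or 1.0
        scales.append(scale)
        qweights.append([max(-8, min(7, round(w / scale))) for w in group])
    return qweights, scales


def dequantize_int4_groupwise(qweights, scales):
    """Reconstruct approximate FP weights from INT4 values and group scales."""
    return [q * s for group, s in zip(qweights, scales) for q in group]
```

With group-wise scales, the worst-case reconstruction error per weight is half a quantization step (scale / 2), and a smaller group size shrinks that step at the cost of more scale metadata.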

Haihao Shen (@haihaoshen):

🫡 Sharing the #pytorch landscape, a great initiative to promote the PyTorch ecosystem. Check it out: landscape.pytorch.org 👐Besides direct upstream contributions to PyTorch, you may want to know about more contributions from Intel, such as INC: github.com/intel/neural-c… and IPEX: github.com/intel/intel-ex….

Haihao Shen (@haihaoshen):

🪜Step-by-step guide to generating an INT4 DeepSeek-R1 model with Intel's AutoRound tool, delivering higher accuracy than FP8 or any other open-source INT4 model (see the #evaluate section in the model card). 🤖Tool: github.com/intel/auto-rou… 🎯Model: huggingface.co/OPEA/DeepSeek-…
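The intuition behind AutoRound (shared with AdaRound-style methods) is that rounding every weight to the *nearest* grid point is not optimal for the layer's output; letting some weights round the "wrong" way can cancel error. A toy exhaustive-search sketch of that idea on a single dot product; this is not AutoRound's actual algorithm, which tunes a continuous rounding offset with signed gradient descent:

```python
import itertools
import math


def output_error(weights, qweights, scale, x):
    """Squared error between the FP dot product and its quantized version."""
    y = sum(w * xi for w, xi in zip(weights, x))
    yq = sum(q * scale * xi for q, xi in zip(qweights, x))
    return (y - yq) ** 2


def round_to_nearest(weights, scale):
    """Baseline: independent round-to-nearest per weight."""
    return [round(w / scale) for w in weights]


def best_rounding(weights, scale, x):
    """Exhaustively choose floor vs ceil per weight to minimize output error.

    Feasible only for tiny layers; real tools optimize this choice instead.
    """
    choices = [(math.floor(w / scale), math.ceil(w / scale)) for w in weights]
    return min(itertools.product(*choices),
               key=lambda q: output_error(weights, q, scale, x))
```

Since round-to-nearest is itself one of the floor/ceil combinations, the searched result is never worse, and on correlated weights it is often much better.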

Haihao Shen (@haihaoshen):

📢Sharing a printed "gold" necklace with my GitHub ID, designed by akey. Loved it 😍 🎯Released the 3D modelling file (not LLM modelling!!!), which can be further customized with your GitHub ID. Enjoy! makerworld.com/en/models/1209…

Haihao Shen (@haihaoshen):

👇Strongly recommend watching the video created by akey; I believe you'll like it as much as I do. 🥳 youtube.com/shorts/kNCMf2C…

Haihao Shen (@haihaoshen):

🔥AutoRound, a leading low-precision library for LLMs/VLMs developed by Intel, has officially landed in Hugging Face Transformers. Congrats to Wenhua, Weiwei, Heng! Thanks to Ilyas, Marc, Mohamed from the HF team! github.com/huggingface/tr… #intel clem 🤗 Julien Chaumond

Haihao Shen (@haihaoshen):

Congrats to Junyang Lin and the Qwen team! Really excited to have #intel Neural Compressor (github.com/intel/neural-c…) be one of the software partners for the day-0 launch! Cheers 🍻

Haihao Shen (@haihaoshen):

🔥AutoRound supports the day-0 Qwen3 launch! Check out the sample code at github.com/intel/auto-rou…. Now you can get the best INT4/INT3 Qwen3 models using AutoRound.

Mohamed (@mekkcyber):

AutoRound is now in 🤗 Transformers!

Intel’s PTQ tool brings accurate INT2–INT8 quantization to LLMs & VLMs — fast, flexible, and hardware-friendly (CPU, CUDA, Intel GPU).

> Mixed-bit, GPTQ/AWQ/GGUF export, 72B in minutes
> Models on Hugging Face (OPEA, etc.)

🔗 Benchmarks +
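The "72B in minutes" point above is about quantization speed; the deployment payoff is memory. A back-of-the-envelope sketch for weight storage only (ignores activations, KV cache, and embeddings; the group size and per-group FP16 scale are typical defaults, not measured numbers):

```python
def quantized_size_gb(n_params, bits, group_size=128, scale_bytes=2):
    """Approximate weight-storage size of a group-wise quantized model.

    Counts packed low-bit weights plus one scale per group; everything
    else (activations, KV cache, embeddings) is out of scope.
    """
    weight_bytes = n_params * bits / 8
    scale_overhead = (n_params / group_size) * scale_bytes
    return (weight_bytes + scale_overhead) / 1e9


# Plain FP16 baseline has no quantization scales.
fp16_gb = quantized_size_gb(72e9, 16, scale_bytes=0)
int4_gb = quantized_size_gb(72e9, 4)
```

For a 72B-parameter model this works out to roughly 144 GB in FP16 versus about 37 GB in group-wise INT4, which is what moves such models from multi-GPU-only into single-accelerator territory.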
Haihao Shen (@haihaoshen):

🎯Thrilled to share our blog on "AutoRound", an advanced quantization approach for LLMs. 👀Blog: huggingface.co/blog/autoround Thanks to wenhuach, Kele Ding, and the Intel teams, and to Marc Sun and HF friends!

Haihao Shen (@haihaoshen):

🎯AutoRound is now part of vLLM, with more VLMs, higher accuracy, and more formats (AWQ/GPTQ/GGUF). Congrats wenhuach & thanks Michael Goin! 💡We highly recommend using AutoRound to generate AWQ models going forward, as AutoAWQ is no longer maintained. github.com/vllm-project/v…

Haihao Shen (@haihaoshen):

💡Intel Neural Compressor v3.4 is released, supporting more quantization recipes, e.g., W4A8 (FP8). In the past weeks, we've contributed the AutoRound algorithm to HF Transformers and vLLM, and now we are contributing it to SGLang. Stay tuned. 😀 🎯github.com/intel/neural-c…

Haihao Shen (@haihaoshen):

🎯Sharing the best-quality Qwen3 INT4 models, powered by Intel Neural Compressor and the AutoRound algorithm👇 huggingface.co/Intel/Qwen3-30… huggingface.co/Intel/Qwen3-14… huggingface.co/Intel/Qwen3-8B… Hope these models help and that you see the accuracy benefits in your deployment.