Haihao Shen (@haihaoshen)'s Twitter Profile
Haihao Shen

@haihaoshen

Creator of #intel Neural Compressor and AutoRound for LLMs; HF Optimum-Intel Maintainer; OPEA Founding member; Opinions my own

ID: 1438706609400651777

Link: https://github.com/intel/auto-round · Joined: 17-09-2021 03:29:57

544 Tweets

3.3K Followers

2.2K Following

Haihao Shen (@haihaoshen):

👇DeepSeek Janus Pro (a unified multimodal understanding and generation model) is well optimized on Intel Gaudi. Check it out and give it a try! 🎯Brief blog in English: news.futunn.com/en/flash/18384… 🔥Detailed blog in Chinese: finance.sina.com.cn/roll/2025-02-0…

Haihao Shen (@haihaoshen):

💡INC v3.3 released with enhanced FP8 and INT4 quantization performance on Intel platforms. Validated on the most recent LLMs, e.g., DeepSeek-V3/R1, Falcon, Phi, etc. github.com/intel/neural-c…🌟
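The INT4 recipes in tools like INC typically use group-wise quantization: each block of weights shares one scale, trading a little metadata for much better accuracy than a single per-tensor scale. A minimal pure-Python sketch of symmetric group-wise INT4 quantize/dequantize (function names are illustrative, not INC's actual API):

```python
def quantize_int4_groupwise(weights, group_size=128):
    """Symmetric group-wise INT4 quantization.

    Each group of `group_size` weights shares one floating-point scale;
    values map to signed 4-bit integers in [-8, 7].
    """
    qweights, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        # Scale so the largest magnitude in the group maps to the INT4 range.
        scale = max(abs(w) for w in group) / 7 or 1.0
        scales.append(scale)
        qweights.append([max(-8, min(7, round(w / scale))) for w in group])
    return qweights, scales


def dequantize_int4_groupwise(qweights, scales):
    """Reconstruct approximate FP weights from INT4 values and group scales."""
    return [q * s for group, s in zip(qweights, scales) for q in group]
```

With group-wise scales, the worst-case reconstruction error per weight is half a quantization step (scale / 2), and a smaller group size shrinks that step at the cost of more scale metadata.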

Haihao Shen (@haihaoshen):

🫡 Sharing the #pytorch landscape, a great initiative to promote the PyTorch ecosystem. Check it out: landscape.pytorch.org 👐Besides direct upstream contributions to PyTorch, you may want to know about more contributions from Intel, such as INC: github.com/intel/neural-c… and IPEX: github.com/intel/intel-ex….

Haihao Shen (@haihaoshen):

🪜Step-by-step guide to generating an INT4 DeepSeek-R1 model with Intel's AutoRound tool, delivering higher accuracy than FP8 or any other open-source INT4 model (see the #evaluate section in the model card). 🤖Tool: github.com/intel/auto-rou… 🎯Model: huggingface.co/OPEA/DeepSeek-…
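The intuition behind AutoRound (shared with AdaRound-style methods) is that rounding every weight to the *nearest* grid point is not optimal for the layer's output; letting some weights round the "wrong" way can cancel error. A toy exhaustive-search sketch of that idea on a single dot product; this is not AutoRound's actual algorithm, which tunes a continuous rounding offset with signed gradient descent:

```python
import itertools
import math


def output_error(weights, qweights, scale, x):
    """Squared error between the FP dot product and its quantized version."""
    y = sum(w * xi for w, xi in zip(weights, x))
    yq = sum(q * scale * xi for q, xi in zip(qweights, x))
    return (y - yq) ** 2


def round_to_nearest(weights, scale):
    """Baseline: independent round-to-nearest per weight."""
    return [round(w / scale) for w in weights]


def best_rounding(weights, scale, x):
    """Exhaustively choose floor vs ceil per weight to minimize output error.

    Feasible only for tiny layers; real tools optimize this choice instead.
    """
    choices = [(math.floor(w / scale), math.ceil(w / scale)) for w in weights]
    return min(itertools.product(*choices),
               key=lambda q: output_error(weights, q, scale, x))
```

Since round-to-nearest is itself one of the floor/ceil combinations, the searched result is never worse, and on correlated weights it is often much better.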

Haihao Shen (@haihaoshen):

📢Sharing a printed "gold" necklace with my GitHub ID, designed by akey. Loved it 😍 🎯Released the 3D modelling file (not LLM modelling!!!), which can be further customized with your GitHub ID. Enjoy! makerworld.com/en/models/1209…

Haihao Shen (@haihaoshen):

👇Strongly recommend watching the video created by akey; I believe you'll like it as much as I do. 🥳 youtube.com/shorts/kNCMf2C…

Haihao Shen (@haihaoshen):

🔥AutoRound, a leading low-precision library for LLMs/VLMs developed by Intel, has officially landed in Hugging Face Transformers. Congrats to Wenhua, Weiwei, Heng! Thanks to Ilyas, Marc, Mohamed from the HF team! github.com/huggingface/tr… #intel clem 🤗 Julien Chaumond

Haihao Shen (@haihaoshen):

Congrats to Junyang Lin and the Qwen team! Really excited to have #intel Neural Compressor (github.com/intel/neural-c…) be one of the software partners for the day-0 launch! Cheers 🍻

Haihao Shen (@haihaoshen):

🔥AutoRound supports the day-0 Qwen3 launch! Check out the sample code at github.com/intel/auto-rou…. Now you can get the best INT4/INT3 Qwen3 models using AutoRound.

Mohamed (@mekkcyber):

AutoRound is now in 🤗 Transformers!

Intel’s PTQ tool brings accurate INT2–INT8 quantization to LLMs & VLMs — fast, flexible, and hardware-friendly (CPU, CUDA, Intel GPU).

> Mixed-bit, GPTQ/AWQ/GGUF export, 72B in minutes
> Models on Hugging Face (OPEA, etc.)

🔗 Benchmarks +
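The "72B in minutes" point above is about quantization speed; the deployment payoff is memory. A back-of-the-envelope sketch for weight storage only (ignores activations, KV cache, and embeddings; the group size and per-group FP16 scale are typical defaults, not measured numbers):

```python
def quantized_size_gb(n_params, bits, group_size=128, scale_bytes=2):
    """Approximate weight-storage size of a group-wise quantized model.

    Counts packed low-bit weights plus one scale per group; everything
    else (activations, KV cache, embeddings) is out of scope.
    """
    weight_bytes = n_params * bits / 8
    scale_overhead = (n_params / group_size) * scale_bytes
    return (weight_bytes + scale_overhead) / 1e9


# Plain FP16 baseline has no quantization scales.
fp16_gb = quantized_size_gb(72e9, 16, scale_bytes=0)
int4_gb = quantized_size_gb(72e9, 4)
```

For a 72B-parameter model this works out to roughly 144 GB in FP16 versus about 37 GB in group-wise INT4, which is what moves such models from multi-GPU-only into single-accelerator territory.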
Haihao Shen (@haihaoshen):

🎯Thrilled to share our blog on "AutoRound", an advanced quantization approach for LLMs. 👀Blog: huggingface.co/blog/autoround Thanks to wenhuach, Kele Ding, and the Intel teams, and to Marc Sun and HF friends!

Haihao Shen (@haihaoshen):

🎯AutoRound is now part of vLLM, with more VLMs, higher accuracy, and more formats (AWQ/GPTQ/GGUF). Congrats wenhuach & thanks Michael Goin! 💡We highly recommend using AutoRound to generate AWQ models going forward, as AutoAWQ is no longer maintained. github.com/vllm-project/v…

Haihao Shen (@haihaoshen):

💡Intel Neural Compressor v3.4 is released, supporting more quantization recipes, e.g., W4A8 (FP8). In the past weeks, we've contributed the AutoRound algorithm to HF Transformers and vLLM, and now we are contributing it to SGLang. Stay tuned. 😀 🎯github.com/intel/neural-c…

Haihao Shen (@haihaoshen):

🎯Sharing the best-quality Qwen3 INT4 models, powered by Intel Neural Compressor and the AutoRound algorithm👇 huggingface.co/Intel/Qwen3-30… huggingface.co/Intel/Qwen3-14… huggingface.co/Intel/Qwen3-8B… Hope these models help and that you see the accuracy benefits in your deployment.