Shenzhi Wang🌟 (@shenzhiwang_thu) 's Twitter Profile
Shenzhi Wang🌟

@shenzhiwang_thu

PhD Candidate @Tsinghua_Uni | Developer of 🔥Xwen-7B&72B-Chat🔥, Llama3-8B&70B-Chinese-Chat & 🔥Mistral-7B-v0.3-Chinese-Chat | Research Focuses: RL+LLM+Agent

ID: 1676443035184316416

Link: https://shenzhi-wang.netlify.app | Joined: 05-07-2023 04:09:12

330 Tweets

1.1K Followers

406 Following

Shenzhi Wang🌟 (@shenzhiwang_thu) 's Twitter Profile Photo

(6/n) Download #Xwen #LLM now! BF16 and all kinds of GGUF models are provided. HuggingFace🤗: huggingface.co/collections/sh…
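
For anyone pulling the weights programmatically, a minimal sketch using `huggingface_hub` follows; the repo id is a hypothetical placeholder (the collection link above is truncated), so substitute the actual Xwen repo name from the collection page.

```python
# Minimal sketch: download a model snapshot from the Hugging Face Hub.
# The repo id below is an assumption, not taken from the truncated link above.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="shenzhi-wang/Xwen-7B-Chat",  # hypothetical repo id; check the collection page
    allow_patterns=["*.gguf"],            # drop this filter to also fetch the BF16 weights
)
print("Downloaded to:", local_dir)
```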

宝玉 (@dotey) 's Twitter Profile Photo

The Xwen models, open-sourced by the Xwen Team made up of members from Tsinghua University and other universities! Trained from the Qwen base models, they come in two sizes, Xwen-72B-Chat and Xwen-7B-Chat, and perform well. Worth following if you care about locally deployed small models.

Adina Yakup (@adinayakup) 's Twitter Profile Photo

Xwen 🔥 a series of open models based on Qwen2.5 models, developed by a brilliant research team of PhD students from the Chinese community. huggingface.co/collections/sh… ✨ 7B/72B ✨ Apache 2.0 ✨ Xwen-72B-Chat outperformed DeepSeek V3 on Arena Hard Auto

Shenzhi Wang🌟 (@shenzhiwang_thu) 's Twitter Profile Photo

@lmarena_ai @OpenAI Hey lmarena.ai (formerly lmsys.org) team, we'd love to see our open-sourced model, Xwen-72B-Chat, included in the Chatbot Arena! 🥹 It supports both English and Chinese, with strong performance across multiple benchmarks. We have sufficient inference compute for API integration. We've sent several

<a href="/lmarena_ai/">lmarena.ai (formerly lmsys.org)</a> <a href="/OpenAI/">OpenAI</a> Hey <a href="/lmarena_ai/">lmarena.ai (formerly lmsys.org)</a> team, we'd love to see our open-sourced model, Xwen-72B-Chat, included in the Chatbot Arena! 🥹 It supports both English and Chinese, with strong performance across multiple benchmarks. We have sufficient inference compute for API integration.

We've sent several
Shenzhi Wang🌟 (@shenzhiwang_thu) 's Twitter Profile Photo

@lmarena_ai @OpenAI Xwen-72B-Chat achieves 86.1 on Arena-Hard-Auto, surpassing DeepSeek-V3 (671B) with only ~1/10th the parameters! 😆 We believe Xwen-72B-Chat can perform well on Chatbot Arena, possibly ranking in the top 10.

<a href="/lmarena_ai/">lmarena.ai (formerly lmsys.org)</a> <a href="/OpenAI/">OpenAI</a> Xwen-72B-Chat achieves 86.1 on Arena-Hard-Auto, surpassing DeepSeek-V3 (671B) with only ~1/10th the parameters!
😆We believe Xwen-72B-Chat can perform well on Chatbot Arena, possibly ranking in the top 10.
LLaMA Factory (@llamafactory_ai) 's Twitter Profile Photo

In modern RLHF frameworks, we see many "batch size" configs. They are designed to maximize GPU utilization, but they also confuse users who are not familiar with systems. To provide a clear view for those who want to train their reasoning models, we explain

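As an illustration of the point (not any particular framework's actual config), here is a hypothetical sketch of how the commonly seen batch-size knobs relate to one another; all names are placeholders.

```python
# Illustrative sketch only: how RLHF/GRPO-style batch-size knobs typically relate.
# Every name here is a hypothetical placeholder, not a specific framework's config key.
rollout_batch_size = 512          # prompts sampled per rollout/generation phase
num_gpus = 8
micro_batch_size = 4              # sequences per GPU per forward/backward pass
gradient_accumulation_steps = 16  # micro-batches accumulated before one optimizer step

# Effective batch size seen by a single parameter update:
global_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus

# Rollout data is usually consumed in a whole number of optimizer steps.
assert rollout_batch_size % global_train_batch_size == 0
print(global_train_batch_size)    # 512 -> one update per rollout in this example
```
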
Qwen (@alibaba_qwen) 's Twitter Profile Photo

Today, we release QwQ-32B, our new reasoning model with only 32 billion parameters that rivals cutting-edge reasoning models such as DeepSeek-R1.

Blog: qwenlm.github.io/blog/qwq-32b
HF: huggingface.co/Qwen/QwQ-32B
ModelScope: modelscope.cn/models/Qwen/Qw…
Demo: huggingface.co/spaces/Qwen/Qw…
Qwen Chat:
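
A minimal sketch of trying the released checkpoint with Hugging Face transformers, using the Qwen/QwQ-32B repo linked above; the prompt and generation settings are illustrative.

```python
# Minimal sketch: load the released QwQ-32B checkpoint and run one chat turn.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/QwQ-32B"  # repo linked in the tweet above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How many r's are in the word 'strawberry'?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
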
Binyuan Hui (@huybery) 's Twitter Profile Photo

🚀 Recently, I've been focusing on RL for LLM, and I'm excited to introduce QwQ-32B—the best open-source reasoning model under 100B scale. RL indeed holds some fascinating yet unexplored mysteries. You're all welcome to continue building more interesting things based on Qwen!

LLaMA Factory (@llamafactory_ai) 's Twitter Profile Photo

LLaMA-Factory now supports fine-tuning (Full/LoRA/QLoRA) of the QwQ-32B model. Time to unleash the power of reasoning models on your personal data 💥
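
For readers unfamiliar with what LoRA fine-tuning does, a rough sketch using the peft library directly is below. This is not LLaMA-Factory's own API (which drives such setups from config files), and the target module names are assumptions typical for Qwen-style attention layers.

```python
# Sketch of what LoRA adapter fine-tuning means under the hood, via peft directly.
# Not LLaMA-Factory's API; target module names are assumptions and should be verified.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/QwQ-32B", torch_dtype="auto", device_map="auto"
)
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only the small adapter matrices are trainable
```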

Rui Lu (@raylu_thu) 's Twitter Profile Photo

🚨Ever wonder why diffusion models generate nonsensical text? Our latest study at #ICLR2025 uncovers "Local Generation Bias"—a hidden training bias causing textual hallucinations! 🧠 Key finding: Diffusion models independently generate symbols locally without global context.

Shenzhi Wang🌟 (@shenzhiwang_thu) 's Twitter Profile Photo

Cooragent by our Tsinghua LeapLab (LeapLab@THU): an open-source multi-agent collaboration framework! 🚀 Tell it to "build an AI intelligence secretary" → it auto-scans, curates updates, and delivers daily reports. MIT Licensed | Dev-friendly. Try ⬇️ github.com/LeapLabTHU/coo… #AgenticAI

Yang Yue (@yangyue_thu) 's Twitter Profile Photo

Does RL Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Our new paper investigates this question and has sparked active discussions. In the video, a frequent Q&A starts at 1:28, covering common questions on pass@k, the takeaways, etc. See limit-of-RLVR.github.io
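
For context on the pass@k metric mentioned here, below is the standard unbiased estimator introduced with the Codex/HumanEval evaluation, which this kind of analysis typically relies on: with n samples per problem and c of them correct, it estimates the probability that at least one of k samples passes.

```python
# Standard unbiased pass@k estimator: 1 - C(n - c, k) / C(n, k).
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least 1 of k samples is correct, given c of n samples passed."""
    if n - c < k:
        return 1.0  # too few failures to fill k samples without a success
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 256 samples with 3 correct -> pass@1 is small, pass@64 is much larger.
print(pass_at_k(256, 3, 1), pass_at_k(256, 3, 64))
```

Comparing pass@k at large k between base and RL-tuned models is exactly the kind of question the video's Q&A addresses.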

Qwen (@alibaba_qwen) 's Twitter Profile Photo

Introducing Qwen3! 

We release and open the weights of Qwen3, our latest large language models, including 2 MoE models and 6 dense models, ranging from 0.6B to 235B. Our flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general
Andrew Zhao (@andrewz45732491) 's Twitter Profile Photo

❄️Introducing Absolute Zero Reasoner: Our reasoner learns to both propose tasks that maximize learnability and improve reasoning by solving them, entirely through self-play—with no external data! It overall outperforms other "zero" models in math & coding domains. 🧵 1/

Shenzhi Wang🌟 (@shenzhiwang_thu) 's Twitter Profile Photo

🔥 Excited to introduce our work: Absolute Zero—training reasoning LLMs with NO DATA via RLVR! 🚀 A new “Absolute Zero” paradigm: models learn to propose and solve tasks, evolving through self‑play. 🏆 AZ Reasoner: SoTA overall performance in math & coding with no human data.
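
A highly simplified, hypothetical sketch of the propose-and-solve self-play loop described in this thread; every name here is illustrative, not the paper's actual code, and the real training procedure and rewards are in the paper.

```python
# Highly simplified, hypothetical sketch of the Absolute Zero-style self-play loop
# described above. All object and function names are illustrative placeholders.
def self_play_step(model, executor, rl_update):
    # 1) The same model proposes a task whose difficulty targets high learnability.
    task = model.propose_task()
    # 2) The model then attempts to solve its own proposed task.
    solution = model.solve(task)
    # 3) A verifiable environment (e.g., a code executor) supplies the reward signal,
    #    so no human-labeled data is needed (the RLVR part).
    solved = executor.verify(task, solution)
    # 4) Both roles are updated with RL: the solver toward correct solutions,
    #    the proposer toward tasks that are neither trivial nor impossible.
    rl_update(model, task, solution, solved)
```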

Shenzhi Wang🌟 (@shenzhiwang_thu) 's Twitter Profile Photo

🔥 Check out our new survey on scaffolded language models that learn beyond parametric updates! 🚀 Applications could be in software agents like Codex that continuously learn and adapt to a user's needs.