Wenhu Chen (@wenhuchen) Twitter Tweets • TwiCopy

Wenhu Chen

@wenhuchen

AI researcher. Interested in Reasoning, Multimodal. I direct TIGER-Lab. Author of PoT, MMMU, MMLU-Pro, MAmmoTH, CFT, LongRAG, MAP-Neo, YuE, Mocha, SuTI

ID: 727242818452897796

Link: https://wenhuchen.github.io/

Joined: 02-05-2016 21:06:14

2.2K Tweets

19.19K Followers

638 Following

Wenhu Chen

@wenhuchen

2 months ago

Happy to collaborate with Raghuveer, Bhuwan and others to work on batch mining for VLM2Vec. Now it's the SoTA on the MMEB benchmark. huggingface.co/spaces/TIGER-L…

Likes: 5, Replies: 0, Retweets: 0

Wenhu Chen

@wenhuchen

2 months ago

Finally, the crazy weeks of NeurIPS ddl, ICCV rebuttal and EMNLP ddl have passed. Now it's time to take a rest till Sep!

Likes: 103, Replies: 3, Retweets: 3

Our General Reasoner paper is coming out on arXiv at arxiv.org/abs/2505.14652. We have re-trained our General-Reasoner models to obtain much better performance! - Our 4B General Reasoner can even beat NVIDIA's Nemotron-CrossThink-7B significantly. - Our 14B General-Reasoner

Likes: 136, Replies: 3, Retweets: 15

Wenhu Chen

@wenhuchen

2 months ago

Thanks for sharing! - Our 4B General Reasoner can even beat NVIDIA's Nemotron-CrossThink-7B significantly. - Our 14B General-Reasoner (Qwen3) can already achieve MMLU-Pro of 70.3%, GPQA of 56%, SuperGPQA of 39.9%, and TheoremQA of 54.4%. It's one of the most powerful

Likes: 16, Replies: 1, Retweets: 0

Wenhu Chen

@wenhuchen

2 months ago

Veo 3 blew people's minds in generating talking characters! It's so exciting! But we need an evaluation benchmark for that! Cong just released the MochaBench used in our Mocha paper to evaluate talking character models. GitHub: github.com/congwei1230/Mo… Mocha Paper:

Likes: 8, Replies: 1, Retweets: 4

Wenhu Chen

@wenhuchen

2 months ago

arXiv is going crazy after the NeurIPS/EMNLP deadlines. It's impossible to catch up.

Likes: 154, Replies: 5, Retweets: 8

Wenhu Chen

@wenhuchen

2 months ago

Thanks for sharing our paper! We are able to incentivize VLMs to conduct o3-style reasoning in the pixel/image space. Paper: arxiv.org/abs/2505.15966

Likes: 74, Replies: 2, Retweets: 12