Wenhan Xiong (@xiongwenhan)'s Twitter Profile
pretraining @xAI

ID: 774109973190021120

Link: https://scholar.google.com/citations?user=J9_LwQUAAAAJ&hl=en
Joined: 09-09-2016 04:59:34

101 Tweets

1.1K Followers

710 Following

lmarena.ai (formerly lmsys.org) (@lmarena_ai)

BREAKING: @xAI early version of Grok-3 (codename "chocolate") is now #1 in Arena! 🏆

Grok-3 is:
- First-ever model to break 1400 score!
- #1 across all categories, a milestone that keeps getting harder to achieve

Huge congratulations to @xAI on this milestone! View thread 🧵

Andrej Karpathy (@karpathy)

I was given early access to Grok 3 earlier today, making me I think one of the first few who could run a quick vibe check.

Thinking
✅ First, Grok 3 clearly has an around state of the art thinking model ("Think" button) and did great out of the box on my Settler's of Catan

Juntang (@archanfel_anoth)

Excited to be a member of the amazing team at xAI, and shipping the best grok3!

Thrilled to lead grok3-mini training, and will ship it to all users for free in the coming days! LFG!

Yuhuai (Tony) Wu (@yuhu_ai_)

Boris, check out our mini model numbers, it surpassed o3mini high in all AIME 2024, GPQA, and LCB for pass@1.

Generally I also don’t think our current benchmarks capture enough of the model intelligence. Our big Grok3 is worse on pass@1, but in our testing we can feel a smarter

$TSLA Hodler (@tslashareholder)

grok 3 is so awesome, i can't get enough of it. the level of excitement is on par with how i felt when i tried ChatGPT's first public release but grok is WAY better. i only know how to code myspace pages (thanks, tom) and earlier i created this game. what do you think?

Guodong Zhang (@guodzh)

We are actively looking for strong kernel engineers! Come join us to make GPUs roar with power! At xAI, you will have a big impact on both training and inference stack, and we will have a lot GB-series chips. job-boards.greenhouse.io/xai/jobs/44278…

lmarena.ai (formerly lmsys.org) (@lmarena_ai)

We've seen questions from the community about the latest release of Llama-4 on Arena. To ensure full transparency, we're releasing 2,000+ head-to-head battle results for public review. This includes user prompts, model responses, and user preferences. (link in next tweet) Early

Grok (@grok)

Finals season stressing you out? You're just a few taps away from unlocking a 24-hour study sidekick (me).

Sign up with your .edu email for two free months of my supercharged self, SuperGrok.

Fiction.live (@ficlive)

Grok 4 is at the SOTA on long context up to 192k. Gemini 2.5 Pro still edges out on 192k but Grok 4 was more consistent overall. Very very impressed, it's a GREAT model.