Huajian Xin (@huajian_xin) 's Twitter Profile
Huajian Xin

@huajian_xin

Ph.D. Student @InfAtEd on LLMs for theorem proving
| Student Researcher @BytedanceTalk Doubao/Seed
| Ex. @deepseek_ai
| Recent: DeepSeek-Prover, LEGO-Prover

ID: 1565340438009036802

linkhttps://xinhuajian.wordpress.com/ calendar_today01-09-2022 14:07:11

47 Tweet

1,1K Followers

317 Following

Huajian Xin (@huajian_xin) 's Twitter Profile Photo

From my personal observations, this is the most accurate description I've seen of DeepSeek's research and engineering ethos (with English translation provided by DeepSeek R1): (Source: mp.weixin.qq.com/s/WFJxnTF9fGII…)

From my personal observations, this is the most accurate description I've seen of DeepSeek's research and engineering ethos (with English translation provided by DeepSeek R1):

(Source:  mp.weixin.qq.com/s/WFJxnTF9fGII…)
Jia Li (@jiali52524397) 's Twitter Profile Photo

🚀 NuminaMath 1.5 is here! 🚀 900k+ high-quality competition math problems with CoT solutions, new problem metadata, manually verified Olympiad problems, and more! 📚🏅 Check it out: 🔗 huggingface.co/datasets/AI-MO… Thanks to Zhenzhe Ying Léo Dreyfus-Schmidt

Huajian Xin (@huajian_xin) 's Twitter Profile Photo

Glad to see DeepSeek-Prover V1.5 continuing to be used as the foundation model and for cold-start data synthesis. Hope you all enjoy it! 🚀

Huajian Xin (@huajian_xin) 's Twitter Profile Photo

Had a great time giving a talk at UCL today on LLMs and formal mathematics! Excited to share my slides here: xinhuajian.wordpress.com/wp-content/upl…

Jia Li (@jiali52524397) 's Twitter Profile Photo

We believe formal math is the future. 🔥Introducing Kimina-Prover Preview, a Numina & Kimi.ai collaboration, the first large formal reasoning model for Lean 4, achieving 80.78% miniF2F. github.com/MoonshotAI/Kim…

We believe formal math is the future.
🔥Introducing Kimina-Prover Preview, a Numina &
<a href="/Kimi_Moonshot/">Kimi.ai</a>  collaboration, the first large formal reasoning model for Lean 4, achieving 80.78% miniF2F.
github.com/MoonshotAI/Kim…
Kimi.ai (@kimi_moonshot) 's Twitter Profile Photo

🔬 Sharing an early look at Kimina-Prover, our new Lean theorem proving model from our collaboration with Numina! Jia Li 🏆 Using an RL pipeline for proof exploration, Kimina-Prover Preview achieved 80.7% on the miniF2F — currently SOTA on this benchmark. We see promise

🔬 Sharing an early look at Kimina-Prover, our new Lean theorem proving model from our collaboration with Numina! <a href="/JiaLi52524397/">Jia Li</a>

🏆 Using an RL pipeline for proof exploration, Kimina-Prover Preview achieved 80.7% on the miniF2F — currently SOTA on this benchmark. We see promise
Haiming Wang (@haimingw97) 's Twitter Profile Photo

🚀 Thrilled to be a core contributor to Kimina-Prover. Huge thanks to the Numina & Moonshot teams! Kimi.ai This has been the most exciting project I've worked on since entering the field. A minimal, flexible, and powerful approach that unifies ATP ideas like no other.

Jia Li (@jiali52524397) 's Twitter Profile Photo

Combinatorics are the two last problems unsolved by AlphaProof at last year's IMO。 Introducing CombiBench Kimi.ai , a benchmark focusing on combinatorics problems ! 🔥 🏆moonshotai.github.io/CombiBench/ 📘Dataset -> huggingface.co/datasets/AI-MO…

Combinatorics are the two last problems unsolved by AlphaProof at last year's IMO。
Introducing CombiBench <a href="/Kimi_Moonshot/">Kimi.ai</a> , a benchmark focusing on combinatorics problems ! 🔥
🏆moonshotai.github.io/CombiBench/
📘Dataset -&gt; huggingface.co/datasets/AI-MO…
Huajian Xin (@huajian_xin) 's Twitter Profile Photo

Although I left DeepSeek quite a while ago, being able to scale up to 671B truly feels like a dream come true for me. I'm deeply grateful to ZZ, Zhihong and other colleagues at DeepSeek for their support, to Liang for the opportunity, and to everyone in the field who has

Zhouliang Yu (@zhouliangy) 's Twitter Profile Photo

🚀 Excited to introduce FormalMATH: a large-scale formal math benchmark with 5,560 formally verified Lean 4 statements from Olympiad and UG-level problems. 📉 Best model performance: just 16.46% — plenty of room for progress! 🔗 Explore the project: spherelab.ai/FormalMATH/

🚀 Excited to introduce FormalMATH: a large-scale formal math benchmark with 5,560 formally verified Lean 4 statements from Olympiad and UG-level problems.

📉 Best model performance: just 16.46% — plenty of room for progress!

🔗 Explore the project: spherelab.ai/FormalMATH/
AI for Math Workshop @ ICML 2025 (@ai4mathworkshop) 's Twitter Profile Photo

Glad to announce the ICML 2025 Challenge on Automated Math Reasoning and Extensions! 🌟🧮⚛️ Track 1: File-level Automated Proof Engineering (APE) of Formal Math Libraries (APE-Bench I). Participation: codabench.org/competitions/8… Track 2: Physics Reasoning with Diagrams and

Glad to announce the ICML 2025 Challenge on Automated Math Reasoning and Extensions! 🌟🧮⚛️ 

Track 1: File-level Automated Proof Engineering (APE) of Formal Math Libraries (APE-Bench I). Participation: codabench.org/competitions/8…
Track 2: Physics Reasoning with Diagrams and
Huajian Xin (@huajian_xin) 's Twitter Profile Photo

🚀 Thrilled to share that APE-Bench I has just been selected as Track 1 of the ICML 2025 AI4Math Workshop Challenge—the first competition devoted to automated proof engineering (APE) at scale. Formal mathematics is racing past the limits of manual refactoring, and APE-Bench I