Tim Xiao (@timzxiao)'s Twitter Profile
Tim Xiao

@timzxiao

PhD student in Machine Learning @ University of Tübingen · IMPRS-IS scholar

ID: 618657764

Link: http://timx.me · Joined: 26-06-2012 02:44:37

248 Tweets

231 Followers

316 Following

Weiyang Liu (@besteuler)'s Twitter Profile Photo

📢Glad to introduce FormalMATH, a large-scale Lean4 benchmark comprising 5,560 formally verified problems. 

📖The benchmark spans from high-school Olympiad challenges to undergraduate-level theorems across diverse domains.

The best LLM prover only achieved 16.46% accuracy.

1/4
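
For a sense of the format: a FormalMATH-style entry pairs an informal problem with a machine-checkable Lean 4 statement that a prover must then close with a proof. The toy theorem below is a hypothetical illustration of that shape, not an actual benchmark item.

```lean
import Mathlib

-- Hypothetical toy entry in the FormalMATH spirit: the informal problem
-- "show that n² + n is even for every natural number n" restated as a
-- Lean 4 theorem; an LLM prover's job is to produce the proof below.
theorem n_sq_add_n_even (n : ℕ) : Even (n ^ 2 + n) := by
  have h : n ^ 2 + n = n * (n + 1) := by ring
  rw [h]
  exact Nat.even_mul_succ_self n
```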
Katrin Renz (@katrinrenz)'s Twitter Profile Photo

📣 Excited to share our #CVPR2025 Spotlight paper and my internship project at Wayve: SimLingo, a Vision-Language-Action (VLA) model that achieves state-of-the-art driving performance with language capabilities. Code: github.com/RenzKa/simlingo · Paper: arxiv.org/abs/2503.09594

Zhen Liu (@itsthezhen)'s Twitter Profile Photo

I was surprised when I first saw that the black magic of prompt engineering can marry classical ML methods in such a natural way: simply asking an LLM to do rejection sampling makes it a more rational agent. I cannot wait to see how we might similarly design better "LLM algorithms".
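
A minimal sketch of that marriage, assuming a generic black-box call `llm(prompt)` (the function, prompts, and retry budget are placeholders, not the paper's actual protocol): the LLM proposes a draw, verbalizes an acceptance probability for it under the target, and a classical accept/reject step does the rest.

```python
import random

def llm(prompt: str) -> str:
    """Placeholder for a black-box LLM call (e.g., a chat-completion API)."""
    raise NotImplementedError

def llm_rejection_sample(target: str, max_tries: int = 20) -> str:
    """Hypothetical sketch: the LLM serves as both proposal and accept/reject judge."""
    candidate = llm(f"Draw one sample from: {target}. Reply with the sample only.")
    for _ in range(max_tries):
        # Ask the LLM to verbalize an acceptance probability for the candidate.
        reply = llm(
            f"Target: {target}. Candidate: {candidate}. "
            "Reply with a single number in [0, 1]: the probability of accepting "
            "this candidate so that accepted samples follow the target."
        )
        accept_prob = max(0.0, min(1.0, float(reply)))
        # Classical rejection step: keep the candidate with the stated probability.
        if random.random() < accept_prob:
            return candidate
        candidate = llm(f"Draw one sample from: {target}. Reply with the sample only.")
    return candidate  # budget exhausted; fall back to the last proposal
```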

Robert Bamler (@robamler)'s Twitter Profile Photo

Great paper by my students Tim Xiao and Johannes Zenn and collaborators: it applies ideas from Monte Carlo sampling to (black-box) LLM execution to turn LLMs into better-calibrated stochastic samplers.

Weiyang Liu (@besteuler)'s Twitter Profile Photo

Verbalized machine learning treats LLMs with prompts as function approximators. Building on this, Tim Xiao came up with the idea of studying whether LLMs can act as samplers. It turns out they’re often biased, even when they appear to understand the target distribution.
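
One hypothetical way to see that bias (not the paper's experimental setup): ask the model many times for a draw from a simple, fully specified distribution and compare the empirical frequencies against the target.

```python
from collections import Counter

def llm(prompt: str) -> str:
    """Placeholder for a black-box LLM call."""
    raise NotImplementedError

def bernoulli_bias_check(n_draws: int = 1000) -> dict:
    """Compare the LLM's empirical sampling frequency against Bernoulli(0.7)."""
    prompt = "Sample once from Bernoulli(p=0.7). Reply with exactly '1' or '0'."
    counts = Counter(llm(prompt).strip() for _ in range(n_draws))
    freq_one = counts["1"] / n_draws
    # An unbiased sampler would land near 0.7 up to Monte Carlo noise; the
    # claim above is that LLM samplers often miss the target systematically.
    return {"target": 0.7, "empirical": freq_one, "gap": freq_one - 0.7}
```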

Weiyang Liu (@besteuler)'s Twitter Profile Photo

Muon is gaining attention for its use of orthogonalization, making it a natural point of comparison with POET. We computed singular value entropy over training steps and found that POET consistently maintains high entropy. A recent study (arxiv.org/abs/2502.16982) suggests that this is a
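
Singular value entropy is easy to reproduce. Below is a minimal NumPy sketch of one common reading of the metric (Shannon entropy of the normalized singular value spectrum); the authors' exact normalization may differ.

```python
import numpy as np

def singular_value_entropy(W: np.ndarray, eps: float = 1e-12) -> float:
    """Shannon entropy of the normalized singular values of W.

    High entropy means the spectrum stays spread out (no few directions
    dominate); a collapsing spectrum drives the entropy toward zero.
    """
    s = np.linalg.svd(W, compute_uv=False)
    p = s / (s.sum() + eps)              # normalize spectrum to a distribution
    return float(-(p * np.log(p + eps)).sum())

# Example: a random Gaussian weight matrix has a well-spread spectrum.
rng = np.random.default_rng(0)
print(singular_value_entropy(rng.standard_normal((256, 256))))
```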

Weiyang Liu (@besteuler)'s Twitter Profile Photo

We have added some new experiments and analyses to the new version of our paper. Check it out here: arxiv.org/abs/2506.08001. We discovered that despite being generalized to spectrum-preserving training, POET can still preserve minimum hyperspherical energy. This property only
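
For reference, hyperspherical energy (in the sense of Liu et al.'s minimum-hyperspherical-energy line of work) measures how uniformly a layer's neuron directions spread over the unit sphere; lower energy means more uniform spread. Below is a minimal NumPy sketch with a Riesz kernel at s = 1; the paper's exact kernel and normalization may differ.

```python
import numpy as np

def hyperspherical_energy(W: np.ndarray, s: float = 1.0, eps: float = 1e-8) -> float:
    """Riesz-s energy of the rows of W projected onto the unit sphere.

    Spectrum-preserving training that keeps this quantity at its (low)
    initialization value would preserve minimum hyperspherical energy.
    """
    V = W / (np.linalg.norm(W, axis=1, keepdims=True) + eps)  # unit directions
    dist = np.linalg.norm(V[:, None, :] - V[None, :, :], axis=-1)
    iu = np.triu_indices(len(V), k=1)                          # each pair once
    return float(((dist[iu] + eps) ** (-s)).sum())
```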
