Oliver Stanley (@_oliverstanley)'s Twitter Profile
Oliver Stanley

@_oliverstanley

ML engineer

ID: 150352165

Joined: 31-05-2010 18:29:32

2.2K Tweets

663 Followers

274 Following

Zafir Stojanovski (@zafstojano)

Super excited to share 💪🧠Reasoning Gym! 🧵 We provide over 100 data generators and verifiers spanning several domains (algebra, arithmetic, code, geometry, logic, games) for training the next generation of reasoning models. In essence, we can generate an infinite amount of

Oliver Stanley (@_oliverstanley)

This takes me back to 2023 building Open Assistant. Too many users for the limited GPUs we had for inference, so one idea was to prioritise users who provided more feedback data. Granular feedback from highly heterogeneous human raters is very messy, though.

Oliver Stanley (@_oliverstanley)

Congrats to Prime on the release of their new open reasoning dataset. Nice to see Reasoning Gym tasks used extensively in SYNTHETIC-2!

Jean Kaddour (@jeankaddour)

Stop overfitting to GSM8K! Reasoning Gym - 100+ RL envs for LLM RL - got accepted to NeurIPS as Spotlight! Frontier LLMs still struggle with many hard env configs. arxiv.org/abs/2505.24760 github.com/open-thought/r…

Oliver Stanley (@_oliverstanley)

I'm late to this, but a couple of weeks ago Reasoning Gym was accepted as a spotlight at NeurIPS ⭐️ I'll be attending in San Diego from 2nd December until the 7th. If anyone wants to chat, reach out!