Kevin Frans (@kvfrans) 's Twitter Profile
Kevin Frans

@kvfrans

@berkeley_ai @reflection_ai prev mit, read my thoughts: kvfrans.com

ID: 1665897810

http://kvfrans.com 12-08-2013 20:09:16

440 Tweets

2.2K Followers

479 Following

Seohong Park (@seohong_park) 's Twitter Profile Photo

We found a way to do RL *only* with BC policies.

The idea is simple:

1. Train a BC policy π(a|s)
2. Train a conditional BC policy π(a|s, z)
3. Amplify(!) the difference between π(a|s, z) and π(a|s) using CFG

Here, z can be anything (e.g., goals for goal-conditioned RL).

🧵↓
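
A rough sketch of step 3, for intuition only. The post does not spell out how the difference is amplified, so this assumes the standard classifier-free guidance (CFG) combination applied to log-probabilities of a small discrete action set: log π_w(a|s, z) ∝ log π(a|s) + w · (log π(a|s, z) − log π(a|s)). The function name `cfg_policy`, the guidance weight `w`, and the toy probabilities are all hypothetical and not taken from the thread.

```python
import math

def cfg_policy(p_uncond, p_cond, w):
    """Hypothetical CFG-style mix of two discrete action distributions.

    p_uncond: probabilities from the unconditional BC policy pi(a|s)
    p_cond:   probabilities from the conditional BC policy pi(a|s, z)
    w:        guidance weight (assumed form: w=0 -> pi(a|s), w=1 -> pi(a|s, z),
              w>1 amplifies the difference between the two)
    """
    # Guided logits: log pi(a|s) + w * (log pi(a|s,z) - log pi(a|s))
    logits = [
        math.log(pu) + w * (math.log(pc) - math.log(pu))
        for pu, pc in zip(p_uncond, p_cond)
    ]
    # Numerically stable softmax to renormalize
    m = max(logits)
    unnorm = [math.exp(l - m) for l in logits]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Toy example with three actions; conditioning on z favors the last action.
p_uncond = [0.5, 0.3, 0.2]
p_cond = [0.3, 0.3, 0.4]

for w in (0.0, 1.0, 3.0):
    print(w, [round(p, 3) for p in cfg_policy(p_uncond, p_cond, w)])
```

Under this assumed form, raising `w` above 1 pushes probability mass further toward the actions that the conditional policy prefers relative to the unconditional one. For continuous actions with diffusion or flow policies, the analogous combination would act on scores or velocities rather than log-probabilities.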
Kevin Frans (@kvfrans) 's Twitter Profile Photo

I really liked this work because of the solid science. There are 17 pages of experiments in the appendix… We systematically tried to scale every axis we could think of (data, model size, compute) and over 1000+ trials found only one thing consistently mattered.

N8 Programs (@n8programs) 's Twitter Profile Photo

Replicated in MLX on MNIST. S+ is an intriguing optimizer that excels at both memorizing the training data and generalizing well. Very intriguing and different from most other optimizers I've tested. Takes some time to get going but typically ends up doing slightly better than
Kevin Zakka (@kevin_zakka) 's Twitter Profile Photo

We’re super thrilled to have received the Outstanding Demo Paper Award for MuJoCo Playground at RSS 2025!
Huge thanks to everyone who came by our booth and participated, asked questions, and made the demo so much fun!

Carlo Sferrazza, Qiayuan Liao, Arthur Allshire