Quentin Gallouédec
@qgallouedec
PhD - Research engineer @huggingface 🤗
TRL maintainer
📦➡️🦋 bsky.app/profile/qgallo…
ID: 1127913981526540288
13-05-2019 12:30:23
495 Tweets
2.2K Followers
553 Following
Sharing the slides from yesterday's talk about "Fine Tuning with TRL" from the Together AI x Hugging Face workshop we hosted in our Paris office 🎃!
On-policy distillation is powerful, but Thinking Machines' Tinker only supports distilling from a teacher model within the same family, making it impossible for Qwen to learn from DeepSeek, gpt-oss, etc. For the first time, we enabled model-agnostic distillation natively using