tl;dr submit a training algorithm* that is faster** than Adam*** and win $10,000 💸🚀
*a set of hparams, self-tuning algorithm, and/or update rule
**see rules for how we measure speed
***beat all submissions; currently the best is NAdamW in wall-clock time and DistShampoo in steps
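To make the footnotes concrete, here is a loose Python sketch of the shape a submission takes: a set of hyperparameters plus an update rule, optionally with logic that tunes itself during training. The function names and signatures below (`init_optimizer_state`, `update_params`, `HPARAMS`) are simplified illustrations in the spirit of the AlgoPerf submission interface, not the exact API; see the official rules and repository for the real one.

```python
# Simplified sketch of the shape of a submission: a set of hparams plus an
# update rule. Signatures are approximate, not the exact AlgoPerf API.
import torch

HPARAMS = {"learning_rate": 1e-3, "beta1": 0.9, "beta2": 0.999,
           "weight_decay": 1e-2, "warmup_steps": 1000}  # the "set of hparams"

def init_optimizer_state(model_params, hparams):
    # Any state the update rule needs (here: a stock AdamW instance).
    return torch.optim.AdamW(model_params,
                             lr=hparams["learning_rate"],
                             betas=(hparams["beta1"], hparams["beta2"]),
                             weight_decay=hparams["weight_decay"])

def update_params(optimizer_state, model, batch, loss_fn, step, hparams):
    # One training step: this is the "update rule" the benchmark times.
    optimizer_state.zero_grad()
    loss = loss_fn(model(batch["inputs"]), batch["targets"])
    loss.backward()
    # A self-tuning submission could adapt its hparams here, e.g. a warmup schedule.
    for group in optimizer_state.param_groups:
        group["lr"] = hparams["learning_rate"] * min(1.0, (step + 1) / hparams["warmup_steps"])
    optimizer_state.step()
    return optimizer_state
```

Roughly speaking, a "self-tuning" submission is one where everything in HPARAMS is fixed or adapted internally, so no external hyperparameter search is run on its behalf.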
MLCommons #AlgoPerf results are in! 🏁
$50K prize competition yielded 28% faster neural net training with non-diagonal preconditioning beating Nesterov Adam. New SOTA for hyperparameter-free algorithms too! Full details in our blog. mlcommons.org/2024/08/mlc-al…
#AIOptimization #AI
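For readers wondering what "non-diagonal preconditioning" refers to: Adam-family optimizers (including NAdamW / Nesterov Adam) rescale each parameter coordinate independently, i.e. they apply a diagonal preconditioner, while Shampoo-family methods precondition whole weight matrices using row and column statistics. The toy NumPy sketch below contrasts the two updates for a single 2-D weight; it is an illustrative simplification, not the winning submission's code.

```python
# Toy sketch: Adam-style diagonal preconditioning vs. Shampoo-style
# non-diagonal (two-sided matrix) preconditioning for a 2-D gradient G.
import numpy as np

def diagonal_precondition(G, v, beta2=0.999, eps=1e-8):
    # Adam-style: running average of squared gradients per coordinate,
    # then rescale each coordinate independently (a diagonal preconditioner).
    v = beta2 * v + (1 - beta2) * G**2
    return G / (np.sqrt(v) + eps), v

def shampoo_precondition(G, L, R, eps=1e-4):
    # Shampoo-style: accumulate row and column statistics, then apply
    # inverse matrix roots on both sides (a non-diagonal preconditioner).
    L = L + G @ G.T          # left (row) statistics
    R = R + G.T @ G          # right (column) statistics
    def inv_quarter_root(M):
        # M^(-1/4) via eigendecomposition (fine for a toy example)
        w, Q = np.linalg.eigh(M + eps * np.eye(M.shape[0]))
        return Q @ np.diag(np.clip(w, eps, None) ** -0.25) @ Q.T
    return inv_quarter_root(L) @ G @ inv_quarter_root(R), L, R
```

For an m-by-n weight, the diagonal state v has shape (m, n) while L and R are m-by-m and n-by-n; practical Shampoo-style implementations recompute the matrix roots only every so many steps and factor the preconditioner per layer, which is what keeps the non-diagonal approach affordable at scale.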
Congratulations to everyone who submitted to the MLCommons AlgoPerf training algorithms competition! We were delighted to provide compute resources for evaluating so many exciting submissions.
Hi there! This account will post updates on the AlgoPerf benchmark and its leaderboard, which track faster neural network training via better training algorithms. But let's start with what AlgoPerf is, what we have done so far, and how you can train neural nets ~30% faster.
Lecture 11: benchmarking optimizers
1. the problem: comparing optimizers (sgd, adam, etc.) in deep learning is tricky.
2. challenge 1: defining "speed". curves cross, so use time-to-result.
3. challenge 2: the hyperparameter tuning trap. does the tuning protocol matter more than the algorithm? (choi et al.)