
Linying Lv
@linyinglyu
Visiting Finance PhD student at Washington University in St Louis
ID: 1678142542653661184
09-07-2023 20:42:23
2 Tweets
21 Followers
150 Following

nanoGPT speedrun: Nice work from Keller Jordan adapting the nanoGPT/llmc PyTorch training code into a benchmark that trains a 124M-parameter Transformer to a fixed validation loss target. The current SOTA is 3.8X more token-efficient (2.7B vs. 10B tokens).
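
As a rough check on the token-efficiency figure, the ratio can be recomputed from the two token counts quoted in the tweet. This is only a sketch: the variable and function names are made up for illustration, and the inputs are the rounded figures from the tweet (10B and 2.7B). Those rounded values give a ratio of about 3.7, so the quoted 3.8X presumably rests on unrounded token counts that the tweet does not state.

```python
# Token counts as quoted in the tweet (rounded figures, not exact run logs).
baseline_tokens = 10e9  # tokens used by the baseline training run
sota_tokens = 2.7e9     # tokens used by the current record run


def token_efficiency_gain(baseline: float, improved: float) -> float:
    """Return how many times fewer tokens the improved run needs."""
    return baseline / improved


gain = token_efficiency_gain(baseline_tokens, sota_tokens)
print(f"{gain:.2f}x fewer tokens to reach the same validation loss target")
```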
