Edward Z. Yang (@ezyang)'s Twitter Profile
Edward Z. Yang

@ezyang

I work on PyTorch at Meta. Chatty alt at @difficultyang. Currently on parental leave and doing a lot of AI coding, including authoring codemcp.

ID: 14930686

Link: http://blog.ezyang.com · Joined: 28-05-2008 05:22:04

8.8K Tweets

12.12K Followers

1.1K Following

Sam Tobin-Hochstadt (@samth):

Over the past 10 days, I wrote a new Racket library for "expect testing", a style that Yaron (Ron) Minsky has advocated for in OCaml. github.com/samth/recspecs/ As an experiment, I built it entirely with Codex (OpenAI's async AI programming tool). I have some thoughts.
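For readers unfamiliar with the style: in expect testing, a test records the program's printed output inline in the test file, and the framework offers to rewrite that recorded snapshot when the output changes, instead of you hand-writing assertions. A minimal sketch of the idea in Python (a hypothetical helper, not the recspecs or OCaml expect_test API):

import contextlib
import io

def expect(f, expected: str, accept: bool = False) -> str:
    """Run f, capture its stdout, and compare it against the inline snapshot."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        f()
    actual = buf.getvalue()
    if actual != expected:
        if accept:
            # A real framework would rewrite the test file in place here,
            # replacing the stale snapshot with `actual`.
            return actual
        raise AssertionError(f"expected {expected!r}, got {actual!r}")
    return expected

# The snapshot starts out empty; running once in accept mode fills it in.
expect(lambda: print("hello"), expected="hello\n")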

Edward Z. Yang (@ezyang):

A lot of folks on Twitter are swearing by voice transcription, but personally I find it easier to think clearly when typing things out; I would literally prefer to type and speak simultaneously than to only speak

Edward Z. Yang (@ezyang):

Hi Twitter, what are the current SOTA OSS models that have reproducible training recipes? Pre-training or post-training both OK. EleutherAI/pythia is an old example in this space; infly/OpenCoder is a newer post-train one.
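Part of what makes Pythia a useful reproducibility baseline is that EleutherAI publishes the full training recipe along with intermediate checkpoints on the Hugging Face Hub as revisions. A small illustration of pulling one such checkpoint (model size and step number are just examples):

from transformers import AutoModelForCausalLM, AutoTokenizer

# Pythia exposes intermediate training checkpoints as Hub revisions;
# "step143000" is the final step for this particular run.
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/pythia-1.4b", revision="step143000"
)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-1.4b")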

Lucas Beyer (bl16) (@giffmana):

Oh wow, did you guys know that torch.compile can compile numpy code? And even run it on GPU? This is pretty neat for all kinds of "surrounding" code besides the model (like evals and fancy metrics) that I used to do with numba/numexpr (cuz CPU-XLA was pretty meh). Poll below

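This is the NumPy support that landed in torch.compile around PyTorch 2.1: inside a compiled function, np.* operations are traced and translated into torch ops, and per the PyTorch blog post on the feature, running the call under torch.device("cuda") executes the same NumPy code on GPU. A hedged sketch (assumes PyTorch 2.1+ and, for the GPU path, a CUDA build):

import numpy as np
import torch

@torch.compile
def pairwise_sq_dists(x, y):
    # Plain NumPy: (n, d) and (m, d) inputs -> (n, m) squared distances.
    return np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)

x = np.random.randn(128, 16).astype(np.float32)
y = np.random.randn(64, 16).astype(np.float32)

out = pairwise_sq_dists(x, y)      # compiled, runs on CPU
if torch.cuda.is_available():
    with torch.device("cuda"):     # same NumPy code, executed on the GPU
        out_gpu = pairwise_sq_dists(x, y)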
Edward Z. Yang (@ezyang):

I rarely say this, but it would actually be pretty useful to have a PL-style formulation in Greek of DTensor propagation rules
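To give a flavor of what such a formulation might look like (the notation here is entirely invented for illustration, not PyTorch's actual spec): write Γ ⊢ e : τ@σ for "e is a tensor of type τ with placement σ over the device mesh", and propagation rules become inference rules, e.g. in LaTeX:

% Pointwise ops preserve matching placements; resharding to Replicate
% corresponds to an all-gather. Judgment forms are illustrative only.
\[
\frac{\Gamma \vdash x : \tau @ \sigma \qquad \Gamma \vdash y : \tau @ \sigma}
     {\Gamma \vdash x + y : \tau @ \sigma}
\ \textsc{(Pointwise)}
\qquad
\frac{\Gamma \vdash x : \tau @ \mathrm{Shard}(k)}
     {\Gamma \vdash \mathrm{redistribute}(x, \mathrm{Replicate}) : \tau @ \mathrm{Replicate}}
\ \textsc{(AllGather)}
\]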

Edward Z. Yang (@ezyang):

Been spending some time with the GSPMD paper recently. It's funny seeing all the work on making convolution work; 2021 truly was a different era