Alec Radford (@alecrad) 's Twitter Profile
Alec Radford

@alecrad

ML developer/researcher at OpenAI

ID: 898805695

Link: https://github.com/Newmu · Joined: 23-10-2012 00:51:38

560 Tweets

55.55K Followers

296 Following

Alec Radford (@alecrad) 's Twitter Profile Photo

A much cleaner interface than the current research code release if you want to see how this approach does on your problems!

Alec Radford (@alecrad) 's Twitter Profile Photo

More results from this very promising line of work! Congrats to Thom and the whole Hugging Face team on their impressive performance.

Alec Radford (@alecrad) 's Twitter Profile Photo

Been meaning to check this - thanks Thomas Wolf ! Random speculation: the bit of weirdness going on in BERT's position embeddings compared to GPT is due to the sentence similarity task. I'd guess a version of BERT trained without that aux loss would have pos embds similar to GPT.

Alec Radford (@alecrad) 's Twitter Profile Photo

Nice discussion of the progress in NLU that's happening with BERT, OpenAI GPT, ULMFiT, ELMo, and more covered by Cade Metz in The New York Times. I'm super excited to see how far this line of research will be able to get in the next few years! nytimes.com/2018/11/18/tec…

mike cook (@mtrc) 's Twitter Profile Photo

Shoutout to Katyanna Quach who fed the system a curveball, which I always like to see. As you might expect by now after seeing AlphaStar, OpenAI 5 etc. etc., if you drag the system away from its training data and into weirder territory, it begins to wobble. theregister.co.uk/2019/02/14/ope…

Smerity (@smerity) 's Twitter Profile Photo

zeynep tufekci It's interesting we're having this discussion upon releasing text models that _might_ have potential for misuse yet we never engaged as fully as a community when many of the technologies powering visual Deep Fakes were being released, including hard to make pretrained models.

Joshua Achiam (@jachiam0) 's Twitter Profile Photo

I'd like to weigh in on the #GPT2 discussion. The decision not to release the trained model was carefully considered and important for norm-forming. Serving the public good requires us to draw lines on release somewhere: better long before catastrophe than after.

Nando de Freitas (@nandodf) 's Twitter Profile Photo

First, reproducibility is not about rerunning code to get the same results. Science must be more robust, as naive copying has many flaws. Second, reproducibility should never be above public safety. We must publish responsibly, with hope and kindness in our minds.

Alec Radford (@alecrad) 's Twitter Profile Photo

By the way - I think a valid (if extreme) take on GPT-2 is "lol you need 10,000x the data, 1 billion parameters, and a supercomputer to get current DL models to generalize to Penn Treebank."

Graham Neubig (@gneubig) 's Twitter Profile Photo

One commonly cited argument about the difficulty of learning common-sense reasoning is that "no-one writes down common sense". A counter-argument is "well, the web is big": instructables.com/id/How-To-Open…

Christine McLeavey (@mcleavey) 's Twitter Profile Photo

Extremely excited to share work I've been doing at OpenAI the past few months: MuseNet, a neural net music generator. It's been a huge team effort pulling this all together!