Andrej Karpathy (@karpathy)'s Twitter Profile
Andrej Karpathy

@karpathy

Building @EurekaLabsAI. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets 🧠🤖💥

ID: 33836629

Link: https://karpathy.ai

Joined: 21-04-2009 06:49:15

9.9K Tweets

1.2M Followers

972 Following

So so so cool. Llama 1B batch one inference in one single CUDA kernel, deleting synchronization boundaries imposed by breaking the computation into a series of kernels called in sequence. The *optimal* orchestration of compute and memory is only achievable in this way.

An attempt to explain (current) ChatGPT versions.

I still run into many, many people who don't know that:
- o3 is the obvious best thing for important/hard things. It is a reasoning model that is much stronger than 4o and if you are using ChatGPT professionally and not using o3

Very impressed with Veo 3 and all the things people are finding on r/aivideo etc. Makes a big difference qualitatively when you add audio. There are a few macro aspects to video generation that may not be fully appreciated: 1. Video is the highest bandwidth input to brain. Not

Products with extensive/rich UIs lots of sliders, switches, menus, with no scripting support, and built on opaque, custom, binary formats are ngmi in the era of heavy human+AI collaboration. If an LLM can't read the underlying representations and manipulate them and all of the

My sleep scores during recent travel were in the 90s. Now back in SF I am consistently back down to 70s, 80s. I am increasingly convinced that this is due to traffic noise from a nearby road/intersection where I live - every ~10min, a car, truck, bus, or motorcycle with a very

Congrats to Simon Willison on 23 years (!!) of blogging. Really excellent LLM blog, I sub & read everything: simonwillison.net (e.g. I sub via RSS/Atom on NetNewsWire) +If you consistently enjoy the content like I do, sponsor on GitHub: github.com/sponsors/simonw

Pleasure to come by the YC AI Startup School today! I'm told the recordings will be up "in the coming weeks", I'll link to it then and include the slides. Thank you YC for organizing and bringing together an awesome group of builders! events.ycombinator.com/ai-sus Fun fact is that

Cool demo of a GUI for LLMs! Obviously it has a bit silly feel of a “horseless carriage” in that it exactly replicates conventional UI in the new paradigm, but the high level idea is to generate a completely ephemeral UI on demand depending on the specific task at hand.

Mildly obsessed with what the "highest grade" pretraining data stream looks like for LLM training, if 100% of the focus was on quality, putting aside any quantity considerations. Guessing something textbook-like content, in markdown? Or possibly samples from a really giant model?