Andrej Karpathy (@karpathy) Twitter Tweets • TwiCopy

Andrej Karpathy

@karpathy


Building @EurekaLabsAI. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets 🧠🤖💥

ID: 33836629

Link: https://karpathy.ai

Joined: 21-04-2009 06:49:15

9,9K Tweet

1,2M Followers

972 Following

Andrej Karpathy

@karpathy

4 months ago

Imagine you do 1 hour of intellectually difficult work just to learn that your grade is 0.32 lol

Likes: 4,4K

Replies: 156

Retweets: 130


Andrej Karpathy

@karpathy

4 months ago

LLMs are chmod a+w artifacts yay

Likes: 3,3K

Replies: 162

Retweets: 185


So so so cool. Llama 1B batch one inference in one single CUDA kernel, deleting synchronization boundaries imposed by breaking the computation into a series of kernels called in sequence. The *optimal* orchestration of compute and memory is only achievable in this way.

Likes: 2,2K

Replies: 63

Retweets: 299


Andrej Karpathy

@karpathy

3 months ago

An attempt to explain (current) ChatGPT versions. I still run into many, many people who don't know that: - o3 is the obvious best thing for important/hard things. It is a reasoning model that is much stronger than 4o and if you are using ChatGPT professionally and not using o3

Likes: 11,11K

Replies: 558

Retweets: 1,1K


Andrej Karpathy

@karpathy

3 months ago

Very impressed with Veo 3 and all the things people are finding on r/aivideo etc. Makes a big difference qualitatively when you add audio. There are a few macro aspects to video generation that may not be fully appreciated: 1. Video is the highest bandwidth input to brain. Not

Likes: 6,6K

Replies: 311

Retweets: 682


Andrej Karpathy

@karpathy

3 months ago

Products with extensive/rich UIs lots of sliders, switches, menus, with no scripting support, and built on opaque, custom, binary formats are ngmi in the era of heavy human+AI collaboration. If an LLM can't read the underlying representations and manipulate them and all of the

Likes: 5,5K

Replies: 334

Retweets: 613


Andrej Karpathy

@karpathy

3 months ago

Making slides manually feels especially painful now that you know Cursor for slides should exist but doesn’t.

Likes: 12,12K

Replies: 983

Retweets: 570


Andrej Karpathy

@karpathy

3 months ago

My sleep scores during recent travel were in the 90s. Now back in SF I am consistently back down to 70s, 80s. I am increasingly convinced that this is due to traffic noise from a nearby road/intersection where I live - every ~10min, a car, truck, bus, or motorcycle with a very

Likes: 11,11K

Replies: 1,1K

Retweets: 761


Andrej Karpathy

@karpathy

3 months ago

🥹

Likes: 4,4K

Replies: 139

Retweets: 341


Andrej Karpathy

@karpathy

3 months ago

Congrats to Simon Willison on 23 years (!!) of blogging. Really excellent LLM blog, I sub & read everything: simonwillison.net (e.g. I sub via RSS/Atom on NetNewsWire) + If you consistently enjoy the content like I do, sponsor on GitHub: github.com/sponsors/simonw

Likes: 5,5K

Replies: 74

Retweets: 466


Andrej Karpathy

@karpathy

3 months ago

Pleasure to come by the YC AI Startup School today! I'm told the recordings will be up "in the coming weeks", I'll link to it then and include the slides. Thank you YC for organizing and bringing together an awesome group of builders! events.ycombinator.com/ai-sus Fun fact is that

Likes: 3,3K

Replies: 79

Retweets: 298


Andrej Karpathy

@karpathy

3 months ago

Part 2 of this mystery. Spotted on reddit. In my test not 100% reproducible but still quite reproducible. 🤔

Likes: 9,9K

Replies: 1,1K

Retweets: 768


Andrej Karpathy

@karpathy

3 months ago

Cool demo of a GUI for LLMs! Obviously it has a bit silly feel of a “horseless carriage” in that it exactly replicates conventional UI in the new paradigm, but the high level idea is to generate a completely ephemeral UI on demand depending on the specific task at hand.

Likes: 4,4K

Replies: 148

Retweets: 446


Andrej Karpathy

@karpathy

3 months ago

Mildly obsessed with what the "highest grade" pretraining data stream looks like for LLM training, if 100% of the focus was on quality, putting aside any quantity considerations. Guessing something textbook-like content, in markdown? Or possibly samples from a really giant model?

Likes: 1,1K

Replies: 199

Retweets: 89
