Lawrence Chan (@justanotherlaw) Twitter Tweets • TwiCopy

Lawrence Chan

@justanotherlaw

I do AI Alignment Research. Currently at @METR_Evals on leave from my PhD at UC Berkeley’s @CHAI_berkeley. Opinions are my own.

ID: 824308056351735809

Website: https://chanlawrence.me/

Joined: 25-01-2017 17:28:50

434 Tweets

1.1K Followers

158 Following

Putting aside what this means for ML automation or agency overhang for a second, I was really impressed by Tao Lin's work here. Working basically alone, and using ~$50k in token + compute costs and 4 weeks of total engineering time, he substantially advanced SOTA on

69 Likes

0 Replies

8 Retweets

METR

@metr_evals

5 months ago

When will AI systems be able to carry out long projects independently? In new research, we find a kind of “Moore’s Law for AI agents”: the length of tasks that AIs can do is doubling about every 7 months.

4.4K Likes

158 Replies

826 Retweets
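A rough sketch of the arithmetic behind the doubling claim above, assuming a clean exponential trend. The 7-month doubling time comes from the tweet; the function names and the 60-minute starting horizon are hypothetical, chosen only for illustration.

```python
import math


def task_horizon(initial_minutes: float, months_elapsed: float,
                 doubling_months: float = 7.0) -> float:
    """Task length an agent can handle after `months_elapsed` months,
    assuming the length doubles every `doubling_months` months."""
    return initial_minutes * 2 ** (months_elapsed / doubling_months)


def months_until(initial_minutes: float, target_minutes: float,
                 doubling_months: float = 7.0) -> float:
    """Months needed for the horizon to grow from initial to target length."""
    return doubling_months * math.log2(target_minutes / initial_minutes)


# Hypothetical starting point: a 60-minute task horizon today.
print(task_horizon(60, 14))    # horizon after two doubling periods
print(months_until(60, 480))   # time for an 8x increase (three doublings)
```

Under these assumptions, every extra 7 months multiplies the horizon by 2, so an 8x increase takes three doubling periods.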

Toby Ord

@tobyordoxford

3 months ago

Is there a half-life for the success rates of AI agents? I show that the success rates of AI agents on longer-duration tasks can be explained by an extremely simple mathematical model — a constant rate of failing during each minute a human would take to do the task. 🧵 1/

267 Likes

17 Replies

35 Retweets
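The constant-failure-rate model described above can be sketched as exponential decay: if an agent fails at a fixed rate per minute of human task time, its success probability halves every "half-life". This is a minimal illustration, not the author's code; the function names and the 60-minute half-life are assumptions.

```python
import math


def hazard_rate(half_life_minutes: float) -> float:
    """Constant per-minute failure rate implied by a given half-life."""
    return math.log(2) / half_life_minutes


def success_probability(task_minutes: float, half_life_minutes: float) -> float:
    """Chance of finishing a task that takes a human `task_minutes`,
    assuming a constant failure rate throughout the task."""
    return math.exp(-hazard_rate(half_life_minutes) * task_minutes)


# Hypothetical agent with a 60-minute half-life.
for minutes in (15, 60, 120, 240):
    print(minutes, round(success_probability(minutes, 60), 3))
```

With this form, doubling the task length squares the success probability, which is why success falls off quickly on long tasks.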

METR

@metr_evals

25 days ago

We ran a randomized controlled trial to see how much AI coding tools speed up experienced open-source developers. The results surprised us: Developers thought they were 20% faster with AI tools, but they were actually 19% slower when they had access to AI than when they didn't.

5.5K Likes

200 Replies

1.1K Retweets

david rein

@idavidrein

25 days ago

I was pretty skeptical that this study was worth running, because I thought that *obviously* we would see significant speedup. x.com/METR_Evals/sta…

1.1K Likes

29 Replies

122 Retweets

Chris Painter

@chrispainteryup

25 days ago

METR a few months ago had two projects going in parallel: a project experimenting with AI researcher interviews to track degree of AI R&D acceleration/delegation, and this project. When the results started coming back from this project, we put the survey-only project on ice.

85 Likes

2 Replies

9 Retweets