Markus Anderljung (@manderljung)'s Twitter Profile
Markus Anderljung

@manderljung

Trying to design good AI policy. Director of Policy & Research @GovAI_, Adjunct Fellow @CNASdc

ID: 381501277

Joined: 28-09-2011 13:00:40

1.1K Tweets

2.2K Followers

860 Following

Markus Anderljung (@manderljung):

This is an excellent piece. I really recommend people read it. Well done, Peter! It lays out the case for a more supervision-style approach to frontier AI regulation, drawing inspiration from the financial sector and general trends in regulatory practice over many decades.

Centre for the Governance of AI (GovAI) (@govai_):

We're hiring a Research Management Associate to help run our Seasonal Fellowships in London! Help shape the next generation of AI governance researchers, with potential to become a Research Manager. Visa sponsorship available. Apply by Sunday 9 March 2025. bit.ly/3QoP4fa
Markus Anderljung (@manderljung):

What the public thinks about AI really matters. Dr Noemi Dreksler and colleagues recently put together the most comprehensive review of the literature out there.

Markus Anderljung (@manderljung):

Interesting! When they trained AI systems to have nice chains of thought, not to think about cheating on a test, the systems ended up hiding their intent rather than learning not to cheat.

Instead, it may be better to let the system be "free" in its CoT and monitor it for intentions to cheat.
Shayne Longpre (@shayneredford):

What are 3 concrete steps that can improve AI safety in 2025? 🤖⚠️

Our new paper, “In House Evaluation is Not Enough” has 3 calls-to-action to empower independent evaluators:

1️⃣ Standardized AI flaw reports
2️⃣ AI flaw disclosure programs + safe harbors
3️⃣ A coordination
Markus Anderljung (@manderljung):

Frontier AI regulation faces a challenge: even if an upstream developer ensures their model doesn't pose unacceptable risks, downstream developers may be able to modify models to undo their efforts. In a new paper, we offer potential ways to address the challenge.

Markus Anderljung (@manderljung):

Kudos to Anthropic for putting out these kinds of posts. Policymakers need to know the extent to which risks from frontier models are increasing.

Centre for British Progress (@britishprogress):

💫 We’re launching the Centre for British Progress

Our founding essay, Rediscovering British Progress, is a case for growth that drives shared progress, rooted in Britain's values and industrial heritage.

It all starts with a postcard from 1870 👇

britishprogress.org/articles/redis…
Dr Noemi Dreksler (@noemidreksler):

Thrilled to share our new Brookings Institution piece on how the public perceives AI and its governance implications. Check it out here: brookings.edu/articles/what-…

Lennart Heim (@ohlennart):

Some claim if the US controls AI chips, countries will immediately turn to China to backfill. Critics rightly identify this as a crucial consideration—but I don't think it's an immediate and strong threat. Here's why: 1/

Markus Anderljung (@manderljung):

To design sensible frontier AI governance, we need to know how many frontier models there are and will be. This is the most thorough public analysis I've seen asking that question, trying to forecast how many models there will be at different FLOP levels by 2028.

Markus Anderljung (@manderljung):

Lots of interesting content in the Claude 4 System Card. I'd recommend having a browse through: www-cdn.anthropic.com/4263b940cabb54…