Markus Anderljung (@manderljung)'s Twitter Profile
Markus Anderljung

@manderljung

Trying to design good AI policy. Director of Policy & Research @GovAI_, Adjunct Fellow @CNASdc

ID: 381501277

Joined: 28-09-2011 13:00:40

1.1K Tweets

2.2K Followers

860 Following

Markus Anderljung (@manderljung):

This is an excellent piece. I really recommend people read it. Well done, Peter! It lays out the case for a more supervision-style approach to frontier AI regulation, drawing inspiration from the financial sector and general trends in regulatory practice over many decades.

Centre for the Governance of AI (GovAI) (@govai_):

We're hiring a Research Management Associate to help run our Seasonal Fellowships in London! Help shape the next generation of AI governance researchers, with potential to become a Research Manager. Visa sponsorship available. Apply by Sunday 9 March 2025. bit.ly/3QoP4fa
Markus Anderljung (@manderljung):

What the public thinks about AI really matters. Dr Noemi Dreksler and colleagues recently put together the most comprehensive review of the literature out there.

Markus Anderljung (@manderljung):

Interesting! When they trained AI systems to have nice chains of thought, not to think about cheating on a test, the systems ended up hiding their intent rather than learning not to cheat.

Instead, it may be better to let the system be "free" in its CoT and monitor it for intentions to cheat.
Shayne Longpre (@shayneredford):

What are 3 concrete steps that can improve AI safety in 2025? 🤖⚠️

Our new paper, “In House Evaluation is Not Enough” has 3 calls-to-action to empower independent evaluators:

1️⃣ Standardized AI flaw reports
2️⃣ AI flaw disclosure programs + safe harbors
3️⃣ A coordination
Markus Anderljung (@manderljung):

Frontier AI regulation faces a challenge: even if an upstream developer ensures their model doesn't pose unacceptable risks, downstream developers may be able to modify models to undo their efforts. In a new paper, we offer potential ways to address the challenge.

Markus Anderljung (@manderljung):

Kudos to Anthropic for putting out these kinds of posts. Policymakers need to know the extent to which risks from frontier models are increasing.

Centre for British Progress (@britishprogress):

💫 We’re launching the Centre for British Progress

Our founding essay, Rediscovering British Progress, is a case for growth that drives shared progress, rooted in Britain's values and industrial heritage.

It all starts with a postcard from 1870 👇

britishprogress.org/articles/redis…
Dr Noemi Dreksler (@noemidreksler):

Thrilled to share our new Brookings Institution piece on how the public perceives AI and its governance implications. Check it out here: brookings.edu/articles/what-…

Lennart Heim (@ohlennart):

Some claim if the US controls AI chips, countries will immediately turn to China to backfill. Critics rightly identify this as a crucial consideration—but I don't think it's an immediate and strong threat. Here's why: 1/

Markus Anderljung (@manderljung):

To design sensible frontier AI governance, we need to know how many frontier models there are and will be. This is the most thorough public analysis I've seen asking that question, trying to forecast how many models there will be at different FLOP levels by 2028.

Markus Anderljung (@manderljung):

Lots of interesting content in the Claude 4 System Card. I'd recommend having a browse through: www-cdn.anthropic.com/4263b940cabb54…