
Evan Hubinger
@evanhub
Head of Alignment Stress-Testing @AnthropicAI. Opinions my own. Previously: MIRI, OpenAI, Google, Yelp, Ripple. (he/him/his)
ID: 138923554
https://www.alignmentforum.org/users/evhub 01-05-2010 01:28:15
480 Tweets
6.6K Followers
2.2K Following

We conducted, for the first time, a pre-deployment alignment audit of a new model. See Sam Bowman's thread for some object-level takeaways about Opus. In this thread, I'll discuss some higher-level takeaways about why I think this alignment audit was useful.

Here's what Dario Amodei said about President Trump’s megabill that would ban state-level AI regulation for 10 years: wired.com/story/anthropi…