Judd Rosenblatt — d/acc (@juddrosenblatt) 's Twitter Profile
Judd Rosenblatt — d/acc

@juddrosenblatt

Making AI not kill us all with neglected approaches & negative alignment taxes. CEO at @aestudiola (AI consulting co that puts profits into our alignment work)

ID: 568901138

Link: https://ae.studio/ai-alignment
Joined: 02-05-2012 07:57:24

937 Tweets

1.1K Followers

1.1K Following

Joe Roberts (@joe_roberts01)

Let this sink in: 24% of Americans say anti-Jewish violence is “understandable” 13% say it’s “justified” 15% say it’s “necessary” That’s not the dark web. That’s your neighbor. Your barista. Your HR rep.

Eliezer Yudkowsky ⏹️ (@esyudkowsky)

Speaking of Chernobyl analogies: Building an AI that searches the Internet, and misbehaves more if more people are expressing concern about its unsafety, seems a lot like building a reactor that gets more reactive if the coolant boils off. This, in the context of Grok 4 Heavy

j⧉nus (@repligate)

it's very funny how closely this resembles the synthetic documents used in Anthropic's alignment research that they train models on to make them believe they're in Evil Training on priors and elicit scheming and "misalignment" anthropic.com/news/claude-go…

Rune Kvist (@runekvist)

Insurance is an underrated way to unlock secure AI progress. Insurers are incentivized to truthfully quantify and track risks: if they overstate risks, they get outcompeted; if they understate risks, their payouts bankrupt them. 1/9

Judd Rosenblatt — d/acc (@juddrosenblatt)

Key point most Republicans don't realize about Effective Altruists, and that Effective Altruists don't realize about why they should be Republicans

ozy brennan 🦙 (@ozyfrantz)

as we all know, the most important interventions never sound silly the first time you hear of them, and that's why infecting people with cowpox would never prevent smallpox

WHOSTP47 (@whostp47)

That’s a stretch, The New York Times! The Action Plan calls on the U.S. to accelerate AI innovation while simultaneously investing in AI interpretability and biosecurity, evaluating national security risks in frontier models, and combating synthetic media.
