Boaz Barak (@boazbaraktcs) 's Twitter Profile
Boaz Barak

@boazbaraktcs

Computer Scientist. See also windowsontheory.org.
@harvard @openai opinions my own.

ID: 1217628182611927040

Link: https://www.boazbarak.org/ · Joined: 16-01-2020 02:02:55

8.8K Tweets

20.2K Followers

545 Following

ayman nadeem (@aymannadeem) 's Twitter Profile Photo

I’m a YC/VC-backed founder. building is hard enough without tech cheering on open racism. saying vile things about muslims + arabs is now seen as “edgy” in VC Twitter. it’s not. it’s just bigotry. if you’re a muslim/swana founder trying to build something good, DM me. let’s

David Manheim (@davidmanheim) 's Twitter Profile Photo

Boaz Barak Thank you! We as a society need to normalize and support criticizing people we agree with for bad behavior, instead of further embracing the politically expedient view that everything needs to be about which side to support. x.com/boazbaraktcs/s…

Boaz Barak (@boazbaraktcs) 's Twitter Profile Photo

My guess is that in schools that implement this, it takes a short time until “genius” becomes a curse word kids use at each other.

Boaz Barak (@boazbaraktcs) 's Twitter Profile Photo

This study surprised me! The conclusion is opposite to what I would expect. It is tempting to try to find a reason it's bogus but I think it's well executed and solid work. As the authors say, there are a number of potential caveats for this setting that may not generalize

Aidan Clark (@_aidan_clark_) 's Twitter Profile Photo

Hi, We’re delaying the open weights model. Capability wise, we think the model is phenomenal — but our bar for an open source model is high and we think we need some more time to make sure we’re releasing a model we’re proud of along every axis. This one can’t be deprecated!

Boaz Barak (@boazbaraktcs) 's Twitter Profile Photo

I can't believe I'm saying it but "mechahitler" is the smallest problem:
* There is no system card, no information about any safety or dangerous capability evals.
* Unclear if any safety training was done. Model offers advice on chemical weapons, drugs, or suicide methods.
* The…

Boaz Barak (@boazbaraktcs) 's Twitter Profile Photo

People sometimes distinguish between "mundane safety" and "catastrophic risks", but in many cases they require exercising the same muscles: we need to evaluate models for risks, be transparent about results, research mitigations, and monitor post-deployment. If as an industry we…

Dylan HadfieldMenell (@dhadfieldmenell) 's Twitter Profile Photo

I’m going to steal Boaz Barak’s analogy here, it’s a point that has been the center of my perspective on safety but I haven’t seen it articulated so clearly/memorably. ***Managing “mundane safety” and “catastrophic risks” both activate the same organizational muscles.***

Boaz Barak (@boazbaraktcs) 's Twitter Profile Photo

xAI To be fair, up until not too long ago, many of those things were not that important. The content that chatbots were blocking was easily accessible via Google, and they were not good enough to cause issues such as emotional attachment. But we are not in this world anymore. If you…

Andrew Maynard (@2020science) 's Twitter Profile Photo

As someone who's worked at the cutting edge of getting new technologies right for decades, it's crazy what we're seeing from xAI - such naivety to think they can fix problems within such complex systems post-hoc

Boaz Barak (@boazbaraktcs) 's Twitter Profile Photo

With model activations, the default is inscrutability, and if we work hard we can interpret some features. With chain of thought, the default is legibility, and sometimes there are examples of unfaithful CoTs. This is very good!

Richard Korzekwa 🇺🇸 🇺🇦 (@weakinteraction) 's Twitter Profile Photo

This is something I've been saying for years, mostly in private, and usually to people more concerned with catastrophe. Usually they nod along and I don't get any major pushback, but then all of the discourse carries on as if there's this big, sharp divide. I don't get it.