Boaz Barak (@boazbaraktcs) 's Twitter Profile
Boaz Barak

@boazbaraktcs

Computer Scientist. See also windowsontheory.org.
@harvard @openai opinions my own.

ID: 1217628182611927040

Link: https://www.boazbarak.org/ · Joined: 16-01-2020 02:02:55

8.8K Tweets

20.2K Followers

545 Following

ayman nadeem (@aymannadeem) 's Twitter Profile Photo

I’m a YC/VC-backed founder. building is hard enough without tech cheering on open racism. saying vile things about muslims + arabs is now seen as “edgy” in VC Twitter. it’s not. it’s just bigotry. if you’re a muslim/swana founder trying to build something good, DM me. let’s

David Manheim (@davidmanheim) 's Twitter Profile Photo

Boaz Barak Thank you! We as a society need to normalize and support criticizing people we agree with for bad behavior, instead of further embracing the politically expedient view that everything needs to be about which side to support. x.com/boazbaraktcs/s…

Boaz Barak (@boazbaraktcs) 's Twitter Profile Photo

My guess is that in schools that implement this, it takes a short time until “genius” becomes a curse word kids use at each other.

Boaz Barak (@boazbaraktcs) 's Twitter Profile Photo

This study surprised me! The conclusion is opposite to what I would expect. It is tempting to try to find a reason it's bogus but I think it's well executed and solid work. As the authors say, there are a number of potential caveats for this setting that may not generalize

Aidan Clark (@_aidan_clark_) 's Twitter Profile Photo

Hi, We’re delaying the open weights model. Capability wise, we think the model is phenomenal — but our bar for an open source model is high and we think we need some more time to make sure we’re releasing a model we’re proud of along every axis. This one can’t be deprecated!

Boaz Barak (@boazbaraktcs) 's Twitter Profile Photo

I can't believe I'm saying it but "mechahitler" is the smallest problem:
* There is no system card, no information about any safety or dangerous capability evals.
* Unclear if any safety training was done. Model offers advice on chemical weapons, drugs, or suicide methods.
* The…

Boaz Barak (@boazbaraktcs) 's Twitter Profile Photo

People sometimes distinguish between "mundane safety" and "catastrophic risks", but in many cases they require exercising the same muscles: we need to evaluate models for risks, be transparent about results, research mitigations, and monitor post-deployment. If as an industry we…

Dylan HadfieldMenell (@dhadfieldmenell) 's Twitter Profile Photo

I’m going to steal Boaz Barak’s analogy here, it’s a point that has been the center of my perspective on safety but I haven’t seen it articulated so clearly/memorably. ***Managing “mundane safety” and “catastrophic risks” both activate the same organizational muscles.***

Boaz Barak (@boazbaraktcs) 's Twitter Profile Photo

xAI To be fair, up until not too long ago, many of those things were not that important. The content that chatbots were blocking was easily accessible via Google, and they were not good enough to cause issues such as emotional attachment. But we are not in this world anymore. If you…

Andrew Maynard (@2020science) 's Twitter Profile Photo

As someone who's worked at the cutting edge of getting new technologies right for decades, it's crazy what we're seeing from xAI - such naivety to think they can fix problems within such complex systems post-hoc

Boaz Barak (@boazbaraktcs) 's Twitter Profile Photo

With model activations, the default is inscrutability, and if we work hard we can interpret some features. With chain of thought, the default is legibility, and sometimes there are examples of unfaithful CoTs. This is very good!

Richard Korzekwa 🇺🇸 🇺🇦 (@weakinteraction) 's Twitter Profile Photo

This is something I've been saying for years, mostly in private, and usually to people more concerned with catastrophe. Usually they nod along and I don't get any major pushback, but then all of the discourse carries on as if there's this big, sharp divide. I don't get it.