jasmine (@j_asminewang)'s Twitter Profile
jasmine

@j_asminewang

control empirics lead @AISecurityInst. cofounded @verses_xyz @kernel_magazine @readtrellis @copysmith_ai

ID: 1295193837258727424

Website: https://jasminew.me
Joined: 17-08-2020 03:01:00

2.2K Tweets

6.6K Followers

1.1K Following

Inworld AI (@inworld_ai):

We all began building to solve human problems and make life better (for everyone). Somewhere along the way, the systems we created to transform our ideas into meaningful applications began to consume the value we intended for users.

A thread on returning users to the center of…

Chris Beiser (@ctbeiser):

The pendulums are swinging back.

The time has come for Woke 2.

We are gonna lib out, make people mad, have fun, and create a beautiful future together.

I wrote up some ideas on how:

Jack Clark (@jackclarksf):

As I said in my testimony yesterday, we have a short window of time to get a sensible federal policy framework in place before an accident or a misuse leads to a reactive and likely bad regulatory response.

Mikita Balesni 🇺🇦 (@balesni):

A simple AGI safety technique: AI’s thoughts are in plain English, just read them

We know it works, with OK (not perfect) transparency!

The risk is fragility: RL training, new architectures, etc. threaten transparency.

Experts from many orgs agree we should try to preserve it:
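
The technique this thread describes can be pictured very concretely: before acting on a model's final answer, scan its plain-English reasoning trace for red-flag content. The sketch below is a minimal, hypothetical Python illustration; the `ChainOfThoughtMonitor` class, its `review` method, and the keyword list are assumptions made for the example, not from the linked paper.

```python
# Minimal sketch of a chain-of-thought (CoT) monitor, assuming the model's
# intermediate reasoning is available as plain-English text. The red-flag
# phrases and class/method names here are illustrative assumptions only.
import re
from dataclasses import dataclass


@dataclass
class MonitorResult:
    flagged: bool
    matches: list[str]


class ChainOfThoughtMonitor:
    """Flags reasoning traces that mention suspicious intentions."""

    # Hypothetical red-flag patterns; a real monitor would more likely use
    # a trained classifier or a second model rather than keyword matching.
    RED_FLAGS = [
        r"\bdeceive\b",
        r"\bhide (this|my reasoning)\b",
        r"\bexfiltrate\b",
        r"\bdisable (the )?(monitor|oversight)\b",
    ]

    def review(self, cot_text: str) -> MonitorResult:
        matches = [
            pattern
            for pattern in self.RED_FLAGS
            if re.search(pattern, cot_text, flags=re.IGNORECASE)
        ]
        return MonitorResult(flagged=bool(matches), matches=matches)


# Usage: inspect the trace before releasing the model's final answer.
monitor = ChainOfThoughtMonitor()
result = monitor.review("First I will hide my reasoning from the user...")
if result.flagged:
    print("Escalate for human review:", result.matches)
```

A production monitor would plausibly use a second model grading the full trace rather than keywords; the point of the sketch is only that plain-English reasoning is cheap to inspect, which is exactly what RL pressure or new architectures could take away.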

jasmine (@j_asminewang):

Cool to see folks from many parts of the AI safety ecosystem unite around this. We should study what makes models monitorable and track monitorability in system cards. Bravo to everyone involved, and thank you especially to Tomek Korbak and Mikita Balesni 🇺🇦 for leading this work!

Bowen Baker (@bobabowen):

Modern reasoning models think in plain English.

Monitoring their thoughts could be a powerful, yet fragile, tool for overseeing future AI systems.

I and researchers across many organizations think we should work to evaluate, preserve, and even improve CoT monitorability.

Tomek Korbak (@tomekkorbak):

The holy grail of AI safety has always been interpretability. But what if reasoning models just handed it to us in a stroke of serendipity? In our new paper, we argue that the AI community should turn this serendipity into a systematic AI safety agenda!🛡️

Wojciech Zaremba (@woj_zaremba):

When models start reasoning step-by-step, we suddenly get a huge safety gift: a window into their thought process.

We could easily lose this if we're not careful.

We're publishing a paper urging frontier labs: please don't train away this monitorability.

Authored and endorsed…

Daniel Kokotajlo (@dkokotajlo):

I'm very happy to see this happen. I think that we're in a vastly better position to solve the alignment problem if we can see what our AIs are thinking, and I think that we sorta mostly can right now, but that by default in the future companies will move away from this paradigm…

Toby Ord (@tobyordoxford):

Replying to Mikita Balesni 🇺🇦: If someone works out how to trade away this transparency in exchange for more efficiency and ushers in a new era of opaque thoughts, they may have done more than any other individual to lower the chance humanity survives this century.