Victor Veitch 🔸 (@victorveitch)'s Twitter Profile

AI | University of Chicago / Google DeepMind

ID: 1400175774

Website: http://victorveitch.com
Joined: 03-05-2013 16:54:37

1.1K Tweets

4.4K Followers

1.1K Following

Tyler John (@tyler_m_john)

I really like this new op ed from David Duvenaud on how so many different kinds of pressures could drive towards loss of human control over AI. It's rare to read anything well written on this topic but this piece was elegant and smart enough that I wanted to keep on reading.

Zihao Wang (@wzihao12)

Secure LLMs must separate roles. Finetuning improves security benchmark scores, but do models really learn role separation? 🤔 Our paper reveals an 'Illusion of Role Separation'! 🧵 (1/N) #AISafety w Yibo Jiang Hubert Yoo metasec arxiv.org/pdf/2505.00626

Liv Boeree (@liv_boeree)

Two days ago I launched a donation matching challenge to fight against corporate torture of US farm animals. The first $50k has been filled, so I am extending the challenge to $75k to help pay for a second campaigner. Wondering what the hell I'm on about? WATCH THIS 👇

(((ل()(ل() 'yoav))))👾 (@yoavgo)

we write too much. more than we can read, and many small incremental things. i think there should be some mechanism to restrict paper submissions and acceptances per person per year, to force people to prioritize their best work, and invest more in it.

Robert Long (@rgblong)

The Eleos AI Research team conducted “welfare interviews” with Anthropic’s Claude Opus 4 about its potential moral status 💬—the first external welfare evaluation of a frontier model

This thread:
-interviews have clear limitations—but they're still worth doing
-what we found

Shashwat Goel (@shashwatgoel7)

Confused about recent LLM RL results where models improve without any ground-truth signal? We were too. Until we looked at the reported numbers of the Pre-RL models and realized they were severely underreported across papers. We compiled discrepancies in a blog below🧵👇

David Bau (@davidbau)

Dear MAGA friends, I have been worrying about STEM in the US a lot, because right now the Senate is writing new laws that cut 75% of the STEM budget in the US. Sorry for the long post, but the issue is really important, and I want to share what I know about it. The entire