Fazl Barez (@fazlbarez)'s Twitter Profile
Fazl Barez

@fazlbarez

Making AI safe one Google doc at a time | Let's build AIs we can trust!

ID: 1341019917005537280

Link: https://fbarez.github.io · Joined: 21-12-2020 13:57:26

464 Tweets

1.1K Followers

729 Following

Maheep Chaudhary | महीप चौधरी💡 (@chaudharymaheep)'s Twitter Profile Photo

Do apply!!! Over the past few months, I have had the pleasure of working with Fazl, and it has been a highly productive experience. He is currently accepting students at Oxford for this summer.

Tingchen Fu (seek for PhD at 25 fall) (@tingchenfu)'s Twitter Profile Photo

1/🧵 New research alert 🚨: "Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models" (Paper: arxiv.org/pdf/2505.14810).

We found a surprising paradox between instruction-following ability and reasoning ability. Here’s why ↓
Rohan Paul (@rohanpaul_ai)'s Twitter Profile Photo

LLMs can produce harmful content, sometimes triggered by specific inputs or through hidden vulnerabilities.

Safety-Net monitors LLMs' internal states in real-time to predict harmful outputs before generation occurs.

It uses an unsupervised method, treating normal behavior as a
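
The tweet is cut off above, but the gist is an unsupervised monitor over the model's internal states. As a rough, hypothetical sketch of that general idea, and not the actual Safety-Net implementation, here is a minimal example where the hidden states, the detector choice (an IsolationForest), and the threshold are all stand-ins:

```python
# Hypothetical sketch: flag anomalous internal activations before generation.
# This only illustrates unsupervised monitoring of hidden states; it is not
# the Safety-Net method from the linked paper.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Stand-in for hidden states collected from benign prompts: (n_samples, hidden_dim).
normal_hidden_states = rng.normal(0.0, 1.0, size=(500, 64))

# Fit an unsupervised detector on "normal behavior" only.
detector = IsolationForest(contamination="auto", random_state=0)
detector.fit(normal_hidden_states)

def flag_before_generation(hidden_state: np.ndarray, threshold: float = 0.0) -> bool:
    """Return True if the current hidden state looks anomalous (potentially unsafe)."""
    score = detector.decision_function(hidden_state.reshape(1, -1))[0]
    return score < threshold  # lower scores mean more anomalous under IsolationForest

# Example: a shifted activation pattern, standing in for a jailbreak-like input.
suspicious_state = rng.normal(4.0, 1.0, size=64)
print(flag_before_generation(suspicious_state))  # likely True
```
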
Xin Cynthia Chen (@xincynthiachen)'s Twitter Profile Photo

🎉 Announcing our ICML2025 Spotlight paper: Learning Safety Constraints for Large Language Models

We introduce SaP (Safety Polytope) - a geometric approach to LLM safety that learns and enforces safety constraints in the LLM's representation space, with interpretable insights.
🧵
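
As a toy, hypothetical sketch of what a learned "safety polytope" over a representation space could look like (not the paper's actual SaP method; the facet directions, offsets, dimensions, and labels below are invented for illustration):

```python
# Hypothetical sketch: a safe region {x : A @ x <= b} in representation space,
# with one interpretable facet per constraint. Illustrative only, not SaP itself.
import numpy as np

class SafetyPolytope:
    """Safe set defined by learned half-space constraints A @ x <= b."""

    def __init__(self, facets: np.ndarray, offsets: np.ndarray, names: list[str]):
        self.A = facets      # (n_constraints, hidden_dim) facet normals
        self.b = offsets     # (n_constraints,) offsets
        self.names = names   # human-readable label per facet

    def violated_facets(self, x: np.ndarray) -> list[str]:
        """Names of the constraints that the representation x violates."""
        slack = self.A @ x - self.b
        return [name for name, s in zip(self.names, slack) if s > 0]

    def is_safe(self, x: np.ndarray) -> bool:
        return not self.violated_facets(x)

# Toy example in a 3-dimensional "representation space" with two made-up facets.
polytope = SafetyPolytope(
    facets=np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
    offsets=np.array([0.5, 0.5]),
    names=["harmful-content direction", "deception direction"],
)
print(polytope.violated_facets(np.array([0.9, 0.1, 0.0])))  # ['harmful-content direction']
```
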
David Krueger (@davidskrueger)'s Twitter Profile Photo

I will likely be looking for students at the University of Montreal / Mila to start January 2026. The deadline to apply is September 1, 2025. I will share more details later, but wanted to start getting it on people's radar!

David Duvenaud (@davidduvenaud)'s Twitter Profile Photo

It's hard to plan for AGI without knowing what outcomes are even possible, let alone good.  So we’re hosting a workshop!

Post-AGI Civilizational Equilibria: Are there any good ones?

Vancouver, July 14th

Featuring: Joe Carlsmith, Richard Ngo, Emmett Shear 🧵