
Usman Anwar
@usmananwar391
Deep Learning & AI Safety @Cambridge_uni
ID: 3623883973
http://uzman-anwar.github.io 20-09-2015 05:59:06
1.1K Tweets
678 Followers
1.1K Following


1/ Controlling LLMs with steering vectors is unreliable, but why? Our paper, "Understanding (Un)Reliability of Steering Vectors in Language Models," at the Foundation Models in the Wild workshop at #ICLR2025, investigates this! What did we find?

🤖 Calling all philosophers and AI researchers! Our team at UConn Computer Science & Engineering's RIET Lab is hosting a virtual workshop on Machine Ethics and Reasoning (MERe) on July 18, 2025. We're bringing together philosophy PhDs, CS researchers & AI folks to advance computational moral reasoning 🧵



