
Guy Davidson
@guyd33
PhD @NYUDataScience, visiting researcher @AIatMeta, interested in AI & CogSci, specifically in goals and their representations in minds and machines (he/him).
ID: 1117859056817823745
https://guydavidson.me 15-04-2019 18:35:42
925 Tweets
968 Followers
1.1K Following

Fantastic new work by John (Yueh-Han) Chen (with Brenden Lake and me trying not to cause too much trouble). We study systematic generalization in a safety setting and find LLMs struggle to consistently respond safely when we vary how we ask naive questions. More fun analyses in the paper!

Our paper "Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video" received an Oral at the Mechanistic Interpretability for Vision Workshop at CVPR 2025! 🎉 We’ll be in Nashville next week. Come say hi 👋 #CVPR2025

We've been using smile to develop behavioral web experiments in the lab for the last year+. Everything from the simplest survey-like judgment collections to complex game-like designs (e.g., exps.gureckislab.org/e/laugh-melted…) is easier to develop and deploy. Consider it for your next exp!