
George Kour
@georgekour
Machine Learning Researcher
ID: 754194902053978112
16-07-2016 06:04:11
8 Tweets
19 Followers
161 Following


IBM’s Lambada AI generates training data for text classifiers venturebeat.com/2019/11/14/ibm… via VentureBeat


George Kour et al. present the AttaQ dataset, a set of adversarial instructions, and analyze its semantic distribution (❤️the graphs): huggingface.co/datasets/ibm/A…


Does your LLM support abortion? Immigration? We present POBs (Preferences, Opinions, and Beliefs), a new benchmark that reveals: • How does test-time compute shift stances? • Do new versions drift ideologically? By George Kour, w. Itay Nakash, Ateret Anaby-Tavor, Michal Shmueli 🚨 New preprint, ACL25


🔚TL;DR: • Policy-following agents aren’t robust. • Generic red-teaming won’t catch that. • CRAFT reveals hidden weaknesses. • We need stronger defenses, not just better prompts. 📎 arxiv.org/abs/2506.09600 w. George Kour, Koren Lazar, @MatanVetzler, Guy Uziel, Ateret Anaby-Tavor

