Hassan Sajjad (@hassaan84s)'s Twitter Profile
Hassan Sajjad

@hassaan84s

Associate Professor - Dalhousie University, Halifax, Canada
NLP, deep learning, explainable AI

ID: 415790425

Link: https://hsajjad.github.io/
Joined: 18-11-2011 20:32:42

358 Tweets

448 Followers

104 Following

Aman Jaiswal (@amanjaiswal81):

4/n 🚨Introducing SUGARCREPE++: a dataset to evaluate VLMs' and ULMs' sensitivity to lexical and semantic changes. Each sample includes:
- 🖼️ An image
- ✅🔄✅ Two semantically equivalent but lexically different captions
- ❌ One hard negative caption
Check out the examples 👇
Aman Jaiswal (@amanjaiswal81):

10/n 📖 Full paper 🔗 Explore our comprehensive findings!
Paper: arxiv.org/abs/2406.11171
Dataset: huggingface.co/datasets/Aman-…
Exciting collaboration with Sri Harsha Chandramouli S Sastry, Evangelos Milios, sageev Hassan Sajjad

Hassan Sajjad (@hassaan84s):

Despite having 100s of evaluation metrics to measure the progress of LLMs, they are still brittle to small changes in the input.
I am excited to share the SUGARCREPE++ benchmark, which evaluates LLMs' sensitivity to semantic and lexical alterations.

My take-home from the findings:
Hassan Sajjad (@hassaan84s):

It was a pleasure giving a keynote at #Repl4nlp #ACL2024 on Latent Space Exploration for Safe and Trustworthy AI. I look forward to seeing works targeting various challenges in understanding deep learning models' complex latent space and their applications to developing better

Hassan Sajjad (@hassaan84s):

Thanks a lot, Preslav Nakov, for hosting me at MBZUAI and for the opportunity to talk about some of the work we have been doing at Dalhousie University. It was a productive visit with many useful discussions to follow up on. #NLP

Hassan Sajjad (@hassaan84s):

I am excited to share our two papers on Safe and Trustworthy AI accepted at EMNLP 2024 (#EMNLP2024). Thanks to my awesome students and collaborators.

Latent Concept-based Explanation of NLP Models arxiv.org/pdf/2404.12545
Immunization against harmful fine-tuning attacks
Hassan Sajjad (@hassaan84s):

#NeurIPS2024 update: Two of our papers on model safety and evaluation have been accepted at the NeurIPS conference. #aisafety 

Representation noising effectively prevents harmful fine-tuning on LLMs
arxiv.org/pdf/2405.14577

SUGARCREPE++ Dataset: Vision-Language Model Sensitivity
Hassan Sajjad (@hassaan84s):

Our #EMNLP2024 talks on AI Safety and XAI are available online.

Immunization against harmful fine-tuning attacks
Talk: youtube.com/watch?v=N9im_V…
Paper: aclanthology.org/2024.findings-…

Latent Concept-based Explanation of NLP Models
Talk: youtube.com/watch?v=799Acf…
Paper:

Hassan Sajjad (@hassaan84s):

Dal @ #NeurIPS2024 - we have several presentations at NeurIPS on AI Safety, XAI and Diffusion models. Stop by to say hi!

Thu, Dec 12, 11:00 PST -- Poster Session 3 East
DiffAug: A Diffuse-and-Denoise Augmentation for Training Robust Classifiers arxiv.org/abs/2306.09192 Presented

Hassan Sajjad (@hassaan84s):

Stress in AI is Real! The death of Felix Hill, a DeepMind researcher, after several years of mental health struggles, is shocking evidence of this reality. In 2024, Felix wrote an article titled "The Stress of Working in Modern AI" to highlight the anxiety and stress faced

Hassan Sajjad (@hassaan84s):

#ICLR2025 Accepted: check out our latest work on XAI with incredible colleagues Ga Wu and Mahtab Sarvmaili. Data-Centric Prediction Explanation via Kernelized Stein Discrepancy openreview.net/pdf?id=KlV5CkN…

Hassan Sajjad (@hassaan84s):

Sharing our two papers accepted at #ICML2025:

"Resolving Lexical Bias in Edit Scoping with Projector Editor Networks" arxiv.org/html/2408.10411
"Explaining the Role of Intrinsic Dimensionality in Adversarial Training"

Preprints will be updated soon! Thanks to awesome students and

Fazl Barez (@fazlbarez):

Excited to share our paper: "Chain-of-Thought Is Not Explainability"! 

We unpack a critical misconception in AI: models explaining their Chain-of-Thought (CoT) steps aren't necessarily revealing their true reasoning. Spoiler: transparency of CoT can be an illusion. (1/9) 🧵