
Jan Wehner
@janwehner436164
ELLIS PhD student in ML Safety @CISPA | AI Safety, Security, Interpretability
ID: 1798743386092158976
06-06-2024 15:47:13
12 Tweets
55 Followers
70 Following

For model devs releasing LLMs in the open through Hugging Face, it's currently impossible to protect them against malicious finetuning. But what if there were a way to "immunize" them? In this work led by Domenic Anthony Rosati, we do exactly that! 1/5
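[Context for the threat model: once weights are public, anyone can fine-tune them on arbitrary data. A minimal sketch of that attack surface, assuming standard Hugging Face tooling; the model name "some-org/open-llm" and the file "attacker_corpus.txt" are hypothetical placeholders, not from the thread.]

```python
# Sketch: how trivially a released open-weights LLM can be fine-tuned
# on attacker-chosen data. Nothing in the weights themselves blocks this,
# which is the gap the thread's "immunization" approach aims to close.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "some-org/open-llm"  # hypothetical open-weights release
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# An attacker substitutes any (potentially harmful) text corpus here.
dataset = load_dataset("text", data_files={"train": "attacker_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned", num_train_epochs=1),
    train_dataset=tokenized,
    # Causal-LM collator copies inputs to labels for next-token training.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # the released weights cannot refuse this step
```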



