
Mor Geva
@megamor2
ID: 850356925535531009
https://mega002.github.io/ 07-04-2017 14:37:44
450 Tweets
1.1K Followers
509 Following

Hi ho! New work: arxiv.org/pdf/2503.14481 With amazing collabs Jacob Eisenstein, Reza Aghajani, Adam Fisch, dheeru dua, Fantine Huot ICLR 25, Mirella Lapata, Vicky Zayats. Some things are easier to learn in a social setting. We show agents can learn to faithfully express their beliefs (along... 1/3




[Hebrew-language tweet; text not recoverable from the corrupted encoding]

Our Actionable Interpretability workshop has been accepted to #ICML2025! >> Follow Actionable Interpretability Workshop ICML2025 Tal Haklay, Anja Reusch, Marius Mosbach, Sarah Wiegreffe, Ian Tenney, Mor Geva. Paper submission deadline: May 9th!



Call for Papers is Out! The First Workshop on Actionable Interpretability will be held at ICML 2025 in Vancouver! Submission Deadline: May 9 Follow us >> Actionable Interpretability Workshop ICML2025 Topics of interest include:












Removing knowledge from LLMs is HARD. Yoav Gur Arieh proposes a powerful approach that disentangles the MLP parameters to edit them in high resolution and remove target concepts from the model. Check it out!