Mor Geva (@megamor2) 's Twitter Profile
Mor Geva

@megamor2

ID: 850356925535531009

Link: https://mega002.github.io/ · Joined: 07-04-2017 14:37:44

450 Tweets

1.1K Followers

509 Following

Jonathan Berant (@jonathanberant) 's Twitter Profile Photo

Hi ho!

New work: arxiv.org/pdf/2503.14481
With amazing collabs <a href="/jacobeisenstein/">Jacob Eisenstein</a> <a href="/jdjdhekchbdjd/">Reza Aghajani</a> <a href="/adamjfisch/">Adam Fisch</a> <a href="/ddua17/">dheeru dua</a> <a href="/fantinehuot/">Fantine Huot ✈️ ICLR 25</a> <a href="/mlapata/">Mirella Lapata</a> <a href="/vicky_zayats/">Vicky Zayats</a>

Some things are easier to learn in a social setting. We show agents can learn to faithfully express their beliefs (along... 1/3
Sohee Yang (@soheeyang_) 's Twitter Profile Photo

Excited to share that the code and datasets for our papers on latent multi-hop reasoning are finally available on GitHub: github.com/google-deepmin… We hope these resources support further research in this area. Thanks for your patience as we worked through the release process!

Mor Geva (@megamor2) 's Twitter Profile Photo

My two-year-old learned to say "siren" long before she learned to say nicer words like "quiche" or "omelet". Now she has also started overgeneralizing, calling the sounds of an ambulance or of children shouting "siren".

neuronpedia (@neuronpedia) 's Twitter Profile Photo

Announcement: we're open sourcing Neuronpedia! 🚀 This includes all our mech interp tools: the interpretability API, steering, UI, inference, autointerp, search, plus 4 TB of data - cited by 35+ research papers and used by 50+ write-ups. What you can do with OSS Neuronpedia: 🧵

Tal Haklay (@tal_haklay) 's Twitter Profile Photo

🚨 Call for Papers is Out!

The First Workshop on 𝐀𝐜𝐭𝐢𝐨𝐧𝐚𝐛𝐥𝐞 𝐈𝐧𝐭𝐞𝐫𝐩𝐫𝐞𝐭𝐚𝐛𝐢𝐥𝐢𝐭𝐲 will be held at ICML 2025 in Vancouver!

📅 Submission Deadline: May 9
Follow us >> <a href="/ActInterp/">Actionable Interpretability Workshop ICML2025</a>

🧠Topics of interest include: 👇
Actionable Interpretability Workshop ICML2025 (@actinterp) 's Twitter Profile Photo

🚨 We're looking for reviewers for the workshop!

If you're passionate about making interpretability useful and want to help shape the conversation, we'd love your input.

Sign up to review >> 💡🔍
Hadas Orgad (@orgadhadas) 's Twitter Profile Photo

Position papers wanted!

For the First Workshop on Actionable Interpretability, we’re looking for diverse perspectives on the state of the field. Should certain areas of interpretability research be developed further? Are there key metrics we should prioritize? Or do you have >>
Jiuding Sun (@jiudingsun) 's Twitter Profile Photo

💨 A new architecture for automating mechanistic interpretability with causal interchange interventions! #ICLR2025

🔬Neural networks are particularly good at discovering patterns from high-dimensional data, so we trained them to ... interpret themselves! 🧑‍🔬

1/4
Aran Komatsuzaki (@arankomatsuzaki) 's Twitter Profile Photo

The Leaderboard Illusion

- Identifies systematic issues that have resulted in a distorted playing field of Chatbot Arena

- Identifies 27 private LLM variants tested by Meta in the lead-up to the Llama-4 release
Shiqi Chen (@shiqi_chen17) 's Twitter Profile Photo

🚀🔥 Thrilled to announce our ICML25 paper: "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas"! We dive into the core reasons behind spatial reasoning difficulties for Vision-Language Models from an attention mechanism view. 🌍🔍 Paper:

Hadas Orgad (@orgadhadas) 's Twitter Profile Photo

Deadline extended! ⏳

The Actionable Interpretability Workshop at #ICML2025 has moved its submission deadline to May 19th. More time to submit your work 🔍🧠✨ Don’t miss out!
clem 🤗 (@clementdelangue) 's Twitter Profile Photo

This is the coolest dataset I've seen on <a href="/huggingface/">Hugging Face</a> today: action labels of 1v1 races in Gran Turismo 4 to train a multiplayer world model!

Great stuff by <a href="/EnigmaLabsAI/">Enigma</a>!
Percy Liang (@percyliang) 's Twitter Profile Photo

What would truly open-source AI look like? Not just open weights, open code/data, but *open development*, where the entire research and development process is public *and* anyone can contribute. We built Marin, an open lab, to fulfill this vision:
Mor Geva (@megamor2) 's Twitter Profile Photo

Help!! We got way more submissions than expected, and are now looking for reviewers! Please sign up in this form if you can do 2-3 reviews in the next few weeks 👇 docs.google.com/forms/d/e/1FAI…

Yoav Gur Arieh (@guryoav) 's Twitter Profile Photo

Can we precisely erase conceptual knowledge from LLM parameters?
Most methods are shallow, coarse, or overreach, adversely affecting related or general knowledge.

We introduce🪝𝐏𝐈𝐒𝐂𝐄𝐒 — a general framework for Precise In-parameter Concept EraSure. 🧵 1/
Mor Geva (@megamor2) 's Twitter Profile Photo

Removing knowledge from LLMs is HARD. Yoav Gur Arieh proposes a powerful approach that disentangles the MLP parameters to edit them in high resolution and remove target concepts from the model. Check it out!