Mor Geva (@megamor2) 's Twitter Profile
Mor Geva

@megamor2

ID: 850356925535531009

Link: https://mega002.github.io/ · Joined: 07-04-2017 14:37:44

450 Tweets

1.1K Followers

509 Following

Jonathan Berant (@jonathanberant) 's Twitter Profile Photo

Hi ho!

New work: arxiv.org/pdf/2503.14481
With amazing collabs <a href="/jacobeisenstein/">Jacob Eisenstein</a> <a href="/jdjdhekchbdjd/">Reza Aghajani</a> <a href="/adamjfisch/">Adam Fisch</a> <a href="/ddua17/">dheeru dua</a> <a href="/fantinehuot/">Fantine Huot ✈️ ICLR 25</a> <a href="/mlapata/">Mirella Lapata</a> <a href="/vicky_zayats/">Vicky Zayats</a>

Some things are easier to learn in a social setting. We show agents can learn to faithfully express their beliefs (along... 1/3
Sohee Yang (@soheeyang_) 's Twitter Profile Photo

Excited to share that the code and datasets for our papers on latent multi-hop reasoning are finally available on GitHub: github.com/google-deepmin… We hope these resources support further research in this area. Thanks for your patience as we worked through the release process!

Mor Geva (@megamor2) 's Twitter Profile Photo

My two-year-old learned to say "siren" long before she learned to say nicer words like "quiche" or "omelet". Now she has also started overgeneralizing, calling the sounds of an ambulance or of children shouting "siren".

neuronpedia (@neuronpedia) 's Twitter Profile Photo

Announcement: we're open sourcing Neuronpedia! 🚀 This includes all our mech interp tools: the interpretability API, steering, UI, inference, autointerp, search, plus 4 TB of data - cited by 35+ research papers and used by 50+ write-ups. What you can do with OSS Neuronpedia: 🧵

Tal Haklay (@tal_haklay) 's Twitter Profile Photo

🚨 Call for Papers is Out!

The First Workshop on 𝐀𝐜𝐭𝐢𝐨𝐧𝐚𝐛𝐥𝐞 𝐈𝐧𝐭𝐞𝐫𝐩𝐫𝐞𝐭𝐚𝐛𝐢𝐥𝐢𝐭𝐲 will be held at ICML 2025 in Vancouver!

📅 Submission Deadline: May 9
Follow us >> <a href="/ActInterp/">Actionable Interpretability Workshop ICML2025</a>

🧠Topics of interest include: 👇
Actionable Interpretability Workshop ICML2025 (@actinterp) 's Twitter Profile Photo

🚨 We're looking for reviewers for the workshop!

If you're passionate about making interpretability useful and want to help shape the conversation, we'd love your input.

Sign up to review >> 💡🔍
Hadas Orgad (@orgadhadas) 's Twitter Profile Photo

Position papers wanted!

For the First Workshop on Actionable Interpretability, we’re looking for diverse perspectives on the state of the field. Should certain areas of interpretability research be developed further? Are there key metrics we should prioritize? Or do you have >>
Jiuding Sun (@jiudingsun) 's Twitter Profile Photo

💨 A new architecture for automating mechanistic interpretability with causal interchange interventions! #ICLR2025

🔬Neural networks are particularly good at discovering patterns from high-dimensional data, so we trained them to ... interpret themselves! 🧑‍🔬

1/4
Aran Komatsuzaki (@arankomatsuzaki) 's Twitter Profile Photo

The Leaderboard Illusion

- Identifies systematic issues that have resulted in a distorted playing field of Chatbot Arena

- Identifies 27 private LLM variants tested by Meta in the lead-up to the Llama-4 release
Shiqi Chen (@shiqi_chen17) 's Twitter Profile Photo

🚀🔥 Thrilled to announce our ICML25 paper: "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas"! We dive into the core reasons behind spatial reasoning difficulties for Vision-Language Models from an attention mechanism view. 🌍🔍 Paper:

Hadas Orgad (@orgadhadas) 's Twitter Profile Photo

Deadline extended! ⏳

The Actionable Interpretability Workshop at #ICML2025 has moved its submission deadline to May 19th. More time to submit your work 🔍🧠✨ Don’t miss out!
clem 🤗 (@clementdelangue) 's Twitter Profile Photo

This is the coolest dataset I've seen on <a href="/huggingface/">Hugging Face</a> today: action labels of 1v1 races in Gran Turismo 4 to train a multiplayer world model!

Great stuff by <a href="/EnigmaLabsAI/">Enigma</a>!
Percy Liang (@percyliang) 's Twitter Profile Photo

What would truly open-source AI look like? Not just open weights, open code/data, but *open development*, where the entire research and development process is public *and* anyone can contribute. We built Marin, an open lab, to fulfill this vision:
Mor Geva (@megamor2) 's Twitter Profile Photo

Help!! We got way more submissions than expected, and are now looking for reviewers! Please sign up in this form if you can do 2-3 reviews in the next few weeks 👇 docs.google.com/forms/d/e/1FAI…

Yoav Gur Arieh (@guryoav) 's Twitter Profile Photo

Can we precisely erase conceptual knowledge from LLM parameters?
Most methods are shallow, coarse, or overreach, adversely affecting related or general knowledge.

We introduce🪝𝐏𝐈𝐒𝐂𝐄𝐒 — a general framework for Precise In-parameter Concept EraSure. 🧵 1/
Mor Geva (@megamor2) 's Twitter Profile Photo

Removing knowledge from LLMs is HARD. Yoav Gur Arieh proposes a powerful approach that disentangles the MLP parameters to edit them in high resolution and remove target concepts from the model. Check it out!