Mor Geva (@megamor2) 's Twitter Profile
Mor Geva

@megamor2

ID: 850356925535531009

Link: https://mega002.github.io/ · Joined: 07-04-2017 14:37:44

450 Tweets

1.1K Followers

509 Following

Sohee Yang (@soheeyang_) 's Twitter Profile Photo

Excited to share that the code and datasets for our papers on latent multi-hop reasoning are finally available on GitHub: github.com/google-deepmin… We hope these resources support further research in this area. Thanks for your patience as we worked through the release process!

Mor Geva (@megamor2) 's Twitter Profile Photo

At age two she learned to say "siren" well before she learned to say nicer words like "pistachio" or "omelet." Now she has also started to over-generalize, calling the sound of an ambulance or of crying children a "siren."

neuronpedia (@neuronpedia) 's Twitter Profile Photo

Announcement: we're open sourcing Neuronpedia! πŸš€ This includes all our mech interp tools: the interpretability API, steering, UI, inference, autointerp, search, plus 4 TB of data - cited by 35+ research papers and used by 50+ write-ups. What you can do with OSS Neuronpedia: 🧡

Tal Haklay (@tal_haklay) 's Twitter Profile Photo

🚨 Call for Papers is Out!

The First Workshop on π€πœπ­π’π¨π§πšπ›π₯𝐞 πˆπ§π­πžπ«π©π«πžπ­πšπ›π’π₯𝐒𝐭𝐲 will be held at ICML 2025 in Vancouver!

πŸ“… Submission Deadline: May 9
Follow us >> <a href="/ActInterp/">Actionable Interpretability Workshop ICML2025</a>

🧠Topics of interest include: πŸ‘‡
Actionable Interpretability Workshop ICML2025 (@actinterp) 's Twitter Profile Photo

🚨 We're looking for reviewers for the workshop!

If you're passionate about making interpretability useful and want to help shape the conversation, we'd love your input.

Sign up to review >> πŸ’‘πŸ”
Hadas Orgad (@orgadhadas) 's Twitter Profile Photo

Position papers wanted!

For the First Workshop on Actionable Interpretability, we’re looking for diverse perspectives on the state of the field. Should certain areas of interpretability research be developed further? Are there key metrics we should prioritize? Or do you have >>
Jiuding Sun (@jiudingsun) 's Twitter Profile Photo

πŸ’¨ A new architecture for automating mechanistic interpretability with causal interchange interventions! #ICLR2025

πŸ”¬Neural networks are particularly good at discovering patterns from high-dimensional data, so we trained them to ... interpret themselves! πŸ§‘β€πŸ”¬

1/4
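The interchange-intervention primitive the thread refers to can be illustrated in a few lines: compute a hidden activation on one ("source") input and patch it into the forward pass of another ("base") input. The toy two-layer network below is purely illustrative; all names, shapes, and weights are assumptions for the sketch, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer linear network: x -> h = W1 @ x -> y = W2 @ h
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))

def forward(x, h_override=None):
    """Run the network; optionally replace the hidden layer (the intervention)."""
    h = W1 @ x
    if h_override is not None:
        h = h_override
    return W2 @ h

base = np.array([1.0, 0.0, 0.0])
source = np.array([0.0, 1.0, 0.0])

# Interchange intervention: compute h on the source input,
# then patch it into the base input's forward pass.
h_source = W1 @ source
patched = forward(base, h_override=h_source)

# Because the hidden layer fully mediates this toy computation,
# the patched output equals the source input's output.
assert np.allclose(patched, forward(source))
```

The causal signal such methods exploit is exactly this: if patching a component's activation moves the base output toward the source output, that component causally carries the relevant information.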
Aran Komatsuzaki (@arankomatsuzaki) 's Twitter Profile Photo

The Leaderboard Illusion

- Identifies systematic issues that have resulted in a distorted playing field of Chatbot Arena

- Identifies 27 private LLM variants tested by Meta in the lead-up to the Llama-4 release
Hadas Orgad (@orgadhadas) 's Twitter Profile Photo

Deadline extended! ⏳

The Actionable Interpretability Workshop at #ICML2025 has moved its submission deadline to May 19th. More time to submit your work πŸ”πŸ§ βœ¨ Don’t miss out!
clem πŸ€— (@clementdelangue) 's Twitter Profile Photo

This is the coolest dataset I've seen on <a href="/huggingface/">Hugging Face</a> today: action labels of 1v1 races in Gran Turismo 4 to train a multiplayer world model!

Great stuff by <a href="/EnigmaLabsAI/">Enigma</a>!
Percy Liang (@percyliang) 's Twitter Profile Photo

What would truly open-source AI look like? Not just open weights, open code/data, but *open development*, where the entire research and development process is public *and* anyone can contribute. We built Marin, an open lab, to fulfill this vision:

Mor Geva (@megamor2) 's Twitter Profile Photo

Help!! We got way more submissions than expected, and are now looking for reviewers! Please sign up in this form if you can do 2-3 reviews in the next few weeks πŸ‘‡ docs.google.com/forms/d/e/1FAI…

Yoav Gur Arieh (@guryoav) 's Twitter Profile Photo

Can we precisely erase conceptual knowledge from LLM parameters?
Most methods are shallow, coarse, or overreach, adversely affecting related or general knowledge.

We introduce πŸͺ 𝐏𝐈𝐒𝐂𝐄𝐒 β€” a general framework for Precise In-parameter Concept EraSure. 🧡 1/
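To give a concrete feel for in-parameter erasure (editing weights, as opposed to filtering outputs), here is a rough sketch in which a concept is modeled as a single direction and projected out of a weight matrix. This is a deliberately simplified illustration of the general idea, not the PISCES method itself; the matrix, direction, and sizes are made up:

```python
import numpy as np

rng = np.random.default_rng(1)

W = rng.normal(size=(8, 8))      # stand-in for an MLP weight matrix
concept = rng.normal(size=8)     # direction assumed to encode the target concept
c = concept / np.linalg.norm(c := concept) if False else concept / np.linalg.norm(concept)

# Erase the concept in-parameter: project each row of W onto the
# subspace orthogonal to the concept direction c.
W_edited = W - W @ np.outer(c, c)

# The edited weights can no longer read from the concept direction:
# any input component along c is mapped to zero.
assert np.allclose(W_edited @ c, 0.0)
```

Real methods must additionally locate which parameters encode the concept and avoid collateral damage to related knowledge, which is the hard part the tweet alludes to.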
Mor Geva (@megamor2) 's Twitter Profile Photo

Removing knowledge from LLMs is HARD. Yoav Gur Arieh proposes a powerful approach that disentangles the MLP parameters to edit them at high resolution and remove target concepts from the model. Check it out!