Tal Haklay (@tal_haklay)'s Twitter Profile
Tal Haklay

@tal_haklay

NLP | Interpretability | PhD student at the @TechnionLive

ID: 1505172645825990659

Link: https://talhaklay.github.io/
Joined: 19-03-2022 13:21:51

98 Tweets

499 Followers

472 Following

BlackboxNLP (@blackboxnlp)

Have you heard about our shared task? 📢 Mechanistic Interpretability (MI) is quickly advancing, but comparing methods remains a challenge. This year, as a part of #BlackboxNLP at EMNLP 2025, we're introducing a shared task to rigorously evaluate MI methods in LMs 🧵

Koyena Pal (@kpal_koyena)

🚨 Registration is live! 🚨 The New England Mechanistic Interpretability (NEMI) Workshop is happening August 22nd 2025 at Northeastern University! A chance for the mech interp community to nerd out on how models really work 🧠🤖 Info: nemiconf.github.io/summer25/ Register:

Tal Haklay (@tal_haklay)

Next week I’ll be at ICML (@icmlconf). Come check out our poster "MIB: A Mechanistic Interpretability Benchmark" 😎 July 17, 11 a.m. And don’t miss the first Actionable Interpretability Workshop on July 19 - focusing on bridging the gap between insights and actions! 🔍⚙️

Mor Geva (@megamor2)

Going to #icml2025? Don't miss the Actionable Interpretability Workshop (@ActInterp)! We've got an amazing lineup of speakers, panelists, and papers, all focused on turning insights from interpretability research into practical, real-world problems ✨

Tal Haklay (@tal_haklay)

🚨Meet our panelists at the Actionable Interpretability Workshop (@ActInterp) at ICML (@icmlconf)! Join us July 19 at 4pm for a panel on making interpretability research actionable, its challenges, and how the community can drive greater impact. @nsaphra @saprmarks @kylelostat @FazlBarez

Fazl Barez (@fazlbarez)

I’ll be at #ICML2025 – come say hi and talk to me about responsible AI 👋
🎤 Speaking (14th): Post-AGI Civilizational Equilibria post-agi.org
💭 Panel alphaXiv (14th eve) lu.ma/n0yavto0
Main-Conf Poster (16th): PoisonBench icml.cc/virtual/2025/p…

Itay Itzhak (@itay_itzhak_)

🚨New paper alert🚨 🧠 Instruction-tuned LLMs show amplified cognitive biases, but are these new behaviors or pretraining ghosts resurfacing? Excited to share our new paper, accepted to CoLM 2025 🎉! See thread below 👇 #BiasInAI #LLMs #MachineLearning #NLProc

Samuel Marks (@saprmarks)

In a new post, I present:
1. A framework for thinking about which downstream applications interpretability researchers should target
2. Eight concrete problems for practical interpretability work

Samuel Marks (@saprmarks)

I'm excited to discuss downstream applications of interpretability at the Actionable Interpretability Workshop (@ActInterp)! For a preview of my thoughts on the topic, see my blog post on how I think about picking applications to target: x.com/saprmarks/stat…

Hadas Orgad (@orgadhadas)

Hope everyone’s getting the most out of #icml25. We’re excited and ready for the Actionable Interpretability (@ActInterp) workshop this Saturday! Check out the schedule and join us to discuss how we can move interpretability toward more practical impact.

Actionable Interpretability Workshop ICML2025 (@actinterp)

🚨The Actionable Interpretability Workshop is happening tomorrow at ICML! Join us for an exciting lineup of speakers, nearly 70 posters, and a great panel discussion 🙌 Don’t miss it! 🔍⚙️ @icmlconf @ActInterp

Tal Haklay (@tal_haklay)

ICML 🛫🛬 ACL
Next week I’ll be at ACL 2025, giving an oral presentation about position-aware automatic circuit discovery. DM me if you’d like to chat about interpretability, mech-interp at scale, or just life :)

Actionable Interpretability Workshop ICML2025 (@actinterp)

Big congrats to Alex McKenzie, Pedro Ferreira, and their collaborators on receiving Outstanding Paper Awards! And thanks for the fantastic oral presentations! Check out the papers here 👇

Ivan Titov (@iatitov)

Many thanks to the Actionable Interpretability Workshop (@ActInterp) organisers for highlighting our work - and congratulations to Pedro, Alex and the other awardees! Sad not to have been there in person, it looked like a fantastic workshop. AmsterdamNLP EdinburghNLP