Blair Bilodeau (@blairbilodeau)'s Twitter Profile
Blair Bilodeau

@blairbilodeau

quant

ID: 348159742

Website: http://www.blairbilodeau.ca
Joined: 03-08-2011 23:56:37

4.4K Tweets

1.1K Followers

380 Following

Blair Bilodeau (@blairbilodeau)

I'm at #NeurIPS2022 this week! Come check out my poster for arxiv.org/abs/2202.05100 on Tuesday, 4pm-6pm, Hall J #805. Reach out to chat about adaptivity, uncertainty, coffee, or music! Looking forward to finally putting an in-person face to some Twitter handles and Zoom screens.

Been Kim (@_beenkim)

Feature visualizations are widely used interpretability tools - but can we trust them? We investigate this question from an adversarial 🥷, empirical 🔬 and theoretical 📝 perspective. The result: Don’t trust your eyes! (1/6) Paper: arxiv.org/abs/2306.04719 🧵

Blair Bilodeau (@blairbilodeau)

3.5 years after we started this project while visiting Institute for Advanced Study (IAS), it’s been accepted to the Annals of Statistics. I won’t be at #JSM2023 but make sure to stop by Jeff’s session to hear about our work!

Been Kim (@_beenkim)

Much previous work of mine and others hinted at ‘something fishy’ about saliency-based methods. But we never had a rigorous proof of what we saw. This work, “Impossibility Theorems for Feature Attribution”, now published in PNAS, to me marks a point of new beginnings.

Natasha Jaques (@natashajaques)

Our recent PNAS paper shows that widely used interpretability methods, when used to ask simple counterfactual questions about models like “if I pay down this credit card will my credit score increase?”, are provably no better than random guessing. This is really problematic bc...

Suraj Srinivas (@suuraj)

Join us next week (June 20) for the Theory of Interpretable AI seminar series, where Blair Bilodeau will discuss the **fundamental theoretical limitations** of attribution methods and their implications for interpretability! 🌐 tverven.github.io/tiai-seminar/ Michal Moshkovitz, Tim van Erven
