Ekdeep Singh Lubana (@ekdeepl) 's Twitter Profile
Ekdeep Singh Lubana

@ekdeepl

Postdoc at CBS-NTT Program on Physics of Intelligence, Harvard University.

ID: 944451685711273984

Link: http://ekdeepslubana.github.io · Joined: 23-12-2017 06:16:43

445 Tweets

1.1K Followers

1.1K Following

Andrew Lee (@a_jy_l) 's Twitter Profile Photo

New Preprint! Did you know that steering vectors from one LM can be transferred and re-used in another LM? We argue this is because token embeddings across LMs share many “global” and “local” geometric similarities! 1/N

Kempner Institute at Harvard University (@kempnerinst) 's Twitter Profile Photo

New in the Deeper Learning blog: Kempner researchers characterize the inherent bias of sparse autoencoders and call for a new generation of SAEs that are aware of concept geometry. kempnerinstitute.harvard.edu/research/deepe… by Sumedh Hindupur, Ekdeep Singh, Thomas Fel, (Dem + 1) x Ba #AI #autoencoders #ML

Laura Ruis (@lauraruis) 's Twitter Profile Photo

Excited to announce that this fall I'll be joining Jacob Andreas's amazing lab at MIT for a postdoc to work on interp. for reasoning (with Ev (like in 'evidence', not Eve) Fedorenko 🇺🇦 🤯 among others). Cannot wait to think more about this direction in such a dream academic context!

Andrew Lee (@a_jy_l) 's Twitter Profile Photo

🚨New preprint! How do reasoning models verify their own CoT? We reverse-engineer LMs and find critical components and subspaces needed for self-verification! 1/n

Ekdeep Singh Lubana (@ekdeepl) 's Twitter Profile Photo

Check out the thread on our recent ICML paper, which uses knowledge graphs to mechanistically study how model editing can degrade a neural network's capabilities!