
Csordás Róbert
@robert_csordas
Postdoc at Stanford working on systematic generalization and algorithmic reasoning. Ex IDSIA PhD, Ex @DeepMind intern.
ID: 745005274784751616
https://robertcsordas.github.io/
20-06-2016 21:27:54
169 Tweets
762 Followers
426 Following

Today, we're joined by Julie Kallini ✨, PhD student at Stanford NLP Group, to discuss her recent papers, “MrT5: Dynamic Token Merging for Efficient Byte-level Language Models” and “Mission: Impossible Language Models.” For the MrT5 paper, we explore the importance and failings of […]

Had a great time chatting with Sam Charrington about MrT5 and Mission: Impossible—thank you for having me on the TWIML AI Podcast!

Padding a transformer’s input with blank tokens (...) is a simple form of test-time compute. Can it increase the computational power of LLMs? 👀 New work with Ashish Sabharwal addresses this with *exact characterizations* of the expressive power of transformers with padding 🧵
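The idea in the tweet can be illustrated with a minimal sketch (plain Python, no ML library; the pad token and function name are hypothetical): appending blank tokens to the prompt gives the transformer extra sequence positions, and therefore extra parallel computation per layer, without adding any information.

```python
def pad_input(tokens, n_pad, pad_token="..."):
    """Append n_pad blank tokens to a prompt before running the model.

    Each extra position is one more column of parallel computation in
    every transformer layer, while contributing no new information --
    a simple form of test-time compute.
    """
    return tokens + [pad_token] * n_pad


# Hypothetical usage: an 8-token pad stretches a 5-token prompt to 13
# positions that the model attends over.
prompt = ["Is", "this", "graph", "connected", "?"]
padded = pad_input(prompt, n_pad=8)
```

Whether those extra positions actually increase what the model can compute is exactly the question the paper characterizes.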

New on arXiv: Knee-Deep in C-RASP, by Andy J Yang, Michael Cadilhac, and me. In the attached figure, the solid stepped line is our theoretical prediction based on which problems C-RASP can solve, and the numbers/colors show what transformers (with no position embedding) can learn.

