Stanford Trustworthy AI Research (STAIR) Lab (@stai_research)'s Twitter Profile
Stanford Trustworthy AI Research (STAIR) Lab

@stai_research

A research group in @StanfordAILab researching AI Capabilities, Trust and Safety, Equity and Reliability
Website: stair.cs.stanford.edu

ID: 1721259514933215232

Link: https://stairlab.stanford.edu/ · Joined: 05-11-2023 20:14:02

245 Tweets

585 Followers

53 Following

Karan Dalal (@karansdalal):

Today, we're releasing a new paper – One-Minute Video Generation with Test-Time Training. We add TTT layers to a pre-trained Transformer and fine-tune it to generate one-minute Tom and Jerry cartoons with strong temporal consistency. Every video below is produced directly by

Karan Dalal (@karansdalal):

Test-time training (TTT) layers are RNN layers where the hidden state is a machine learning model and the update rule is a step of gradient descent. See this thread for previous work. x.com/karansdalal/st…
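
To make the recurrence concrete, here is a minimal NumPy sketch of the idea (not the authors' implementation): the hidden state is the weight matrix of a small inner model, and each token triggers one gradient step on an assumed self-supervised reconstruction loss. The TTT papers use learnable projections and a richer inner objective; both are omitted here for brevity.

```python
import numpy as np

def ttt_linear_scan(xs, eta=0.1):
    """Minimal sketch of a TTT-Linear layer (illustrative, not the authors' code).

    The hidden state is itself a linear model W; the update rule is one
    gradient-descent step on a self-supervised reconstruction loss.
    The inner loss and the absence of projections are simplifying assumptions.
    """
    d = xs.shape[1]
    W = np.zeros((d, d))               # hidden state = weights of a linear model
    outputs = []
    for x in xs:                       # recurrent scan over the sequence
        pred = W @ x                   # inner model's prediction for this token
        grad = np.outer(pred - x, x)   # grad of 0.5*||W x - x||^2 w.r.t. W
        W = W - eta * grad             # one gradient step = the RNN state update
        outputs.append(W @ x)          # emit output with the updated state
    return np.stack(outputs)

# usage: a toy sequence of 16 tokens with dimension 8
ys = ttt_linear_scan(np.random.randn(16, 8))
```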

Karan Dalal (@karansdalal):

Our approach simply adds TTT layers to a pre-trained Diffusion Transformer and fine-tunes it on long videos with text annotations. To keep costs manageable, we limit self-attention to local segments and let TTT (linear complexity) operate globally.

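A hedged sketch of how the two pieces could compose, reusing the ttt_linear_scan sketch above: attention is computed only within fixed-length segments (quadratic in segment length rather than sequence length), while the TTT scan carries information globally at linear cost. The segment length, single-head unprojected attention, and additive combination are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def local_attention(xs, seg_len):
    """Self-attention restricted to non-overlapping local segments
    (single head, no projections; for illustration only)."""
    outs = []
    for start in range(0, len(xs), seg_len):
        seg = xs[start:start + seg_len]               # attend within this segment only
        scores = seg @ seg.T / np.sqrt(seg.shape[1])  # scaled dot-product scores
        attn = np.exp(scores - scores.max(axis=1, keepdims=True))
        attn /= attn.sum(axis=1, keepdims=True)       # row-wise softmax
        outs.append(attn @ seg)
    return np.concatenate(outs)

def hybrid_block(xs, seg_len=4):
    # local (quadratic-in-segment) attention + global (linear-cost) TTT scan
    return local_attention(xs, seg_len) + ttt_linear_scan(xs)

ys = hybrid_block(np.random.randn(16, 8))
```
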
Karan Dalal (@karansdalal):

We create an “On-Chip Tensor Parallel” algorithm to implement an efficient TTT-MLP kernel. Specifically, we shard the weights of the “hidden state model” across Streaming Multiprocessors (SMs), and use the DSMEM feature of Hopper GPUs to implement AllReduce among SMs. This avoids costly

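The communication pattern can be illustrated off-GPU. Below is a NumPy sketch, under the assumption that the hidden state model is a two-layer ReLU MLP: each shard (standing in for one SM) holds a slice of the hidden dimension, computes a partial output, and the partials are summed, which is the role an AllReduce over DSMEM would play on-chip. This emulates the data flow only; it is not a CUDA kernel.

```python
import numpy as np

def sharded_mlp_forward(x, W1, W2, n_shards=4):
    """Emulates on-chip tensor parallelism for a two-layer ReLU MLP:
    shard the hidden dimension across workers (standing in for SMs),
    compute partial outputs, then sum them (the AllReduce step)."""
    hidden = W1.shape[1]
    cols = np.array_split(np.arange(hidden), n_shards)
    partials = []
    for idx in cols:                        # each "SM" holds one weight shard
        h = np.maximum(x @ W1[:, idx], 0)   # local matmul + ReLU on its slice
        partials.append(h @ W2[idx, :])     # partial contribution to the output
    return np.sum(partials, axis=0)         # AllReduce: sum partials across "SMs"

d, hidden = 8, 32
x = np.random.randn(16, d)
W1, W2 = np.random.randn(d, hidden), np.random.randn(hidden, d)
y = sharded_mlp_forward(x, W1, W2)
# matches the unsharded forward pass: relu(x @ W1) @ W2
assert np.allclose(y, np.maximum(x @ W1, 0) @ W2)
```
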
Ben Bucknall (@ben_s_bucknall):

📢 Over the moon that Open Problems in Technical AI Governance has now been published at TMLR! See the updated version here: shorturl.at/joQJS

Ben Bucknall (@ben_s_bucknall):

We're also organising a workshop on technical AI governance at #ICML2025! This is a great opportunity to present work on any of the problems outlined in the paper. Submissions are due May 7th and we're also looking for PC members. More info 👉🏻 Technical AI Governance @ ICML 2025

Stanford HAI (@stanfordhai):

📢 New white paper: Scholars from Stanford HAI, The Asia Foundation, and University of Pretoria map the current landscape of technical approaches to developing LLMs that better perform for and represent low-resource languages. (1/4) ↘️ hai.stanford.edu/policy/mind-th…

Stanford HAI (@stanfordhai):

LLM development suffers from a digital divide: Most major LLMs underperform for low-resource languages; are not attuned to relevant cultural contexts; and are not accessible in parts of the Global South. (2/4)

Stanford HAI (@stanfordhai):

The paper discusses the trade-offs of three approaches (massively multilingual models, regional multilingual models, and monolingual or monocultural models) and highlights ongoing initiatives to address underlying data scarcity and quality issues. (3/4)

Stanford HAI (@stanfordhai):

Here, the authors present three high-level recommendations for AI researchers, funders, policymakers, and civil society organizations: (4/4) hai.stanford.edu/policy/mind-th…

Alyssa Unell (@alyssaunell):

Excited to present this work at ICLR's SynthData Workshop on Sunday April 27! Come through from 11:30-12:30 @ Peridot 202-203 to talk anything synthetic data for post-training, benchmarking, and AI for healthcare in general.

Stanford HAI (@stanfordhai):

Most major LLMs are trained on English data, making them ineffective for the approximately 5 billion people who don't speak English. Here, Stanford HAI Faculty Affiliate Sanmi Koyejo discusses the risks of this digital divide and how to close it. stanford.io/3SfGmRk