Stanford Trustworthy AI Research (STAIR) Lab (@stai_research)'s Twitter Profile
Stanford Trustworthy AI Research (STAIR) Lab

@stai_research

A research group in @StanfordAILab researching AI Capabilities, Trust and Safety, Equity and Reliability
Website: stair.cs.stanford.edu

ID: 1721259514933215232

Link: https://stairlab.stanford.edu/ · Joined: 05-11-2023 20:14:02

245 Tweets

585 Followers

53 Following

Karan Dalal (@karansdalal):

Today, we're releasing a new paper – One-Minute Video Generation with Test-Time Training. We add TTT layers to a pre-trained Transformer and fine-tune it to generate one-minute Tom and Jerry cartoons with strong temporal consistency. Every video below is produced directly by

Karan Dalal (@karansdalal):

Test-time training (TTT) layers are RNN layers where the hidden state is a machine learning model and the update rule is a step of gradient descent. See this thread for previous work. x.com/karansdalal/st…
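
To make the recurrence concrete, here is a minimal NumPy sketch of the idea (not the authors' implementation): the hidden state is the weight matrix of a small inner model, and each token triggers one gradient step on an assumed self-supervised reconstruction loss. The TTT papers use learnable projections and a richer inner objective; both are omitted here for brevity.

```python
import numpy as np

def ttt_linear_scan(xs, eta=0.1):
    """Minimal sketch of a TTT-Linear layer (illustrative, not the authors' code).

    The hidden state is itself a linear model W; the update rule is one
    gradient-descent step on a self-supervised reconstruction loss.
    The inner loss and the absence of projections are simplifying assumptions.
    """
    d = xs.shape[1]
    W = np.zeros((d, d))               # hidden state = weights of a linear model
    outputs = []
    for x in xs:                       # recurrent scan over the sequence
        pred = W @ x                   # inner model's prediction for this token
        grad = np.outer(pred - x, x)   # grad of 0.5*||W x - x||^2 w.r.t. W
        W = W - eta * grad             # one gradient step = the RNN state update
        outputs.append(W @ x)          # emit output with the updated state
    return np.stack(outputs)

# usage: a toy sequence of 16 tokens with dimension 8
ys = ttt_linear_scan(np.random.randn(16, 8))
```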

Karan Dalal (@karansdalal):

Our approach simply adds TTT layers to a pre-trained Diffusion Transformer and fine-tunes it on long videos with text annotations. To keep costs manageable, we limit self-attention to local segments and let TTT (linear complexity) operate globally.

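A hedged sketch of how the two pieces could compose, reusing the ttt_linear_scan sketch above: attention is computed only within fixed-length segments (quadratic in segment length rather than sequence length), while the TTT scan carries information globally at linear cost. The segment length, single-head unprojected attention, and additive combination are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def local_attention(xs, seg_len):
    """Self-attention restricted to non-overlapping local segments
    (single head, no projections; for illustration only)."""
    outs = []
    for start in range(0, len(xs), seg_len):
        seg = xs[start:start + seg_len]               # attend within this segment only
        scores = seg @ seg.T / np.sqrt(seg.shape[1])  # scaled dot-product scores
        attn = np.exp(scores - scores.max(axis=1, keepdims=True))
        attn /= attn.sum(axis=1, keepdims=True)       # row-wise softmax
        outs.append(attn @ seg)
    return np.concatenate(outs)

def hybrid_block(xs, seg_len=4):
    # local (quadratic-in-segment) attention + global (linear-cost) TTT scan
    return local_attention(xs, seg_len) + ttt_linear_scan(xs)

ys = hybrid_block(np.random.randn(16, 8))
```
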
Karan Dalal (@karansdalal):

We create an “On-Chip Tensor Parallel” algorithm to implement an efficient TTT-MLP kernel. Specifically, we shard the weights of the “hidden state model” across Streaming Multiprocessors (SMs), and use the DSMEM feature of Hopper GPUs to implement AllReduce among SMs. This avoids costly

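The communication pattern can be illustrated off-GPU. Below is a NumPy sketch, under the assumption that the hidden state model is a two-layer ReLU MLP: each shard (standing in for one SM) holds a slice of the hidden dimension, computes a partial output, and the partials are summed, which is the role an AllReduce over DSMEM would play on-chip. This emulates the data flow only; it is not a CUDA kernel.

```python
import numpy as np

def sharded_mlp_forward(x, W1, W2, n_shards=4):
    """Emulates on-chip tensor parallelism for a two-layer ReLU MLP:
    shard the hidden dimension across workers (standing in for SMs),
    compute partial outputs, then sum them (the AllReduce step)."""
    hidden = W1.shape[1]
    cols = np.array_split(np.arange(hidden), n_shards)
    partials = []
    for idx in cols:                        # each "SM" holds one weight shard
        h = np.maximum(x @ W1[:, idx], 0)   # local matmul + ReLU on its slice
        partials.append(h @ W2[idx, :])     # partial contribution to the output
    return np.sum(partials, axis=0)         # AllReduce: sum partials across "SMs"

d, hidden = 8, 32
x = np.random.randn(16, d)
W1, W2 = np.random.randn(d, hidden), np.random.randn(hidden, d)
y = sharded_mlp_forward(x, W1, W2)
# matches the unsharded forward pass: relu(x @ W1) @ W2
assert np.allclose(y, np.maximum(x @ W1, 0) @ W2)
```
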
Ben Bucknall (@ben_s_bucknall):

📢 Over the moon that Open Problems in Technical AI Governance has now been published at TMLR! See the updated version here: shorturl.at/joQJS

Ben Bucknall (@ben_s_bucknall):

We're also organising a workshop on technical AI governance at #ICML2025! This is a great opportunity to present work on any of the problems outlined in the paper. Submissions are due May 7th and we're also looking for PC members. More info 👉🏻 Technical AI Governance @ ICML 2025

Stanford HAI (@stanfordhai):

📢 New white paper: Scholars from Stanford HAI, The Asia Foundation, and University of Pretoria map the current landscape of technical approaches to developing LLMs that better perform for and represent low-resource languages. (1/4) ↘️ hai.stanford.edu/policy/mind-th…

Stanford HAI (@stanfordhai):

LLM development suffers from a digital divide: Most major LLMs underperform for low-resource languages; are not attuned to relevant cultural contexts; and are not accessible in parts of the Global South. (2/4)

Stanford HAI (@stanfordhai):

The paper discusses the trade-offs of three approaches (massively multilingual models, regional multilingual models, and monolingual or monocultural models) and highlights ongoing initiatives to address underlying data scarcity and quality issues. (3/4)

Stanford HAI (@stanfordhai):

Here, the authors present three high-level recommendations for AI researchers, funders, policymakers, and civil society organizations: (4/4) hai.stanford.edu/policy/mind-th…

Alyssa Unell (@alyssaunell):

Excited to present this work at ICLR's SynthData Workshop on Sunday April 27! Come through from 11:30-12:30 @ Peridot 202-203 to talk anything synthetic data for post-training, benchmarking, and AI for healthcare in general.

Stanford HAI (@stanfordhai):

Most major LLMs are trained on English data, making them ineffective for the approximately 5 billion people who don't speak English. Here, Stanford HAI Faculty Affiliate Sanmi Koyejo discusses the risks of this digital divide and how to close it. stanford.io/3SfGmRk