Tristan Thrush (@tristanthrush)'s Twitter Profile
Tristan Thrush

@tristanthrush

PhD-ing @StanfordAILab @stanfordnlp. Interested in data, multimodality, scaling, and many more things.

ID: 1388198782924255232

Link: http://www.tristanthrush.com · Joined: 30-04-2021 18:29:34

570 Tweets

3.3K Followers

892 Following

Stanford NLP Group (@stanfordnlp)

Stanford NLP Group will be in Singapore with lots of ICLR 2025 papers. Tristan Thrush, Christopher Potts & Tatsunori Hashimoto will show how to select good pretraining data: LLM losses on texts correlate with downstream benchmarks, so select the high-correlation documents. arxiv.org/abs/2409.05816

Ken Liu (@kenziyuliu)

An LLM generates an article verbatim—did it “train on” the article?

It’s complicated: under n-gram definitions of train-set inclusion, LLMs can complete “unseen” texts—both after data deletion and adding “gibberish” data. Our results impact unlearning, MIAs & data transparency🧵
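
For concreteness, a toy version of the kind of n-gram inclusion test the thread is probing (a hypothetical illustration, not the paper's exact definition): a text counts as "seen" if any length-n window of its tokens appears verbatim in the corpus.

```python
# Toy n-gram train-set-inclusion test (hypothetical illustration). The thread's
# point is that a model can still complete a text that a test like this calls
# "unseen", e.g. after data deletion or after adding "gibberish" data.
def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def included_by_ngram_test(text_tokens, corpus_tokens, n=8):
    """True if any n-gram of the text occurs verbatim in the corpus."""
    corpus_index = ngrams(corpus_tokens, n)
    return any(g in corpus_index for g in ngrams(text_tokens, n))
```
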
Karan Dalal (@karansdalal)

Today, we're releasing a new paper – One-Minute Video Generation with Test-Time Training. We add TTT layers to a pre-trained Transformer and fine-tune it to generate one-minute Tom and Jerry cartoons with strong temporal consistency. Every video below is produced directly by the model.
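
A heavily simplified sketch of the TTT-layer idea in PyTorch (my own toy code under assumptions, not the authors' implementation): the layer's hidden state is itself a small linear model whose weights take one gradient step on a self-supervised reconstruction loss as each token arrives.

```python
import torch
import torch.nn as nn

class ToyTTTLinear(nn.Module):
    """Toy TTT layer: the fast state W is a per-sequence linear map, updated online."""
    def __init__(self, dim, inner_lr=0.1):
        super().__init__()
        # Outer-loop (learned) projections that define the self-supervised task.
        self.key = nn.Linear(dim, dim, bias=False)
        self.value = nn.Linear(dim, dim, bias=False)
        self.query = nn.Linear(dim, dim, bias=False)
        self.inner_lr = inner_lr

    def forward(self, x):
        # x: (batch, seq_len, dim)
        b, t, d = x.shape
        W = x.new_zeros(b, d, d)  # fast weights, reset for each sequence
        outs = []
        for i in range(t):
            k, v, q = self.key(x[:, i]), self.value(x[:, i]), self.query(x[:, i])
            # One inner-loop gradient step of W on ||W k - v||^2.
            pred = torch.bmm(W, k.unsqueeze(-1)).squeeze(-1)
            grad = 2 * torch.bmm((pred - v).unsqueeze(-1), k.unsqueeze(1))
            W = W - self.inner_lr * grad
            # The layer's output is the updated fast model applied to the query view.
            outs.append(torch.bmm(W, q.unsqueeze(-1)).squeeze(-1))
        return torch.stack(outs, dim=1)
```

In the paper's setup, layers like this are inserted into a pre-trained Transformer and the whole model is fine-tuned; the sketch only conveys the inner-loop update.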

Ian Magnusson (@ianmagnusson)

🔭 Science relies on shared artifacts collected for the common good. 🛰 So we asked: what's missing in open language modeling? 🪐 DataDecide 🌌 charts the cosmos of pretraining—across scales and corpora—at a resolution beyond any public suite of models that has come before.

Julie Kallini ✨ @ ICLR 2025 ✈️ (@juliekallini)

🚀 In T-minus 1 week, I’ll be at ICLR presenting MrT5!

The final version has tons of updates:
- New controller algorithm for targeted compression rates
- More baselines and downstream tasks
- Scaled-up experiments to 1.23B parameter models

And now, MrT5 is on 🤗HuggingFace! 🧵
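
Since the model is on the Hugging Face Hub, a hypothetical loading sketch (the checkpoint id below is a placeholder, and details such as the model class or the need for trust_remote_code may differ from the actual release):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "<mrt5-checkpoint-id>"  # placeholder; use the id from the authors' HF page
tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint, trust_remote_code=True)

# MrT5 is byte-level (ByT5-style), so raw UTF-8 text goes straight in.
inputs = tokenizer("Translate to German: The house is wonderful.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
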
Jihyeon Je (@jihyeonje)

Can you rotate a dice 🎲 in your head? Mental imagery plays a key role in perspective reasoning for humans - but can it help VLMs reason spatially? We show that Abstract Perspective Change significantly improves VLM reasoning from unseen views. Check out our preprint for more:

Keshigeyan Chandrasegaran (@keshigeyan)

1/ Model architectures have been mostly treated as fixed post-training. 🌱 Introducing Grafting: A new way to edit pretrained diffusion transformers, allowing us to customize architectural designs on a small compute budget. 🌎 grafting.stanford.edu Co-led with Michael Poli
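
A generic sketch of the operator-swapping idea (my own simplification; the block structure, the SlidingWindowAttention stand-in, and the graft_attention helper are all hypothetical, and the actual method's training objective differs): replace one operator inside a pretrained block and make only the replacement trainable, so it can be fit on a small compute budget.

```python
import torch
import torch.nn as nn

class SlidingWindowAttention(nn.Module):
    """Hypothetical cheaper operator standing in for whatever design you graft in."""
    def __init__(self, dim, heads=4, window=16):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.window = window

    def forward(self, x):
        t = x.shape[1]
        idx = torch.arange(t, device=x.device)
        # Boolean mask: True blocks attention outside the local window.
        mask = (idx[None, :] - idx[:, None]).abs() > self.window
        out, _ = self.attn(x, x, x, attn_mask=mask)
        return out

def graft_attention(block, new_op):
    """Freeze the pretrained block, swap in the new operator, train only that."""
    for p in block.parameters():
        p.requires_grad = False
    block.attn = new_op  # assumes the block exposes its attention operator as .attn
    for p in new_op.parameters():
        p.requires_grad = True
    return block
```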

CLS (@chengleisi)

Are AI scientists already better than human researchers?

We recruited 43 PhD students to spend 3 months executing research ideas proposed by an LLM agent vs human experts.

Main finding: LLM ideas result in worse projects than human ideas.