Richard Antonello (@neurorj)'s Twitter Profile
Richard Antonello

@neurorj

Postdoc in the Mesgarani Lab at Columbia University. Studying how the brain processes language by using LLMs. (Formerly @HuthLab at UT Austin)

ID: 1260656805669191680

Joined: 13-05-2020 19:43:07

219 Tweets

356 Followers

224 Following

Katie Kang (@katie_kang_)'s Twitter Profile Photo

LLMs excel at fitting finetuning data, but are they learning to reason or just parroting🦜?

We found a way to probe a model's learning process to reveal *how* each example is learned. This lets us predict model generalization using only training data, amongst other insights: 🧵
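The thread doesn't spell out the probing method here, but one common way to characterize *how* an example is learned is to record its training loss at every finetuning checkpoint and classify the shape of the trajectory. A minimal sketch of that idea — the function name, thresholds, and labels below are illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def per_example_learning_curves(losses):
    """Classify how each training example was learned from its loss trajectory.

    losses: array of shape (n_checkpoints, n_examples), the per-example
    training loss recorded at each finetuning checkpoint.

    Heuristic: examples whose loss drops early look "generalized"; examples
    that stay high until late in training look "memorized". The halving
    threshold and the midpoint cutoff are illustrative choices.
    """
    n_ckpt, n_ex = losses.shape
    labels = []
    for j in range(n_ex):
        curve = losses[:, j]
        # first checkpoint at which loss falls below half its initial value
        below = np.where(curve < 0.5 * curve[0])[0]
        t_learn = below[0] if below.size else n_ckpt
        labels.append("early/generalized" if t_learn < n_ckpt // 2
                      else "late/memorized")
    return labels
```

The point of such a probe is that it uses only training data — no held-out set — which is what would let it predict generalization ahead of time.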
Marianne Arriola @ ICLR’25 (@mariannearr)'s Twitter Profile Photo

🚨Announcing our #ICLR2025 Oral! 🔥Diffusion LMs are on the rise for parallel text generation! But unlike autoregressive LMs, they struggle with quality, fixed-length constraints & lack of KV caching. 🚀Introducing Block Diffusion—combining autoregressive and diffusion models

Ruimin Gao (@ruimin_g)'s Twitter Profile Photo

Excited to introduce funROI: A Python package for functional ROI analyses of fMRI data!

funroi.readthedocs.io/en/latest/

#fMRI #Neuroimaging #Python #OpenScience

Work w Anna Ivanova

🧵👇
Karan Dalal (@karansdalal)'s Twitter Profile Photo

Today, we're releasing a new paper – One-Minute Video Generation with Test-Time Training. We add TTT layers to a pre-trained Transformer and fine-tune it to generate one-minute Tom and Jerry cartoons with strong temporal consistency. Every video below is produced directly by

Richard Antonello (@neurorj)'s Twitter Profile Photo

For those attending NAACL, today I'll be presenting recent work on how we can use language encoding models to identify functional specialization throughout cortex. Stop by my talk at 10:30 at the CMCL workshop!

Yufan Zhuang (@yufan_zhuang)'s Twitter Profile Photo

🤯Your LLM just threw away 99.9% of what it knows.

Standard decoding samples one token at a time and discards the rest of the probability mass. 

Mixture of Inputs (MoI) rescues that lost information, feeding it back for more nuanced expressions.

It is a brand new
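As the tweet describes it, the core idea is to feed the decoder something richer than the single sampled token. A minimal sketch of one way to read that, assuming access to the model's embedding matrix — the mixing rule and the `alpha` weight below are illustrative assumptions, not the paper's actual MoI formulation:

```python
import numpy as np

def mixture_of_inputs_embedding(probs, embed_matrix, sampled_id, alpha=0.5):
    """Blend the usual one-hot input with the full output distribution.

    Standard decoding feeds only embed_matrix[sampled_id] back into the
    model, discarding the rest of the probability mass. Here we instead
    feed a convex mixture of that embedding and the expectation of the
    embedding under the output distribution.

    probs:        (vocab,) output distribution at this step
    embed_matrix: (vocab, d) token embedding table
    """
    expected = probs @ embed_matrix        # (d,) expected embedding over vocab
    sampled = embed_matrix[sampled_id]     # (d,) the standard one-hot input
    return alpha * sampled + (1.0 - alpha) * expected
```

With `alpha=1.0` this reduces to standard decoding; lowering `alpha` lets more of the otherwise-discarded probability mass influence the next step.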
Guy Gaziv (@ggaziv)'s Twitter Profile Photo

Can we precisely and noninvasively modulate deep brain activity just by riding the natural visual feed? 👁️🧠 In our new preprint, we use brain models to craft subtle image changes that steer deep neural populations in primate IT cortex. Just pixels. 📝arxiv.org/abs/2506.05633

Chandan Singh (@csinva)'s Twitter Profile Photo

New paper: Ask 35 simple questions about sentences in a story and use the answers to predict brain responses. Interpretable. Compact. Surprisingly high performance in both fMRI and ECoG. biorxiv.org/content/10.110…

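The recipe in the tweet — 35 yes/no answers per sentence, used to predict brain responses — is at its core a linear encoding model on a 35-dimensional binary feature matrix. A hedged sketch using ridge regression; the function names and the `lam` regularizer are assumptions for illustration, not the paper's code:

```python
import numpy as np

def fit_question_encoding_model(answers, responses, lam=1.0):
    """Ridge regression from question answers to brain responses.

    answers:   (n_sentences, n_questions) 0/1 matrix, e.g. 35 yes/no
               answers per sentence in a story.
    responses: (n_sentences, n_voxels) measured brain responses
               (fMRI voxels or ECoG electrodes).
    Returns weights of shape (n_questions, n_voxels).
    """
    X = answers.astype(float)
    XtX = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ responses)

def predict(answers, weights):
    """Predicted brain response for new sentences."""
    return answers.astype(float) @ weights
```

The interpretability claim follows from the setup: each of the 35 weights per voxel directly says how much that question's answer drives that voxel's response.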
Simone Scardapane (@s_scardapane)'s Twitter Profile Photo

*Harnessing the Universal Geometry of Embeddings*
by Rishi Jha, Jack Morris, Vitaly Shmatikov

With the proper set of losses, text embeddings from different models can be aligned with no paired data (what they call the "strong" Platonic hypothesis).

arxiv.org/abs/2505.12540
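For contrast, the classical *paired* baseline for aligning two embedding spaces is orthogonal Procrustes, which requires embeddings of the same texts from both models; the thread's point is that the paper manages alignment without such pairs. A minimal sketch of that paired baseline, for reference only:

```python
import numpy as np

def procrustes_align(A, B):
    """Orthogonal map R minimizing ||A @ R - B||_F, given PAIRED rows.

    A, B: (n, d) embeddings of the same n texts from two different models
    (dimensions assumed already matched). The solution is the orthogonal
    factor of the polar decomposition of A.T @ B.
    """
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt
```

The "strong" Platonic hypothesis is precisely that such a map can be recovered even when no row correspondence between A and B is available.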
Rajvi Agravat (@rajviagravat)'s Twitter Profile Photo

If you're at #SNL2025 and curious about speech and music perception or its representation in developing brains, stop by my poster in Session C – #77! :)

Sam Norman-Haignere (@samnormanh)'s Twitter Profile Photo

Human auditory cortex integrates information in speech across absolute time (e.g., 200 ms), not phonemes, syllables, words, or any other time-varying speech structure: nature.com/articles/s4159…

Anna Ivanova (@neuranna)'s Twitter Profile Photo

As our lab started to build encoding 🧠 models, we were trying to figure out best practices in the field. So Taha Binhuraib 🦉 built a library to easily compare design choices & model features across datasets! We hope it will be useful to the community & plan to keep expanding it! 1/