Wenjie Li (@wenjiewli) 's Twitter Profile
Wenjie Li

@wenjiewli

Neural Computation PhD student at @CarnegieMellon | prev: Research Scientist at @NYUDataScience with @LakeBrenden. Working on concept learning

ID: 1373061637742153729

Link: http://wenjieli.me | Joined: 19-03-2021 23:59:53

50 Tweets

119 Followers

318 Following

Katherine Hermann (@khermann_) 's Twitter Profile Photo

How should we build models more aligned with human cognition? To induce more human-like behaviors & representations, let’s train embodied, interactive agents in rich, ethologically-relevant environments. w/ Aran Nayebi, Sjoerd van Steenkiste, & Matt Jones psyarxiv.com/a35mt/

Amirhossein Kazemnejad (@a_kazemnejad) 's Twitter Profile Photo

🚨Stop using positional encoding (PE) in Transformer decoders (e.g. GPTs). Our work shows 𝗡𝗼𝗣𝗘 (no positional encoding) outperforms all variants like absolute, relative, ALiBi, Rotary. A decoder can learn PE in its representation (see proof). Time for 𝗡𝗼𝗣𝗘 𝗟𝗟𝗠𝘀🧵[1/n]
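For readers curious what NoPE looks like in practice, here is a minimal PyTorch sketch (my own illustration, not the paper's code): token embeddings feed a causal decoder block with no absolute, relative, ALiBi, or rotary position signal added anywhere, so the causal attention mask is the only ordering cue the model gets.

import torch
import torch.nn as nn

class NoPEDecoderBlock(nn.Module):
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.ln1, self.ln2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x):
        T = x.size(1)
        causal = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, attn_mask=causal)  # the causal mask is the only order signal
        x = x + a
        return x + self.mlp(self.ln2(x))

class NoPELM(nn.Module):
    def __init__(self, vocab=1000, d_model=64, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)    # note: no positional embedding table
        self.blocks = nn.ModuleList(NoPEDecoderBlock(d_model) for _ in range(n_layers))
        self.head = nn.Linear(d_model, vocab)

    def forward(self, ids):
        x = self.embed(ids)                          # nothing position-dependent is added here
        for blk in self.blocks:
            x = blk(x)
        return self.head(x)

logits = NoPELM()(torch.randint(0, 1000, (2, 16)))   # (batch=2, seq=16, vocab=1000)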

Yulia Gryaditskaya (@ygryaditskaya) 's Twitter Profile Photo

How well can automatic methods today understand hand-drawn abstract scene sketches? Paper: arxiv.org/abs/2312.12463 Code: ahmedbourouis.github.io/Scene_Sketch_S… Work by my Ph.D. student Ahmed Bourouis, in collaboration with Judy Fan #cvssp #PAI

Siva Reddy (@sivareddyg) 's Twitter Profile Photo

Glad to see that NoPE still holds its superiority over RoPE on length generalization as we also find in [1]. But this FIRE is truly on fire 🔥 [1] x.com/a_kazemnejad/s…

Brenden Lake (@lakebrenden) 's Twitter Profile Photo

Implications: Despite these limitations, we find that a learner can get a strong start in acquiring visual models from a child's input without strong, domain-specific inductive biases. Future algorithmic advances, combined with richer and larger developmental datasets, can be

steven t. piantadosi (@spiantado) 's Twitter Profile Photo

New perspective in Nature Reviews Psychology: human intelligence is a matter of scale of information processing, not genetic changes to one domain. Implications for AI, evolution, and development. - with @CantlonLab rdcu.be/dDoBt

Andrew Saxe (@saxelab) 's Twitter Profile Photo

New preprint! Transformers need multiple circuit mechanisms to improve simultaneously to reduce the loss. Because of this, hypotheses about circuit mechanisms can be tested by clamping some during learning and watching the loss dynamics. A toy model illustrates this ⬇️
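A toy sketch of the clamping idea (my own illustration under assumed details, not the preprint's setup): freeze one candidate mechanism's parameters during training and compare the resulting loss trajectory with unconstrained training.

import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 10)
Y = X @ torch.randn(10, 5)                          # toy regression target

def train(clamp_first_layer, steps=200):
    net = nn.Sequential(nn.Linear(10, 32), nn.Tanh(), nn.Linear(32, 5))
    if clamp_first_layer:                           # "clamp" one candidate mechanism
        for p in net[0].parameters():
            p.requires_grad_(False)
    opt = torch.optim.SGD([p for p in net.parameters() if p.requires_grad], lr=0.05)
    losses = []
    for _ in range(steps):
        loss = nn.functional.mse_loss(net(X), Y)
        opt.zero_grad()
        loss.backward()
        opt.step()
        losses.append(loss.item())
    return losses

free = train(clamp_first_layer=False)
clamped = train(clamp_first_layer=True)
print(f"final loss, all params free: {free[-1]:.4f}  vs. first layer clamped: {clamped[-1]:.4f}")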

Wenjie Li (@wenjiewli) 's Twitter Profile Photo

So excited to be @CogSci2024! Come chat with me about common sense reasoning, deep learning, anything cogsci and more :)

Tal Linzen (@tallinzen) 's Twitter Profile Photo

Can LMs serve as cognitive models of human language processing? Humans make syntactic agreement errors ("the key to the cabinets are rusty"). Suhas Arehalli and I tested if the errors documented in six human studies emerge in LMs. They... sometimes did. direct.mit.edu/opmi/article/d…
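One way to run this kind of probe (a rough sketch assuming the Hugging Face transformers library and GPT-2, not the models or stimuli used in the paper): compare the next-token probability of the singular vs. plural verb after a prefix containing an attractor noun.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

prefix = "The key to the cabinets"
ids = tok(prefix, return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits[0, -1]               # distribution over the next token

probs = logits.softmax(-1)
p_is = probs[tok.encode(" is")[0]].item()           # grammatical (singular) continuation
p_are = probs[tok.encode(" are")[0]].item()         # attraction-error (plural) continuation
print(f"P(' is')={p_is:.4f}  P(' are')={p_are:.4f}  error share={p_are / (p_is + p_are):.2%}")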

Laura Ruis (@lauraruis) 's Twitter Profile Photo

How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️

Rohan Paul (@rohanpaul_ai) 's Twitter Profile Photo

Your LLM isn't doing math - it's using clever pattern matching tricks. LLMs perform arithmetic using neither robust algorithms nor memorization; rather, they rely on a “bag of heuristics”, as proposed in this paper. 🤔 Original Problem: Do LLMs solve reasoning tasks using

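To make the "bag of heuristics" idea concrete, here is a toy illustration (my own, not the circuits identified in the paper): two partial rules, one for the units digit and one for the tens, stitched together without an explicit carry. The combination is exactly right on carry-free additions and systematically off by ten otherwise.

def units_rule(a, b):
    return (a + b) % 10                      # a rule that only handles the last digit

def tens_rule(a, b):
    return (a // 10 + b // 10) * 10          # a rule that adds the tens, ignoring any carry

def bag_of_heuristics_add(a, b):
    # stitch the partial answers together; no rule implements the carry
    return tens_rule(a, b) + units_rule(a, b)

for a, b in [(23, 45), (51, 38), (17, 66), (48, 37)]:
    guess, truth = bag_of_heuristics_add(a, b), a + b
    verdict = "ok" if guess == truth else f"off by {truth - guess}"
    print(f"{a}+{b}: heuristic={guess}, true={truth}, {verdict}")
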
Andrew Lampinen (@andrewlampinen) 's Twitter Profile Photo

Lots more detail in the preprint here: arxiv.org/abs/2502.20349 Thanks to Wilka Carvalho for spurring and spearheading this work, and to the many others who gave us thoughtful feedback! This paper is definitely a work in progress, so comments and suggestions are welcome!

Griffiths Computational Cognitive Science Lab (@cocosci_lab) 's Twitter Profile Photo

New preprint reveals that large language models blend two distinct representations of numbers -- as strings and as integers -- which can lead to some surprising errors. This work shows how methods from cognitive science can be useful for understanding AI systems.
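A small illustration of why blending the two views can mislead (my own example, not the preprint's analysis): under edit distance on digit strings, 1000 is "closer" to 9000 than to 999, while the integer view says the opposite.

def edit_distance(s, t):
    # standard dynamic-programming Levenshtein distance between two strings
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (cs != ct)))
        prev = cur
    return prev[-1]

for a, b in [(1000, 999), (1000, 9000), (123, 124), (123, 923)]:
    print(f"{a} vs {b}: |a-b|={abs(a - b):>4}   string edit distance={edit_distance(str(a), str(b))}")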

Peter Tong (@tongpetersb) 's Twitter Profile Photo

Vision models have been smaller than language models; what if we scale them up? Introducing Web-SSL: A family of billion-scale SSL vision models (up to 7B parameters) trained on billions of images without language supervision, using VQA to evaluate the learned representation.

Brenden Lake (@lakebrenden) 's Twitter Profile Photo

I'm joining Princeton as an Associate Professor of Computer Science and Psychology this fall! Princeton is ambitiously investing in AI and Natural & Artificial Minds, and I'm excited for my lab to contribute. Recruiting postdocs and Ph.D. students in CS and Psychology — join us!

Andrew Lampinen (@andrewlampinen) 's Twitter Profile Photo

In our symbolic behaviour in AI paper (arxiv.org/abs/2102.03406), we argued that AI should take a similar perspective: building deep learning systems that treat symbols as subjective tools that they can employ as a means to an end.