Wenjie Li (@wenjiewli) 's Twitter Profile
Wenjie Li

@wenjiewli

Neural Computation PhD student at @CarnegieMellon | prev: Research Scientist at @NYUDataScience with @LakeBrenden. Working on concept learning

ID: 1373061637742153729

Link: http://wenjieli.me | Joined: 19-03-2021 23:59:53

50 Tweets

119 Followers

318 Following

Katherine Hermann (@khermann_) 's Twitter Profile Photo

How should we build models more aligned with human cognition? To induce more human-like behaviors & representations, let’s train embodied, interactive agents in rich, ethologically-relevant environments. w/ Aran Nayebi, Sjoerd van Steenkiste, & Matt Jones psyarxiv.com/a35mt/

Amirhossein Kazemnejad (@a_kazemnejad) 's Twitter Profile Photo

🚨Stop using positional encoding (PE) in Transformer decoders (e.g. GPTs). Our work shows 𝗡𝗼𝗣𝗘 (no positional encoding) outperforms all variants like absolute, relative, ALiBi, Rotary. A decoder can learn PE in its representation (see proof). Time for 𝗡𝗼𝗣𝗘 𝗟𝗟𝗠𝘀🧵[1/n]
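For readers curious what NoPE looks like in practice, here is a minimal PyTorch sketch (my own illustration, not the paper's code): token embeddings feed a causal decoder block with no absolute, relative, ALiBi, or rotary position signal added anywhere, so the causal attention mask is the only ordering cue the model gets.

import torch
import torch.nn as nn

class NoPEDecoderBlock(nn.Module):
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model))
        self.ln1, self.ln2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

    def forward(self, x):
        T = x.size(1)
        causal = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, attn_mask=causal)  # the causal mask is the only order signal
        x = x + a
        return x + self.mlp(self.ln2(x))

class NoPELM(nn.Module):
    def __init__(self, vocab=1000, d_model=64, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)    # note: no positional embedding table
        self.blocks = nn.ModuleList(NoPEDecoderBlock(d_model) for _ in range(n_layers))
        self.head = nn.Linear(d_model, vocab)

    def forward(self, ids):
        x = self.embed(ids)                          # nothing position-dependent is added here
        for blk in self.blocks:
            x = blk(x)
        return self.head(x)

logits = NoPELM()(torch.randint(0, 1000, (2, 16)))   # (batch=2, seq=16, vocab=1000)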

Yulia Gryaditskaya (@ygryaditskaya) 's Twitter Profile Photo

How well can automatic methods today understand hand-drawn abstract scene sketches? Paper: arxiv.org/abs/2312.12463 Code: ahmedbourouis.github.io/Scene_Sketch_S… Work by my Ph.D. student Ahmed Bourouis, in collaboration with Judy Fan #cvssp #PAI

Siva Reddy (@sivareddyg) 's Twitter Profile Photo

Glad to see that NoPE still holds its superiority over RoPE on length generalization as we also find in [1]. But this FIRE is truly on fire 🔥 [1] x.com/a_kazemnejad/s…

Brenden Lake (@lakebrenden) 's Twitter Profile Photo

Implications: Despite these limitations, we find that a learner can get a strong start in acquiring visual models from a child's input without strong, domain-specific inductive biases. Future algorithmic advances, combined with richer and larger developmental datasets, can be

steven t. piantadosi (@spiantado) 's Twitter Profile Photo

New perspective in Nature Reviews Psychology: human intelligence is a matter of scale of information processing, not genetic changes to one domain. Implications for AI, evolution, and development. - with @CantlonLab rdcu.be/dDoBt

Andrew Saxe (@saxelab) 's Twitter Profile Photo

New preprint! Transformers need multiple circuit mechanisms to improve simultaneously to reduce the loss. Because of this, hypotheses about circuit mechanisms can be tested by clamping some during learning and watching the loss dynamics. A toy model illustrates this ⬇️
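A toy sketch of the clamping idea (my own illustration under assumed details, not the preprint's setup): freeze one candidate mechanism's parameters during training and compare the resulting loss trajectory with unconstrained training.

import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 10)
Y = X @ torch.randn(10, 5)                          # toy regression target

def train(clamp_first_layer, steps=200):
    net = nn.Sequential(nn.Linear(10, 32), nn.Tanh(), nn.Linear(32, 5))
    if clamp_first_layer:                           # "clamp" one candidate mechanism
        for p in net[0].parameters():
            p.requires_grad_(False)
    opt = torch.optim.SGD([p for p in net.parameters() if p.requires_grad], lr=0.05)
    losses = []
    for _ in range(steps):
        loss = nn.functional.mse_loss(net(X), Y)
        opt.zero_grad()
        loss.backward()
        opt.step()
        losses.append(loss.item())
    return losses

free = train(clamp_first_layer=False)
clamped = train(clamp_first_layer=True)
print(f"final loss, all params free: {free[-1]:.4f}  vs. first layer clamped: {clamped[-1]:.4f}")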

Wenjie Li (@wenjiewli) 's Twitter Profile Photo

So excited to be @CogSci2024! Come chat with me about common sense reasoning, deep learning, anything cogsci and more :)

Tal Linzen (@tallinzen) 's Twitter Profile Photo

Can LMs serve as cognitive models of human language processing? Humans make syntactic agreement errors ("the key to the cabinets are rusty"). Suhas Arehalli and I tested if the errors documented in six human studies emerge in LMs. They... sometimes did. direct.mit.edu/opmi/article/d…
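One way to run this kind of probe (a rough sketch assuming the Hugging Face transformers library and GPT-2, not the models or stimuli used in the paper): compare the next-token probability of the singular vs. plural verb after a prefix containing an attractor noun.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

prefix = "The key to the cabinets"
ids = tok(prefix, return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits[0, -1]               # distribution over the next token

probs = logits.softmax(-1)
p_is = probs[tok.encode(" is")[0]].item()           # grammatical (singular) continuation
p_are = probs[tok.encode(" are")[0]].item()         # attraction-error (plural) continuation
print(f"P(' is')={p_is:.4f}  P(' are')={p_are:.4f}  error share={p_are / (p_is + p_are):.2%}")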

Laura Ruis (@lauraruis) 's Twitter Profile Photo

How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️

Rohan Paul (@rohanpaul_ai) 's Twitter Profile Photo

Your LLM isn't doing math - it's using clever pattern matching tricks. LLMs perform arithmetic using neither robust algorithms nor memorization; rather, they rely on a “bag of heuristics”, as proposed in this paper. 🤔 Original Problem: Do LLMs solve reasoning tasks using

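To make the "bag of heuristics" idea concrete, here is a toy illustration (my own, not the circuits identified in the paper): two partial rules, one for the units digit and one for the tens, stitched together without an explicit carry. The combination is exactly right on carry-free additions and systematically off by ten otherwise.

def units_rule(a, b):
    return (a + b) % 10                      # a rule that only handles the last digit

def tens_rule(a, b):
    return (a // 10 + b // 10) * 10          # a rule that adds the tens, ignoring any carry

def bag_of_heuristics_add(a, b):
    # stitch the partial answers together; no rule implements the carry
    return tens_rule(a, b) + units_rule(a, b)

for a, b in [(23, 45), (51, 38), (17, 66), (48, 37)]:
    guess, truth = bag_of_heuristics_add(a, b), a + b
    verdict = "ok" if guess == truth else f"off by {truth - guess}"
    print(f"{a}+{b}: heuristic={guess}, true={truth}, {verdict}")
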
Andrew Lampinen (@andrewlampinen) 's Twitter Profile Photo

Lots more detail in the preprint here: arxiv.org/abs/2502.20349 Thanks to Wilka Carvalho for spurring and spearheading this work, and to the many others who gave us thoughtful feedback! This paper is definitely a work in progress, so comments and suggestions are welcome!

Griffiths Computational Cognitive Science Lab (@cocosci_lab) 's Twitter Profile Photo

New preprint reveals that large language models blend two distinct representations of numbers -- as strings and as integers -- which can lead to some surprising errors. This work shows how methods from cognitive science can be useful for understanding AI systems.
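A small illustration of why blending the two views can mislead (my own example, not the preprint's analysis): under edit distance on digit strings, 1000 is "closer" to 9000 than to 999, while the integer view says the opposite.

def edit_distance(s, t):
    # standard dynamic-programming Levenshtein distance between two strings
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (cs != ct)))
        prev = cur
    return prev[-1]

for a, b in [(1000, 999), (1000, 9000), (123, 124), (123, 923)]:
    print(f"{a} vs {b}: |a-b|={abs(a - b):>4}   string edit distance={edit_distance(str(a), str(b))}")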

Peter Tong (@tongpetersb) 's Twitter Profile Photo

Vision models have been smaller than language models; what if we scale them up? Introducing Web-SSL: A family of billion-scale SSL vision models (up to 7B parameters) trained on billions of images without language supervision, using VQA to evaluate the learned representation.

Brenden Lake (@lakebrenden) 's Twitter Profile Photo

I'm joining Princeton as an Associate Professor of Computer Science and Psychology this fall! Princeton is ambitiously investing in AI and Natural & Artificial Minds, and I'm excited for my lab to contribute. Recruiting postdocs and Ph.D. students in CS and Psychology — join us!

Andrew Lampinen (@andrewlampinen) 's Twitter Profile Photo

In our symbolic behaviour in AI paper (arxiv.org/abs/2102.03406), we argued that AI should take a similar perspective: building deep learning systems that treat symbols as subjective tools that they can employ as a means to an end.