Pranava Madhyastha (@foobarin)'s Twitter Profile
Pranava Madhyastha

@foobarin

NLP and Multimodal Machine Learning. Asst. Professor @CityUniLondon

ID: 3556130595

Website: https://pmadhyastha.github.io/ · Joined: 05-09-2015 14:15:55

391 Tweets

147 Followers

986 Following

Armen Aghajanyan (@armenagha)'s Twitter Profile Photo

One core learning we had with Chameleon is that the intended form of the modality is a modality in itself. Visual Perception and Visual Generation are two separate modalities and must be treated as such; hence, using discretized tokens for perception is wrong.

Bartłomiej Cupiał (@cupiabart)'s Twitter Profile Photo

So here's a story of, by far, the weirdest bug I've encountered in my CS career.

Along with Maciej Wołczyk we've been training a neural network that learns how to play NetHack, an old roguelike game that looks like the one in the screenshot. Recently, something unexpected happened.
Dr Christopher Madan 🐘🧠💻 (he/him) (@cmadan)'s Twitter Profile Photo

"A random half of panelists were shown a CV and only a one-paragraph summary of the proposed research, while the other half were shown a CV and a full proposal. We find that withholding proposal texts from panelists did not detectibly impact rankings." link.springer.com/article/10.100…

Arvind Narayanan (@random_walker)'s Twitter Profile Photo

New essay: ML seems to promise discovery without understanding, but this is fool's gold that has led to a reproducibility crisis in ML-based science. aisnakeoil.com/p/scientists-s… (with Sayash Kapoor).

In 2021 we compiled evidence that an error called leakage is pervasive in ML models.
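
To make the error concrete, here is a minimal, hypothetical sketch of one common form of leakage (my own illustration, not taken from the essay): fitting a preprocessing step on the full dataset before splitting, so test-set statistics leak into the features the model is trained on.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))        # toy features
y = rng.integers(0, 2, size=200)     # toy binary labels

# LEAKY: the scaler is fit on all 200 rows, so test-set statistics
# (means and variances) leak into the training features.
X_leaky = StandardScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X_leaky, y, random_state=0)

# CORRECT: split first, then fit the preprocessing on training rows only.
X_tr_raw, X_te_raw, y_tr, y_te = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_tr_raw)
X_tr, X_te = scaler.transform(X_tr_raw), scaler.transform(X_te_raw)

model = LogisticRegression().fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))
```

On toy noise the two pipelines score similarly, but on real data the leaky variant can quietly inflate reported accuracy, which is the kind of reproducibility failure the essay documents.
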
Christopher Manning (@chrmanning)'s Twitter Profile Photo

I agree with much of both @emilymbender.bsky.social’s #ACL2024 presidential talk and (((ل()(ل() 'yoav))))👾’s rejoinder, but I want to comment on just one aspect where I disagree with both: the definition and domain of CL vs NLP. 🧵👇

Tal Linzen (@tallinzen)'s Twitter Profile Photo

as SAC for EMNLP I was asked to read the discussions between the authors and reviewers and had every intention to do so but the length of the discussions is out of control. many tables with results of new experiments, hundreds of lines of code (!). bring back word limits please.

Delip Rao e/σ (@deliprao)'s Twitter Profile Photo

Unless you are an OpenAI employee working on improving their products, I don’t understand why such efforts are science. Why are we (question to faculty) spending taxpayer dollars in doing QA for a closed product by a well-capitalized company that does not give back to science?

kyutai (@kyutai_labs)'s Twitter Profile Photo

Today, we release several Moshi artifacts: a long technical report with all the details behind our model, weights for Moshi and its Mimi codec, along with streaming inference code in PyTorch, Rust and MLX. More details below 🧵⬇️

Paper: kyutai.org/Moshi.pdf
Repo:

Emile van Krieken (@emilevankrieken)'s Twitter Profile Photo

Great workshop at AAAI about low-rank representations! These have important consequences for neurosymbolic AI: logical circuits can be understood as low-rank factorisations.
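
As a loose illustration of that reading (my own toy example, not from the workshop), the truth table of a Boolean operation can be viewed as a matrix indexed by its inputs, and some operations then factorise with very low rank:

```python
import numpy as np

# AND's truth table is the rank-1 outer product of [0, 1] with itself.
v = np.array([0.0, 1.0])
and_table = np.outer(v, v)                      # [[0, 0], [0, 1]]
print(np.linalg.matrix_rank(and_table))         # 1

# XOR, by contrast, needs full rank 2 over the reals.
xor_table = np.array([[0.0, 1.0], [1.0, 0.0]])
print(np.linalg.matrix_rank(xor_table))         # 2
```

The workshop's claim concerns whole circuits rather than single gates, but the same idea scales up: low rank in such factorisations is what keeps the representation compact and tractable.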

Edward Grefenstette (@egrefen)'s Twitter Profile Photo

This “English(/French/etc) is the new code” trope bemuses me, even as someone bullish about LLMs substantially changing how we code. A 🧵(1/9)

antonio vergari - hiring PhD students (@tetraduzione)'s Twitter Profile Photo

great opportunity to do a #PhD in #Europe in #ML #AI 🚨🚨🚨

I'll hire 2 students via #ELLIS this year; reach out if you want to do research in:

- #reliable and #efficient ML in the wild
- scalable #neurosymbolic #nesy AI
- #lowrank representations
- #tractable inference

(((ل()(ل() 'yoav))))👾 (@yoavgo)'s Twitter Profile Photo

i think this is the wrong question. yes, CS graduates are very bad at software development, and a dedicated LLM can be better. but put these graduates in a job, and some will develop into "senior devs" at some point, capable of working on real-life, large systems. LLMs won't.

(((ل()(ل() 'yoav))))👾 (@yoavgo)'s Twitter Profile Photo

the term "experts" in "mixture of experts" in the context of LLMs is highly misleading and does way more harm than good in coming up with a conceptual representation of what this component brings to the table.

Christopher Manning (@chrmanning)'s Twitter Profile Photo

Papers at #EMNLP2024 #3

A counter-example to the frequently adopted mech interp linear representation hypothesis: Recurrent Neural Networks Learn … Non-Linear Representations

Fri Nov 15, BlackboxNLP 2024 poster

aclanthology.org/2024.blackboxn…

CC Csordás Róbert, Christopher Potts

(((ل()(ل() 'yoav))))👾 (@yoavgo)'s Twitter Profile Photo

(a) how did MMLU become the de facto standard benchmark every LLM is trying to beat? (b) it is estimated that 9% of its questions are ones human experts think are wrong. do we know if humans and models agree on which questions fall in that 9%?