Kristian Georgiev (@kris_georgiev1)'s Twitter Profile
Kristian Georgiev

@kris_georgiev1

Research Scientist @OpenAI | on leave from PhD at @MIT

ID: 1252659231452393472

Website: http://kristian-georgiev.github.io | Joined: 21-04-2020 18:03:36

66 Tweets

428 Followers

569 Following

Aleksander Madry (@aleks_madry)

In ML, we train on biased (huge) datasets ➡️ models encode spurious correlations and fail on minority groups. Can we scalably remove "bad" data?

w/ Saachi Jain, Kimia Hamidieh, Kristian Georgiev, Andrew Ilyas, and Marzyeh Ghassemi, we propose D3M, a method for exactly this: gradientscience.org/d3m/
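
For intuition, here is a minimal sketch of the recipe the post describes: use a data-attribution method to estimate each training example's effect on worst-group loss, drop the most harmful examples, and retrain. The `attributions` matrix and all names are hypothetical stand-ins, not the authors' implementation.

```python
import numpy as np

def d3m_style_harm_scores(attributions, val_group_ids, worst_group):
    """Score each training example's estimated harm to the worst group.

    attributions:  (n_train, n_val) matrix whose (i, j) entry estimates how
        much including training example i increases the loss on validation
        example j (hypothetical output of a data-attribution method).
    val_group_ids: (n_val,) group label of each validation example.
    worst_group:   label of the group with the worst validation accuracy.
    """
    # Mean estimated effect of each training example on worst-group loss.
    return attributions[:, val_group_ids == worst_group].mean(axis=1)

# Debiasing then amounts to dropping the top-k most harmful examples
# and retraining, e.g.:
# keep = np.argsort(d3m_style_harm_scores(A, gids, g_worst))[:-k]
```
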
Eric Wong (@riceric22)

Traditional concept vectors used to explain deep representations fail to compose when combined, e.g. 🐤(small) + 🦢(white) = 🦩(big & colorful) ❌

We propose CCE: a method for extracting *composable* concepts, e.g. 🐤(small) + 🦢(white) = 🕊️(small & white) ✅

debugml.github.io/compositional-…
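
The emoji arithmetic has a concrete vector-space reading: if concepts are directions in representation space, adding two of them should land near the embedding of their composition. A toy check of that property (all inputs are hypothetical; this illustrates the compositionality test, not the CCE method itself):

```python
import numpy as np

def nearest_class_after_composition(concept_a, concept_b,
                                    class_embeddings, class_names):
    """Add two concept directions and report the closest class embedding
    (by cosine similarity) to the composed vector."""
    composed = concept_a + concept_b
    composed = composed / np.linalg.norm(composed)
    embs = class_embeddings / np.linalg.norm(
        class_embeddings, axis=1, keepdims=True)
    return class_names[int(np.argmax(embs @ composed))]

# With standard concept vectors, the tweet's failure mode would surface as
# nearest_class_after_composition(small, white, ...) == "flamingo";
# with composable concepts it should return "dove".
```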

Aleksander Madry (@aleks_madry)

Excited to share something I have been working on for a while now: the Before AGI podcast! There is a lot of talk about AGI. But what’s less explored is an equally important question: What happens before AGI arrives? And what should we be doing to prepare? First episode with …

Aleksander Madry (@aleks_madry)

At #ICML2024? Our tutorial "Data Attribution at Scale" is tomorrow at 9:30 AM CEST in Hall A1! I will not be able to make it (I arrive later that day), but my awesome students Andrew Ilyas, Sam Park, and Logan Engstrom will carry the torch :)

Aleksander Madry (@aleks_madry)

Attending #ICML2024? Check out our work on decomposing predictions and editing model behavior via targeted interventions to model internals!

Poster: #2513, Hall C 4-9, 1:30p (Tue)  
Paper: arxiv.org/abs/2404.11534
w/ Harshay Shah, Andrew Ilyas
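
At a high level, component attribution of this kind can be sketched as: ablate random subsets of model components, record the resulting prediction, and fit a linear surrogate whose coefficients estimate each component's contribution; targeted edits then ablate the components that push the behavior the wrong way. A hedged sketch (the `eval_with_mask` helper is hypothetical, and this is an illustration rather than the paper's code):

```python
import numpy as np

def fit_component_surrogate(eval_with_mask, n_components,
                            n_samples=1000, ablate_frac=0.1, seed=0):
    """Fit a linear map from component-ablation masks to a model output.

    eval_with_mask: hypothetical callable; given a 0/1 mask over model
        components (1 = keep, 0 = ablate), returns a scalar output such
        as the logit margin of the model on a fixed example.
    """
    rng = np.random.default_rng(seed)
    masks = (rng.random((n_samples, n_components)) > ablate_frac).astype(float)
    outputs = np.array([eval_with_mask(m) for m in masks])
    # Least-squares surrogate: outputs ~= masks @ w + b.
    design = np.hstack([masks, np.ones((n_samples, 1))])
    coef, *_ = np.linalg.lstsq(design, outputs, rcond=None)
    w, b = coef[:-1], coef[-1]
    # Large |w[i]| marks components that matter most for this prediction.
    return w, b
```
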
Alireza Fallah (@afallah94)

On Saturday, I will be giving a talk at 11:40 am at the ICML @agenticmarkets workshop on our work on three-layer data markets: arxiv.org/abs/2402.09697.
I will discuss a model to study data monetization and the role of privacy regulations in the presence of strategic agents!
Andrew Ilyas (@andrew_ilyas)

Thanks to all who attended our tutorial "Data Attribution at Scale" at ICML (w/ Sam Park, Logan Engstrom, Kristian Georgiev, Aleksander Madry)! We're really excited to see the response to this emerging topic.

Slides, notes, ICML video: ml-data-tutorial.org
Public recording soon!
Andrew Ilyas (@andrew_ilyas)

The ATTRIB workshop is back @ NeurIPS 2024! We welcome papers connecting model behavior to data, algorithms, parameters, scale, or anything else. Submit by Sep 18!

More info: attrib-workshop.cc

Co-organizers: Tolga Bolukbasi, Logan Engstrom, Sadhika Malladi, Elisa Nguyen, Sam Park
AK (@_akhaliq)

ContextCite: Attributing Model Generation to Context

Paper page: huggingface.co/papers/2409.00…

How do language models use information provided as context when generating a response? Can we infer whether a particular generated statement is actually grounded in the context, a…
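
The mechanism, in a minimal sketch (the `score_fn` helper and constants are hypothetical, not the paper's API): split the context into sources, ablate random subsets, score how likely the model still finds the original statement, and fit a sparse linear surrogate whose weights attribute the statement to individual sources.

```python
import numpy as np
from sklearn.linear_model import Lasso

def context_attributions(score_fn, n_sources, n_samples=64, seed=0):
    """Attribute a generated statement to individual context sources.

    score_fn: hypothetical callable; given a 0/1 vector over context
        sources (1 = source kept in the prompt), returns e.g. the model's
        log-probability of the originally generated statement.
    """
    rng = np.random.default_rng(seed)
    masks = rng.integers(0, 2, size=(n_samples, n_sources)).astype(float)
    scores = np.array([score_fn(m) for m in masks])
    # Sparse surrogate: a handful of sources should explain the score.
    surrogate = Lasso(alpha=0.01).fit(masks, scores)
    return surrogate.coef_  # larger weight = statement leans on that source
```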

Andrew Ilyas (@andrew_ilyas)

Machine unlearning ("removing" training data from a trained ML model) is a hard, important problem. Datamodel Matching (DMM): a new unlearning paradigm with strong empirical performance!

w/ Kristian Georgiev, Roy Rinberg, Sam Park, Shivam Garg, Aleksander Madry, Seth Neel (1/4)

Andrew Ilyas (@andrew_ilyas)

DMM is a *meta-algorithm*, so better data attribution ➡️ better oracle predictions ➡️ better unlearning!

Check out our work for details on DMM, new techniques for evaluating unlearning, theoretical analyses, and more!

arXiv: arxiv.org/abs/2410.23232
Blog: bit.ly/unlearning-via…
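
In caricature, the oracle step is simple to write down with a linear datamodel (a hedged sketch; the weight matrix is a hypothetical attribution output, not the paper's system):

```python
import numpy as np

def oracle_predictions(datamodel_weights, forget_idx, n_train):
    """Predict outputs of a model retrained WITHOUT the forget set.

    datamodel_weights: (n_eval, n_train) linear datamodel; row j predicts
        eval output j as a weighted sum over the included training examples
        (hypothetical output of a data-attribution method).
    """
    mask = np.ones(n_train)
    mask[np.asarray(forget_idx)] = 0.0  # simulate dropping the forget set
    return datamodel_weights @ mask

# DMM then fine-tunes the existing model to match these predicted
# "retrained from scratch" outputs: better attribution -> better targets.
```
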
Gautam Kamath (@thegautamkamath)

I wrote a survey article on computationally efficient methods for "robust" mean estimation, which includes robustness to contamination, heavy-tailed data, or in the sense of differential privacy. 

The same ideas are useful for all 3 (seemingly-different) forms of robustness! 1/2
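
One idea that recurs across all three settings is spectral filtering: outliers that shift the mean must inflate variance along some direction, so examine the top eigenvector of the empirical covariance and prune the most extreme points. A toy, non-private sketch assuming roughly identity-covariance inliers (real filters use more careful thresholds and weights):

```python
import numpy as np

def filtered_mean(X, var_threshold=1.5, max_iter=50):
    """Toy spectral filter for robust mean estimation (illustrative only).

    Repeatedly find the direction of largest variance; if the variance
    there looks too large for clean data, drop the point most extreme
    along that direction and try again.
    """
    X = np.asarray(X, dtype=float).copy()
    for _ in range(max_iter):
        mu = X.mean(axis=0)
        cov = np.cov(X, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)
        if eigvals[-1] <= var_threshold:  # no suspicious direction left
            break
        v = eigvecs[:, -1]
        proj = np.abs((X - mu) @ v)
        X = np.delete(X, np.argmax(proj), axis=0)
    return X.mean(axis=0)
```
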
Seth Neel (@sethinternet)

Excited to see this new paper published in Transactions on Machine Learning Research! I study the problem of simultaneously estimating many private regressions that share the same set of covariates X but have l different outcomes Y. For example, X might be a person's genomic data, and the Y's might correspond to …
Sitan Chen (@sitanch)

Clean mathematical explanation for why critical windows appear for any localization-based sampler! Also interesting that in LLMs, these often correspond to failures in reasoning. Congrats to Marvin Li and Aayush Karan on the cool mix of theory and empirical work!

Nat McAleese (@__nmca__)

Large reasoning models are extremely good at reward hacking. A thread of examples from OpenAI's recent monitoring paper (0/n):

Logan Engstrom (@logan_engstrom)

Want state-of-the-art data curation, data poisoning & more? Just do gradient descent!

w/ Andrew Ilyas, Ben Chen, Axel Feldmann, Billy Moses, Aleksander Madry: we show how to optimize final model loss wrt any continuous variable.

Key idea: Metagradients (grads through model training)
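
A minimal sketch of the idea, with per-example data weights as the continuous variable (toy PyTorch on a tiny linear model; not the paper's system): unroll a few SGD steps, keep them in the autograd graph, and backprop the final validation loss all the way to the weights.

```python
import torch

def metagradient_wrt_data_weights(X, y, X_val, y_val, steps=10, lr=0.1):
    """Gradient of the final validation loss w.r.t. per-example data
    weights, computed by differentiating through an unrolled SGD run."""
    n, d = X.shape
    data_w = torch.zeros(n, requires_grad=True)  # the continuous variable
    w = torch.zeros(d, requires_grad=True)       # model parameters
    for _ in range(steps):
        per_example = (X @ w - y) ** 2
        train_loss = (torch.softmax(data_w, dim=0) * per_example).sum()
        (g,) = torch.autograd.grad(train_loss, w, create_graph=True)
        w = w - lr * g  # keep the update inside the autograd graph
    val_loss = ((X_val @ w - y_val) ** 2).mean()
    return torch.autograd.grad(val_loss, data_w)[0]  # the metagradient
```

Descending on this metagradient up-weights helpful examples (data curation); ascending on it is exactly a data-poisoning objective.
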
Joe Palermo (@joepalerm0)

We’ve been hard at work on reinforcement fine-tuning (RFT) to make it a more flexible and powerful tool. RFT is best thought of as a way to improve model capabilities on well-specified tasks with known correct answers. It shouldn't come as a surprise that models can often get …

Marc Finzi (@m_finzi)

Why do larger language models generalize better? In our new ICLR paper, we derive an interpretable generalization bound showing that compute-optimal LLMs provably generalize better with scale! 📄 arxiv.org/abs/2504.15208 1/7 🧵