Aleksander Madry (@aleks_madry)'s Twitter Profile
Aleksander Madry

@aleks_madry

OpenAI and MIT faculty (on leave) @aleksmadry.bsky.social

ID: 882511862524465152

Website: https://madrylab.mit.edu/ · Joined: 05-07-2017 08:09:59

924 Tweets

34.34K Followers

200 Following

Ben Cohen-Wang (@bcohenwang)'s Twitter Profile Photo

It can be helpful to pinpoint the in-context information that a language model uses when generating content (is it using provided documents? or its own intermediate thoughts?). We present Attribution with Attention (AT2), a method for doing so efficiently and reliably! (1/8)

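The general idea behind attention-based attribution can be sketched with a toy example: score each context token by how much attention mass the generated tokens place on it. This is a hypothetical illustration of the underlying intuition, not the actual AT2 method; the attention matrix, function name, and normalization below are all assumptions for the sketch.

```python
def attention_attribution(attn, context_len):
    """Toy attention-based attribution (NOT the real AT2 algorithm).

    attn: list of rows, one per generated token, each row holding that
          token's attention weights over the full sequence.
    context_len: number of leading positions that belong to the context.
    Returns normalized scores over the context tokens.
    """
    num_gen = len(attn)
    # Average attention over generated positions, context columns only.
    scores = [sum(row[j] for row in attn) / num_gen for j in range(context_len)]
    total = sum(scores)
    return [s / total for s in scores]

# Hypothetical weights: 2 generated tokens attending over a 4-token
# sequence (3 context tokens + 1 previously generated token).
attn = [
    [0.5, 0.2, 0.2, 0.1],  # attention from generated token 1
    [0.3, 0.4, 0.2, 0.1],  # attention from generated token 2
]
scores = attention_attribution(attn, context_len=3)
# The highest-scoring context token is the one the model "used" most.
```

In this toy setup the first context token receives the most attention mass, so it gets the top attribution score.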
Ananya Kumar (@ananyaku)'s Twitter Profile Photo

We are at ICLR this week! If you are interested in OpenAI, our team, or just generally chatting about AI, please reach out!

Sarah Cen (@cen_sarah)'s Twitter Profile Photo

This work has been a long time coming and I'm so grateful to my collaborators for helping make it possible. Main takeaway: AI supply chains matter! We've seen them emerge (rapidly) in the past few years and they will have implications for *all* of us, inside and outside

Aspen (@aspenkhopkins)'s Twitter Profile Photo

AI supply chains introduce new challenges to traditional formulations of deployment, safety, & evaluation. But! They can be studied! Come join the conversation (& read our new paper ✨) !

Shivin Dass (@shivindass)'s Twitter Profile Photo

Ever wondered which data from large datasets (like OXE) actually helps when training/tuning a policy for specific tasks? We present DataMIL, a framework for measuring how each training sample influences policy performance, hence enabling effective data selection 🧵
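The notion of per-sample influence on downstream performance can be illustrated with a minimal leave-one-out sketch. This is a hypothetical toy, not the DataMIL framework itself: the "policy" is just a mean predictor, and "performance" is negative absolute error on a held-out set, both assumptions made for illustration.

```python
def train(data):
    """Toy 'policy': predict the mean of the training targets."""
    return sum(data) / len(data)

def performance(model, eval_points):
    """Negative mean absolute error on held-out targets (higher is better)."""
    return -sum(abs(model - y) for y in eval_points) / len(eval_points)

def loo_influence(data, eval_points):
    """Leave-one-out influence: how much does keeping each sample in the
    training set change downstream performance? Negative means the sample
    hurts; positive means it helps."""
    base = performance(train(data), eval_points)
    influences = []
    for i in range(len(data)):
        held_out = data[:i] + data[i + 1:]
        influences.append(base - performance(train(held_out), eval_points))
    return influences

data = [1.0, 1.1, 0.9, 5.0]   # last sample is an outlier for this task
infl = loo_influence(data, eval_points=[1.0, 1.05])
# The outlier gets the most negative influence score, so a selection
# procedure ranking by influence would drop it first.
```

Real influence-based data selection replaces the exhaustive leave-one-out loop with efficient estimators, since retraining once per sample is intractable at scale.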

Bowen Baker (@bobabowen)'s Twitter Profile Photo

Modern reasoning models think in plain English. Monitoring their thoughts could be a powerful, yet fragile, tool for overseeing future AI systems. Researchers across many organizations and I think we should work to evaluate, preserve, and even improve CoT monitorability.
