Hamed Hassani (@hamedshassani) Twitter Tweets • TwiCopy

Reza Shokri

a year ago

Watermark Smoothing Attacks against Language Models: arxiv.org/abs/2407.14206. LLM watermarks, as statistical perturbations to token probabilities, can be removed *without* significantly reducing text quality. An adversary with access to a weaker model can infer and smooth out the watermarks.

42 likes · 1 reply · 8 reposts

EnCORE Institute

@encoreinstitut

a year ago

We welcome our very distinguished External Advisory Board today (Monday, Aug 26): Jelani Nelson, D. Sivakumar, Arya Mazumdar, Aaron Roth, Hamed Hassani, and Barna Saha.

14 likes · 0 replies · 2 reposts

The Nobel Prize

@nobelprize

a year ago

BREAKING NEWS The Royal Swedish Academy of Sciences has decided to award the 2024 #NobelPrize in Physics to John J. Hopfield and Geoffrey E. Hinton “for foundational discoveries and inventions that enable machine learning with artificial neural networks.”

33,33K likes · 1,1K replies · 13,13K reposts

Hamed Hassani

@hamedshassani

a year ago

Jailbreaking: From text to actions

24 likes · 0 replies · 2 reposts

Alex Robey

@alexrobey23

a year ago

I'm grateful to have received the Adversarial ML Rising Star Award! 🚀 AdvMLFrontiers is a fantastic venue. Many thanks to the award committee (Pin-Yu Chen, Bo Li, sijia.liu, and Cho-Jui Hsieh) and to the workshop organizers!

43 likes · 4 replies · 2 reposts

Russ Salakhutdinov

@rsalakhu

10 months ago

Jailbreaking LLM-Controlled Robots – Machine Learning Blog | ML@CMU | Carnegie Mellon University blog.ml.cmu.edu/2024/10/29/jai…

47 likes · 4 replies · 8 reposts

IEEE Spectrum

@ieeespectrum

10 months ago

New research shows that AI-driven robots can be easily jailbroken and tricked into doing harmful or dangerous tasks. spectrum.ieee.org/jailbreak-llm?…

37 likes · 3 replies · 22 reposts

Alex Robey

@alexrobey23

9 months ago

Check out Will Knight's recent coverage of our work on jailbreaking LLM-controlled robots in WIRED!

22 likes · 1 reply · 5 reposts

Alex Robey

@alexrobey23

9 months ago

In around an hour (at 3:45pm PST), I'll be giving a talk about jailbreaking LLM-controlled robots at the AdvML workshop at #NeurIPS2024 in East Ballroom C. I'll be at the poster session directly afterward as well if anyone wants to chat about this work! 🤖

23 likes · 2 replies · 4 reposts

Amin Karbasi

@aminkarbasi

7 months ago

🔥🔥🔥 Adversarial reasoning is born. Hot take: The core problem we address in this paper is the role of reasoning in AI safety. While there have been recent efforts by OpenAI arguing that replacing reasoning with increased compute can lead to better defense mechanisms, these

44 likes · 0 replies · 7 reposts

Hamed Hassani

@hamedshassani

7 months ago

Optimal uncertainty quantification for risk-averse decision making:

37 likes · 0 replies · 3 reposts

Shayan Kiyani

@shayankiyani1

7 months ago

Wondering how to make high-stakes decisions using ML—in areas like medicine, robotics, or finance? Our latest work lays out a decision-theoretic foundation for risk-averse uncertainty quantification. If you want to learn how to make better calls when it truly matters, read on!

32 likes · 1 reply · 6 reposts

Arya Mazumdar

@mountainofmoon

7 months ago

And the next EnCORE Institute workshop will be on **Theoretical Perspectives on LLMs** sites.google.com/ucsd.edu/encor… We have a great lineup of participants - and an incredible set of talks. Registration link will be active soon

79 likes · 3 replies · 14 reposts

Po-Shen Loh

@poshenloh

6 months ago

Oh my goodness. GPT-o1 got a perfect score on my Carnegie Mellon University undergraduate #math exam, taking less than a minute to solve each problem. I freshly design non-standard problems for all of my exams, and they are open-book, open-notes. (Problems included below, with links to

2,2K likes · 95 replies · 368 reposts

Amin Karbasi

@aminkarbasi

5 months ago

Our book on “Conditional Gradient Methods” will be published by SIAM. If interested, please check the final version: arxiv.org/abs/2211.14103 Authors: G. Braun, A. Carderera, C. Combettes, Hamed Hassani, Sebastian Pokutta, Aryan Mokhtari. Blurbs: Elad Hazan, Francis Bach

131 likes · 1 reply · 17 reposts

Alex Robey

@alexrobey23

5 months ago

A few days ago, we dropped 𝗮𝗻𝘁𝗶𝗱𝗶𝘀𝘁𝗶𝗹𝗹𝗮𝘁𝗶𝗼𝗻 𝘀𝗮𝗺𝗽𝗹𝗶𝗻𝗴 🚀 . . . and we've gotten a little bit of pushback. But whether you're at a frontier lab or developing smaller, open-source models, this research should be on your radar. Here's why 🧵

32 likes · 1 reply · 6 reposts

Sima Noorani

@nooranisimaa

3 months ago

How can we quantify uncertainty in LLMs from only a few sampled outputs? The key lies in the classical problem of missing mass—the probability of unseen outputs. This perspective offers a principled foundation for conformal prediction in query-only settings like LLMs.

50 likes · 1 reply · 7 reposts

Shayan Kiyani

@shayankiyani1

3 months ago

We push conformal prediction and its trade-offs beyond regression & classification — into query-based generative models. Surprisingly (or not?), missing mass & Good-Turing estimators emerge as key tools once again. Very excited about this one!

19 likes · 0 replies · 4 reposts

Alex Robey

@alexrobey23

2 months ago

On Monday, I'll be presenting a tutorial on jailbreaking LLMs + the security of AI agents with Hamed Hassani and Amin Karbasi at ICML. I'll be in Vancouver all week -- send me a DM if you'd like to chat about jailbreaking, AI agents, robots, distillation, or anything else!

77 likes · 2 replies · 9 reposts

Ahmad Beirami @ ICLR 2025

@abeirami

2 months ago

If you are interested in the safety and security of LLMs (jailbreaking attacks, defenses against them, and how safety issues become more complicated when we design agentic workflows), this tutorial by Hamed Hassani, Amin Karbasi, and Alex Robey is highly recommended.

144 likes · 2 replies · 17 reposts