Chawin Sitawarin (@csitawarin) 's Twitter Profile
Chawin Sitawarin

@csitawarin

Research Scientist @GoogleDeepMind. Postdoc @Meta. PhD @UCBerkeley. ML security 👹 privacy 👀 robustness 🛡️

ID: 3073512017

linkhttps://chawins.github.io/ calendar_today05-03-2015 21:52:22

62 Tweet

230 Followers

628 Following

Yizheng Chen (@surrealyz) 's Twitter Profile Photo

I am recruiting PhD students & Postdocs on AI Security, LLM Agents, Code Generation research at UMD Computer Science UMD Department of Computer Science & Maryland Cybersecurity Center Maryland Cybersecurity Center (MC2) For PhD program pls mention me in your application cs.umd.edu/grad/apply. For Postdocs please email me.

Michael Aerni @ ICLR (@aernimichael) 's Twitter Profile Photo

LLMs may be copying training data in everyday conversations with users! In our latest work, we study how often this happens compared to humans. 👇🧵

LLMs may be copying training data in everyday conversations with users!

In our latest work, we study how often this happens compared to humans. 👇🧵
Nikola Jovanović @ ICLR 🇸🇬 (@ni_jovanovic) 's Twitter Profile Photo

SynthID-Text by Google DeepMind is the first large-scale LLM watermark deployment, but its behavior in adversarial scenarios is largely unexplored. In our new blogpost, we apply the recent works from SRI Lab and find that... 👇🧵

Sicheng Zhu (@sichengzhuml) 's Twitter Profile Photo

Using GCG to jailbreak Llama 3 yields only a 14% attack success rate. Is GCG hitting a wall, or is Llama 3 just safer? We found that simply replacing the generic "Sure, here is***" target prefix with our tailored prefix boosts success rates to 80%. (1/8)

Using GCG to jailbreak Llama 3 yields only a 14% attack success rate. Is GCG hitting a wall, or is Llama 3 just safer? We found that simply replacing the generic "Sure, here is***" target prefix with our tailored prefix boosts success rates to 80%. (1/8)
Max Nadeau (@maxnadeau_) 's Twitter Profile Photo

🧵 Announcing Open Philanthropy's Technical AI Safety RFP! We're seeking proposals across 21 research areas to help make AI systems more trustworthy, rule-following, and aligned, even as they become more capable.

🧵 Announcing <a href="/open_phil/">Open Philanthropy</a>'s Technical AI Safety RFP!

We're seeking proposals across 21 research areas to help make AI systems more trustworthy, rule-following, and aligned, even as they become more capable.
Edoardo Debenedetti (@edoardo_debe) 's Twitter Profile Photo

1/🔒Worried about giving your agent advanced capabilities due to prompt injection risks and rogue actions? Worry no more! Here's CaMeL: a robust defense against prompt injection attacks in LLM agents that provides formal security guarantees without modifying the underlying model!

1/🔒Worried about giving your agent advanced capabilities due to prompt injection risks and rogue actions? Worry no more! Here's CaMeL: a robust defense against prompt injection attacks in LLM agents that provides formal security guarantees without modifying the underlying model!
Tong Wu (@tongwu_pton) 's Twitter Profile Photo

🛠️ Still doing prompt engineering for R1 reasoning models? 🧩 Why not do some "engineering" in reasoning as well? Introducing our new paper, Effectively Controlling Reasoning Models through Thinking Intervention. 🧵[1/n]

🛠️ Still doing prompt engineering for R1 reasoning models?
🧩 Why not do some "engineering" in reasoning as well?
Introducing our new paper, Effectively Controlling Reasoning Models through Thinking Intervention. 
🧵[1/n]
Andreas Terzis (@aterzis) 's Twitter Profile Photo

We are starting our journey on making Gemini robust to prompt injections and in this paper we present the steps we have taken so far. A collective effort by the GDM Security & Privacy Research team spanning over > 1 year.

jack morris (@jxmnop) 's Twitter Profile Photo

new paper from our work at Meta! **GPT-style language models memorize 3.6 bits per param** we compute capacity by measuring total bits memorized, using some theory from Shannon (1953) shockingly, the memorization-datasize curves look like this: ___________ / / (🧵)

new paper from our work at Meta!

**GPT-style language models memorize 3.6 bits per param**

we compute capacity by measuring total bits memorized, using some theory from Shannon (1953)

shockingly, the memorization-datasize curves look like this:
      ___________
  /
/

(🧵)
Chawin Sitawarin (@csitawarin) 's Twitter Profile Photo

I will be at ICML this year after a full long year of not attending any conference :) Happy to chat, and please don’t hesitate to reach out here, email, on Whova, or in person 🥳

Konpat Ta Preechakul (@phizaz) 's Twitter Profile Photo

Some problems can’t be rushed—they can only be done step by step, no matter how many people or processors you throw at them. We’ve scaled AI by making everything bigger and more parallel: Our models are parallel. Our scaling is parallel. Our GPUs are parallel. But what if the

Chawin Sitawarin (@csitawarin) 's Twitter Profile Photo

Very cool thought-provoking piece! In practice, computation units are much more nuanced than what theories capture. But just trying to identify classes of problems that benefit from sequential computation (or is unsolvable without it) seems very useful!

Konrad Rieck 🌈 (@mlsec) 's Twitter Profile Photo

🚨 Got a great idea for an AI + Security competition? SaTML Conference is now accepting proposals for its Competition Track! Showcase your challenge and engage the community. 👉 satml.org/call-for-compe… 🗓️ Deadline: Aug 6

🚨 Got a great idea for an AI + Security competition?

<a href="/satml_conf/">SaTML Conference</a> is now accepting proposals for its Competition Track! Showcase your challenge and engage the community.

👉 satml.org/call-for-compe…
🗓️ Deadline: Aug 6