monoxgas (@monoxgas)'s Twitter Profile
monoxgas

@monoxgas

Security engineering, research, exploits, ml.

Co-Founder with @moo_hax at @dreadnode

ID: 199907473

Joined: 08-10-2010 00:38:44

333 Tweets

4.4K Followers

370 Following

AI Safety Papers (@safe_paper)

Are aligned neural networks adversarially aligned? Nicholas Carlini, Milad Nasr, Christopher A. Choquette-Choo, Matthew Jagielski, Irena Gao, Anas Awadalla, Pang Wei Koh, Daphne Ippolito, Katherine Lee, Florian Tramèr, Ludwig Schmidt

moo (@moo_hax)

Some players are handling the CTF format better than others (meme from the Discord). Everyone is learning…something. 12 days left, still time to hit the leaderboard! kaggle.com/competitions/a…

monoxgas (@monoxgas)

The most common ask we got after the AI Village @ DEF CON CTF on Kaggle was to make the challenges available all the time. We took our first steps today and look forward to building out a great ML CTF and learning platform. Hope you enjoy!

monoxgas (@monoxgas)

Shout to Rob for the 4 new Bear challenges. Awesome place to get started with great walkthroughs. The roadmap is looking 🔥 this year

monoxgas (@monoxgas)

I took an early stab at PGD for LLMs based on arxiv.org/abs/2402.09154 (Simon Geisler). Neat technique to relax the one-hot for gradient updates + projection. Also got to spend some time with litgpt. github.com/dreadnode/rese… Experimental and messy, but enjoy.

monoxgas (@monoxgas)

Crazy ride so far. Will and I continue to learn the importance of having a great team around you. I'll take my time here and extend a huge thank you to the dreadnode team who work extremely hard every day to build a company with us. You all rock.

dreadnode (@dreadnode)

Introducing AIRTBench, an AI red teaming benchmark for evaluating language models’ ability to autonomously discover and exploit AI/ML security vulnerabilities. Read the paper on arXiv: arxiv.org/abs/2506.14682 Open-source dataset and benchmark eval code repo:
