Niklas Risse (@niklas2484)'s Twitter Profile
Niklas Risse

@niklas2484

PhD candidate @maxplanckpress (MPI-SP) | AI for Security

ID: 3405810028

Link: https://www.nrisse.com
Joined: 06-08-2015 15:17:10

72 Tweets

180 Followers

345 Following

Konrad Rieck 🌈 (@mlsec)

Got some hot research cooking? 🔥 Two weeks until the SaTML Conference paper deadline! We’re eager to see your work on secure, private, and fair machine learning, as well as any other aspects of machine learning system security. 👉 satml.org/participate-cf… ⏰ Deadline: Sep 18

hardmaru (@hardmaru)

It’s good to see the community raise the issue of extreme overfitting to evaluation benchmarks for LLMs. This is the real “Reflection” that the open-source community needs to have.

Marcel Böhme👨‍🔬 (@mboehme_)

What does it feel like to do world-class research? If you are a CS undergrad who is interested in our topics, the Software Security group at #MPI_SP is hiring interns for summer & winter 2025! Details: 📅 01 November 2024 ✍️ cis.mpg.de/internships/ 🛡️ mpi-softsec.github.io

Abraham Mhaidli (@abrahammhaidli)

Hi all! I am looking to recruit 1-2 summer 2025 research interns to work with me at MPI-SP in Bochum, Germany, on topics relating to the ethics and harms of emerging technologies, including (but not limited to!) virtual reality, brain computer interfaces, and more! (1/2)

Marcel Böhme👨‍🔬 (@mboehme_)

Very lucky to receive the ERC Consolidator this year! This is 5-year funding for groundbreaking research. If you are interested in our perspective on software security analysis at scale, stick around and read on. European Research Council (ERC) #ERCCoG #MPI_SP CASA - Cluster of Excellence for Cyber Security mpi-sp.org/71953/news_pub…

Jürgen Schmidhuber (@schmidhuberai)

1995-2025: The Decline of Germany & Japan vs US & China. Can All-Purpose Robots Fuel a Comeback? In 1995, in terms of nominal GDP, a combined Germany and Japan were almost 1:1 economically with a combined USA and China. Only 3 decades later, this ratio is now down to 1:5!

Nacho Mellado (@uavster)

Apple gets it. Robots are going to be everywhere, but they won’t look like robots. Check out their new paper ELEGNT. I believe this is the future of everyday objects: helpful and human.

Niklas Risse (@niklas2484)

Function-level vulnerability detection is dead. Excited to share that our paper "Top Score on the Wrong Exam: On Benchmarking in Machine Learning for Vulnerability Detection" got into ISSTA 2025. We show that function-level vulnerability detection is fundamentally flawed — and

Marcel Böhme👨‍🔬 (@mboehme_)

Our paper "Top Score on the Wrong Exam" will be presented at #ISSTA25 🐣 in Trondheim! 📝 mpi-softsec.github.io/papers/ISSTA25… 🧑‍💻 github.com/niklasrisse/To… // Niklas Risse, Jing Liu (fuzzing.bsky.social).

Marcel Böhme👨‍🔬 (@mboehme_)

Thrilled to share a recent opinion piece in IEEE Security and Privacy (Vol. 23, Issue 3). It offers a long-term perspective on the field, meant for both researchers and practitioners. 📝 ieeexplore.ieee.org/stamp/stamp.js…

Niklas Risse (@niklas2484)

Proud to share that our paper “Top Score on the Wrong Exam: On Benchmarking in Machine Learning for Vulnerability Detection” received an ACM Distinguished Paper Award at ISSTA 2025 in Trondheim, Norway. If you’re interested, the paper is available here: dl.acm.org/doi/10.1145/37…

Marcel Böhme👨‍🔬 (@mboehme_)

Can we statistically estimate how likely an LLM-generated program is to be correct without knowing what a correct program for that task is? Sounds impossible, but it's actually really simple. In fact, our oracle-less eval can reliably substitute for a pass@1-based eval. arxiv.org/abs/2507.00057

Lukas Seidel (@pr0me)

Alex Plaskett the chunk-based approach seems neat, but it's also another (func-level) ML4VD paper with the common flaws imho. I think people publishing in this domain should at least cite and address Niklas Risse's "Top Score on the Wrong Exam: On Benchmarking in Machine Learning for

François Chollet (@fchollet)

The proprietary frontier models of today are ephemeral artifacts. Essentially very expensive sandcastles. Destined to be washed away by the rising tide of open source replication (first) and algorithmic disruption (later).

Marcel Böhme👨‍🔬 (@mboehme_)

Looking for a postdoc, a PhD student, and interns for 3-6 months as part of my ERC project. Homepage: mboehme.github.io Böhme Lab: mpi-softsec.github.io Reach out if you find this interesting. 👇

Nando de Freitas (@nandodf)

The only bitter lesson is that LLMs have succeeded beyond any expert expectations. Underpinning LLMs is the idea of scaling, which is too often misunderstood as more parameters. Scaling is about using massive compute effectively to maximise the throughput of data ingestion into