Mikayel Samvelyan (@_samvelyan) 's Twitter Profile
Mikayel Samvelyan

@_samvelyan

Research Scientist @GoogleDeepMind. Previously @Meta, @Reddit, @UCL, and @UniofOxford.

ID: 958312958593064961

https://samvelyan.com · Joined 30-01-2018 12:16:28

735 Tweets

1.1K Followers

376 Following

Kenneth Stanley (@kenneth0stanley) 's Twitter Profile Photo

Completely agree with this point. How can you really take AI safety seriously without grappling directly and explicitly with the fact that the world is open-ended? Rainbow Teaming is a great example!

Michael Dennis (@michaeld1729) 's Twitter Profile Photo

The unpredictability of open-ended systems allows us to prepare agents for an unpredictable reality. AI development is itself a black-swan event, and if we cannot make AI systems robust to the dramatic distribution shift they will cause, they will induce their own failures.

Laura Ruis (@lauraruis) 's Twitter Profile Photo

This got accepted to #ICML2025 as a *spotlight paper* (top 2.6%!) 🚀 --- work that Yi Xu did as an MSc student! Surely this will mark the start of an exceptional academic journey.

Tim RocktÀschel (@_rockt) 's Twitter Profile Photo

Our UCL DARK MSc student Yi Xu managed to get his work accepted as a spotlight paper at ICML Conference 2025 (top 2.6% of submissions) 🚀 What an amazing success and a testament to the outstanding supervision by Robert Kirk and Laura Ruis.

Mikayel Samvelyan (@_samvelyan) 's Twitter Profile Photo

Huge congratulations to my academic sister Laura on getting a postdoc position at MIT! 🧠✹ So proud of everything she’s achieved — can’t wait to see all the amazing things she’ll do there. 🚀

Deedy (@deedydas) 's Twitter Profile Photo

Google's AI just made math discoveries NO human has!

—Solved optimal packing of 11 and 12 hexagons in hexagons.
—Reduced 4x4 matrix multiplication from 49 operations to 48 (first advance in 56 years!)
and many more.

AlphaEvolve is the AlphaGo 'move 37' moment for math. Insane.
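
For context on the 49-multiplication baseline mentioned above: Strassen's 1969 scheme multiplies 2x2 matrices with 7 products instead of 8, and applying it recursively to 4x4 matrices gives 7 × 7 = 49 multiplications. The sketch below only reproduces the classic 2x2 formulas; the 48-multiplication 4x4 scheme attributed to AlphaEvolve is not shown here.

```python
def strassen_2x2(A, B):
    """Strassen's scheme: 7 scalar multiplications instead of 8 for a 2x2 product.
    Applied recursively to 4x4 blocks this yields 7 * 7 = 49 multiplications,
    the long-standing baseline the tweet says AlphaEvolve improved to 48."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(strassen_2x2(A, B))   # [[19, 22], [43, 50]], matching the naive product
```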
Nathan Benaich (@nathanbenaich) 's Twitter Profile Photo

"open-endedness is all we'll need"...this is the study of a system’s ability to continuously generate artifacts that are both novel and learnable to an observer as a route to agi. excited to have Edward Hughes from Google DeepMind's open-endedness team join us at RAAIS 2025!

"open-endedness is all we'll need"...this is the study of a system’s ability to continuously generate artifacts that are both novel and learnable to an observer as a route to agi.

excited to have <a href="/edwardfhughes/">Edward Hughes</a> from <a href="/GoogleDeepMind/">Google DeepMind</a>'s open-endedness team join us at <a href="/raais/">RAAIS</a> 2025!
Tim RocktÀschel (@_rockt) 's Twitter Profile Photo

Proud to announce that Dr akbir. defended his PhD thesis titled "Safe Automated Research" last week đŸ„ł. Massive thanks to Murray Shanahan and Pontus Stenetorp for examining! As is customary, Akbir received a personal mortarboard from UCL DARK. Details 👇

Cong Lu (@cong_ml) 's Twitter Profile Photo

Schmidhuber's Gödel Machine, an AI "rewriting its code" if provably useful, embodied the dream of recursive self-improvement 🔄 Thrilled to share our practical realization, inspired by Darwinian evolution! Done with the amazing Jenny Zhang, Shengran Hu, Robert Lange, and Jeff Clune 😍

Jenny Zhang (@jennyzhangzt) 's Twitter Profile Photo

One promising direction is combining ideas from AlphaEvolve and the Darwin Gödel Machine. Imagine a self-referential system improving itself even at the lowest algorithmic levels, at *scale*.

AlphaEvolve: deepmind.google/discover/blog/
Darwin Gödel Machine: arxiv.org/abs/2505.22954
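
A very rough sketch of the keep-what-empirically-improves loop behind this kind of system (nothing like the actual AlphaEvolve or Darwin Gödel Machine code): candidate "solvers" are tiny Python expressions, a stand-in proposer rewrites them, and a child is archived only if it scores better on a fixed benchmark.

```python
import random

def benchmark(solver_src):
    """Score a candidate solver (a Python expression in x) on a fixed task.
    The toy task is fitting f(x) = 3x + 1; higher (less negative) is better."""
    try:
        f = eval(f"lambda x: {solver_src}")
        return -sum((f(x) - (3 * x + 1)) ** 2 for x in range(10))
    except Exception:
        return float("-inf")

def propose_edit(src, rng):
    """Stand-in for the LLM / evolution operator that rewrites the code."""
    return src + rng.choice([" + 1", " - 1", " + x", " - x"])

rng = random.Random(0)
archive = ["x"]                                  # seed solver
for _ in range(300):
    parent = rng.choice(archive)
    child = propose_edit(parent, rng)
    if benchmark(child) > benchmark(parent):     # keep only empirically better code
        archive.append(child)

best = max(archive, key=benchmark)
print(best, benchmark(best))
```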

Edward Hughes (@edwardfhughes) 's Twitter Profile Photo

What an enormous privilege to give the opening lecture at the OxML summer school this morning. Never have I had such a thought-provoking set of audience questions! Here's to the automation of innovation towards human flourishing alongside the next generation of researchers.

Cong Lu (@cong_ml) 's Twitter Profile Photo

🚀Introducing “StochasTok: Improving Fine-Grained Subword Understanding in LLMs”!🚀

LLMs are incredible but still struggle disproportionately with subword tasks, e.g., character counts, wordplay, multi-digit numbers, and fixing typos. Enter StochasTok, led by Anya Sims! [1/]

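
As I read the announcement, the core idea is to perturb tokenization during training by occasionally splitting a token into smaller tokens that spell the same string, so the model sees many segmentations of the same text. A hypothetical sketch of such a splitter follows; the vocabulary and split probability are invented for illustration, not taken from the paper.

```python
import random

def split_token(token_id, vocab, inv_vocab, rng, p=0.1):
    """With probability p, replace a token by two shorter tokens whose strings
    concatenate to the same text (a StochasTok-style stochastic split)."""
    text = vocab[token_id]
    if len(text) < 2 or rng.random() > p:
        return [token_id]
    cut = rng.randrange(1, len(text))            # random split point
    left, right = text[:cut], text[cut:]
    if left in inv_vocab and right in inv_vocab:
        return [inv_vocab[left], inv_vocab[right]]
    return [token_id]

def stochastic_tokenize(token_ids, vocab, inv_vocab, seed=0, p=0.1):
    rng = random.Random(seed)
    out = []
    for t in token_ids:
        out.extend(split_token(t, vocab, inv_vocab, rng, p))
    return out

# Tiny hypothetical vocabulary, purely for illustration.
vocab = {0: "straw", 1: "berry", 2: "str", 3: "aw", 4: "ber", 5: "ry"}
inv_vocab = {s: i for i, s in vocab.items()}
print(stochastic_tokenize([0, 1], vocab, inv_vocab, seed=3, p=1.0))
```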
Nathan Herr (@naitherr) 's Twitter Profile Photo

Excited to introduce LLM-First Search (LFS) - a new paradigm where the language model takes the lead in reasoning and search! LFS is a self-directed search method that empowers LLMs to guide the exploration process themselves, without relying on predefined heuristics or fixed

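
A minimal sketch of the pattern described above, with a trivial placeholder where the LLM call would go: the model, rather than a fixed heuristic like UCT, chooses which frontier node to expand next. The names and the placeholder scoring are hypothetical, not the paper's implementation.

```python
from dataclasses import dataclass

@dataclass
class Node:
    state: str
    parent: "Node | None" = None

def llm_choose(frontier, goal):
    """Placeholder for the LLM call: given text descriptions of the frontier
    and the goal, return the index of the node to expand next. A trivial
    character-overlap score stands in for the model here."""
    return max(range(len(frontier)),
               key=lambda i: len(set(frontier[i].state) & set(goal)))

def expand(node):
    """Placeholder successor function: extend the state by one character."""
    return [Node(node.state + c, parent=node) for c in "abc"]

def llm_first_search(start, goal, budget=20):
    frontier = [Node(start)]
    for _ in range(budget):
        i = llm_choose(frontier, goal)           # the 'LLM' decides where to search next
        node = frontier.pop(i)
        if node.state == goal:
            return node
        frontier.extend(expand(node))
    return None

result = llm_first_search("", "abc")
print(result.state if result else "not found within budget")
```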
Mikayel Samvelyan (@_samvelyan) 's Twitter Profile Photo

Check out Alex’s amazing internship project using Quality-Diversity algorithms to create synthetic reasoning problems! 👇 💡Key takeaway: better data quality improves in-distribution results, while more diversity enhances out-of-distribution generalization.
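
For readers unfamiliar with Quality-Diversity: a MAP-Elites-style loop keeps the best artifact per "diversity" cell rather than a single global best. Below is a toy version for generating synthetic arithmetic problems; the quality and descriptor functions are invented placeholders, not the project's pipeline.

```python
import random

def quality(problem):
    """Stand-in quality score for a synthetic problem (hypothetical)."""
    return -abs(len(problem) - 40)               # prefer ~40-character problems

def descriptor(problem):
    """Stand-in diversity descriptor: (length bucket, number of operators)."""
    return (len(problem) // 10, problem.count("+") + problem.count("*"))

def mutate(problem, rng):
    return problem + rng.choice([" + ", " * "]) + str(rng.randrange(100))

# MAP-Elites-style archive: one elite per descriptor cell.
rng = random.Random(0)
archive = {}
seeds = ["1 + 2", "3 * 4"]
for _ in range(500):
    parent = rng.choice(list(archive.values()) or seeds)
    child = mutate(parent, rng)
    cell = descriptor(child)
    if cell not in archive or quality(child) > quality(archive[cell]):
        archive[cell] = child                    # keep the best problem per cell

print(f"{len(archive)} distinct cells filled; example: {next(iter(archive.values()))}")
```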

Tim RocktÀschel (@_rockt) 's Twitter Profile Photo

Happy "The NetHack Learning Environment is still completely unsolved" day for those of you who are celebrating it. We released The NetHack Learning Environment (arxiv.org/abs/2006.13760) on this day five years ago. Current frontier models achieve only ~1.7% progression (see balrogai.com).

Happy "<a href="/NetHack_LE/">The NetHack Learning Environment</a> is still completely unsolved" day for those of you who are celebrating it. We released The NetHack Learning Environment (arxiv.org/abs/2006.13760) on this day five years ago. Current frontier models achieve only ~1.7% progression (see balrogai.com).
Laura Ruis (@lauraruis) 's Twitter Profile Photo

LLMs can be programmed by backprop 🔎 In our new preprint, we show they can act as fuzzy program interpreters and databases. After being ‘programmed’ with next-token prediction, they can retrieve, evaluate, and even *compose* programs at test time, without seeing I/O examples.

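
A small sketch of the setup as described: the finetuning corpus contains only program source, learned purely via next-token prediction, while evaluation and composition queries appear only at test time, with no input/output pairs anywhere. The corpus format here is invented for illustration, not the preprint's.

```python
# Toy illustration of "programming by backprop": train on source, query at test time.
programs = {
    "double": "def double(x):\n    return 2 * x",
    "inc":    "def inc(x):\n    return x + 1",
}

# 1) Finetuning corpus: raw program source only, no input/output examples.
train_docs = [f"# program: {name}\n{src}\n" for name, src in programs.items()]

# 2) Test-time queries: retrieval, evaluation, and composition of those programs.
eval_query    = "What is double(21)?"
compose_query = "What is inc(double(3))?"

print(train_docs[0])
print(eval_query, "->", "42 if the model has internalised the program")
print(compose_query, "->", "7 if it can compose programs")
```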
Mikayel Samvelyan (@_samvelyan) 's Twitter Profile Photo

Much-needed multi-agent benchmark for LLMs đŸ‘„ Theory of Mind is key as LLMs act in agentic, interactive settings, yet it remains underexplored and hard to measure. đŸ’œ Decrypto offers a ToM-based evaluation of reasoning for agents operating in complex social settings. Great work!