Kirill Neklyudov (@k_neklyudov)'s Twitter Profile
Kirill Neklyudov

@k_neklyudov

Assistant Professor @UMontreal; Core Academic Member @Mila_Quebec

ID: 763831084924755968

Link: http://necludov.github.io · Joined: 11-08-2016 20:14:56

171 Tweets

1.1K Followers

376 Following

Patrick Kidger (@patrickkidger)'s Twitter Profile Photo

🔥 Time for my first bioML blog post! This one is for all the folks getting into ML-for-protein-design. ✨ "Just know stuff, proteinML edition" kidger.site/thoughts/just-… This is intended as a curriculum-with-context, as a starting point for the field. 1/2

Austin Cheng (@auhcheng)'s Twitter Profile Photo

Excited to share Quetzal, a simple but scalable model for building 3D molecules atom-by-atom. 🐉 Named after Quetzalcoatl, the Aztec god of creation. We equip a standard causal transformer with a per-atom diffusion MLP to model the continuous 3D position of the next atom. [1/3]
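The atom-by-atom loop can be sketched in miniature: a discrete head picks the next atom's type, then a small denoising loop refines its continuous 3D position from noise. This is a toy illustration of the interface, not Quetzal itself — `type_logits` and `denoise_step` are hypothetical stand-ins for the causal transformer head and the per-atom diffusion MLP.

```python
import numpy as np

rng = np.random.default_rng(0)

ATOM_TYPES = ["C", "N", "O", "H", "STOP"]  # hypothetical vocabulary

def type_logits(atoms):
    # Stand-in for the causal transformer's next-atom-type head;
    # the real model conditions on all previously placed atoms.
    return rng.normal(size=len(ATOM_TYPES))

def denoise_step(x_t, t, atoms):
    # Stand-in for the per-atom diffusion MLP: one reverse-diffusion
    # update of the candidate position x_t at noise level t. Here we
    # just shrink toward the centroid of the existing atoms.
    target = np.mean([p for _, p in atoms], axis=0) if atoms else np.zeros(3)
    return x_t + 0.3 * (target - x_t)

def generate(max_atoms=8, n_denoise=20):
    atoms = []  # list of (type, xyz)
    for _ in range(max_atoms):
        logits = type_logits(atoms)
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        a = ATOM_TYPES[rng.choice(len(ATOM_TYPES), p=probs)]
        if a == "STOP":
            break
        x = rng.normal(size=3)            # position starts as pure noise
        for k in range(n_denoise):        # run the diffusion "MLP"
            x = denoise_step(x, 1 - k / n_denoise, atoms)
        atoms.append((a, x))
    return atoms

mol = generate()
```

The point of the design is that discreteness (atom type) and continuity (3D position) get separate heads while sharing one causal backbone.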

Daniel Severo (@_dsevero)'s Twitter Profile Photo

New work: a scalable way to learn distributions over permutations/rankings. The method can trade off compute and expressivity by varying the number of NFEs (i.e., unmasking more than one token at a time), and subsumes well-known families of models (e.g., the Mallows model). arxiv.org/abs/2505.24664

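The compute/expressivity trade-off via the number of NFEs can be sketched with a toy any-order unmasking sampler: each "forward pass" unmasks k positions of the permutation at once, so fewer passes means more entries predicted jointly per step. This is an illustration of the mechanism only — here a uniform draw stands in for the learned conditional.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_permutation(n, k):
    """Toy unmasking sampler for permutations (a sketch of the idea, not
    the paper's model). Each network call ("NFE") unmasks k positions at
    once; a real model would predict those entries jointly."""
    perm = [None] * n
    remaining = list(range(n))   # values not yet placed
    masked = list(range(n))      # positions not yet filled
    nfes = 0
    while masked:
        nfes += 1                # one forward pass per step
        batch = rng.choice(len(masked), size=min(k, len(masked)),
                           replace=False)
        positions = [masked[i] for i in batch]
        for pos in positions:
            # stand-in for the model's conditional over remaining values
            j = rng.integers(len(remaining))
            perm[pos] = remaining.pop(j)
            masked.remove(pos)
    return perm, nfes

perm, nfes = sample_permutation(10, k=3)  # 4 forward passes: ceil(10/3)
```

Setting k = 1 recovers fully sequential (most expressive, most NFEs) generation; k = n collapses to a single joint pass.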
Lorenz Richter @ICLR'25 (@lorenz_richter)'s Twitter Profile Photo

We derive policy gradients for reinforcement learning with random time horizons in arxiv.org/pdf/2506.00962. Although this is arguably a typical setting in applications, it has been largely overlooked in the literature. Our adjusted formulas offer significant numerical improvements.

Floor Eijkelboom (@feijkelboom)'s Twitter Profile Photo

Generative models excel at images and text, but tabular data remains a challenge. 🤔 We introduce 🐈 TabbyFlow 🐈, a variational flow matching approach with general exponential families for mixed-type tables. Work with Andrés Guzmán-Cordero & Jan-Willem van de Meent, accepted to #ICML2025 🎉 👇 1/n

Kirill Neklyudov (@k_neklyudov)'s Twitter Profile Photo

The supervision signal in AI4Science is so crisp that we can solve very complicated problems almost without any data or RL! In this project, we train a model to solve the Schrödinger equation for different molecular conformations using Density Functional Theory (DFT). In the

Ricky T. Q. Chen (@rickytqchen)'s Twitter Profile Photo

Padding in our non-AR sequence models? Yuck. 🙅 👉 Instead of unmasking, our new work *Edit Flows* performs iterative refinement via position-relative inserts and deletes, operations naturally suited to variable-length sequence generation. Easily better than using mask tokens.
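The insert/delete primitives naturally change sequence length, which masking cannot. A minimal sketch of what applying such edits looks like — this only illustrates the edit operations themselves, not the Edit Flows model, which learns *which* edits to propose at each refinement step:

```python
def apply_edits(seq, edits):
    """Apply position-relative insert/delete operations to a token list.
    Each edit is ("ins", i, tok) -> insert tok before index i,
    or ("del", i) -> delete the token at index i."""
    out = list(seq)
    # Apply right-to-left so earlier indices stay valid after each edit.
    for op in sorted(edits, key=lambda e: e[1], reverse=True):
        if op[0] == "ins":
            out.insert(op[1], op[2])
        elif op[0] == "del":
            del out[op[1]]
    return out

# "hlloX" -> "hello": insert 'e' at index 1, delete the trailing 'X'
print(apply_edits(list("hlloX"), [("ins", 1, "e"), ("del", 4)]))
# → ['h', 'e', 'l', 'l', 'o']
```

Note how one refinement pass can simultaneously grow and shrink the sequence, with no padding or mask tokens anywhere.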

Acceleration Consortium (AC) (@acceleration_c)'s Twitter Profile Photo

We're spotlighting #WomenInSTEM and their inspiring journeys! Meet Marta Skreta, a Computer Science PhD student at the University of Toronto. Video created by University of Toronto Biomedical Engineering students Meghan + Ana-Maria Oproescu, with support from Helen Tran and the AC's EDI Initiate Grant. 🎥 youtube.com/watch?v=h2uRpm…

Alex Tong (@alexandertong7)'s Twitter Profile Photo

Check out FKCs! A principled, flexible approach to diffusion sampling. I was surprised how well it scaled to high dimensions given its reliance on importance reweighting. Thanks to great collaborators at Mila - Institut québécois d'IA, the Vector Institute, Imperial College London, and Google DeepMind. Thread 👇🧵

Microsoft Research (@msftresearch)'s Twitter Profile Photo

Microsoft researchers achieved a breakthrough in the accuracy of DFT, a method for predicting the properties of molecules and materials, by using deep learning. This work can lead to better batteries, green fertilizers, precision drug discovery, and more. msft.it/6011SQwKX

Rianne van den Berg (@vdbergrianne)'s Twitter Profile Photo

🚀 After two+ years of intense research, we’re thrilled to introduce Skala — a scalable deep learning density functional that hits chemical accuracy on atomization energies and matches hybrid-level accuracy on main group chemistry — all at the cost of semi-local DFT. ⚛️🔥🧪🧬

Rob Brekelmans (@brekelmaniac)'s Twitter Profile Photo

Given q_t, r_t as diffusion model(s), an SDE with drift β ∇ log q_t + α ∇ log r_t doesn't sample the sequence of geometric-average/product/tempered marginals! To correct this, we derive an SMC scheme via a PDE perspective. Resampling weights are 'free': they depend only on the (exact) scores!
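The correction-by-reweighting idea can be checked in one dimension at a single fixed time t, where the geometric average of two unit-variance Gaussians is available in closed form. This is a toy self-normalized importance sampling + resampling check, not the paper's SMC scheme; the means m_q, m_r and the weights α, β are made-up values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two 1-D unit-variance Gaussians standing in for the two models'
# marginals q_t, r_t at one fixed time t.
m_q, m_r = -2.0, 2.0
beta = alpha = 0.5            # geometric-average weights (beta + alpha = 1)

# For unit-variance Gaussians, the normalized geometric average
# q^beta * r^alpha is again Gaussian, with mean beta*m_q + alpha*m_r.
target_mean = beta * m_q + alpha * m_r          # = 0.0 here

# Self-normalized importance sampling: propose from q, reweight by
# (r/q)^alpha. The log-weights depend only on the two exact log-densities.
x = rng.normal(m_q, 1.0, size=200_000)
log_w = alpha * (-(x - m_r) ** 2 / 2) - alpha * (-(x - m_q) ** 2 / 2)
w = np.exp(log_w - log_w.max())
w /= w.sum()

# Multinomial resampling, as in an SMC step.
idx = rng.choice(len(x), size=len(x), p=w)
est_mean = x[idx].mean()      # should land near target_mean
```

Naively averaging the two scores and sampling without these weights would instead concentrate near the proposal q, which is the mismatch the tweet is pointing at.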

Max Zhdanov (@maxxxzdn)'s Twitter Profile Photo

🤹 New blog post! I write about our recent work on using hierarchical trees to enable sparse attention over irregular data (point clouds, meshes): the Erwin Transformer. blog: maxxxzdn.github.io/blog/erwin/ paper: arxiv.org/abs/2502.17019 Compressed version in the thread below:

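The core trick of tree-based sparse attention over irregular data can be sketched as: recursively split the point cloud into balanced leaves, then run full attention only within each leaf, dropping the cost from O(n²) to roughly O(n · leaf_size). A minimal numpy sketch of that idea — Erwin's actual ball tree and cross-level attention are more involved, and the untrained Q = K = V here is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def build_leaves(idx, pts, leaf_size):
    """Recursively median-split points along their widest axis into
    balanced leaves (a ball-tree-style sketch)."""
    if len(idx) <= leaf_size:
        return [idx]
    axis = int(np.argmax(np.ptp(pts[idx], axis=0)))   # widest coordinate
    order = idx[np.argsort(pts[idx, axis])]
    mid = len(order) // 2
    return (build_leaves(order[:mid], pts, leaf_size)
            + build_leaves(order[mid:], pts, leaf_size))

def leaf_attention(x, leaves):
    """Softmax self-attention restricted to each leaf: O(n * leaf_size)
    instead of O(n^2). x: (n, d) features."""
    out = np.zeros_like(x)
    for leaf in leaves:
        q = k = v = x[leaf]                 # untrained stand-in: Q = K = V
        scores = q @ k.T / np.sqrt(x.shape[1])
        a = np.exp(scores - scores.max(axis=1, keepdims=True))
        a /= a.sum(axis=1, keepdims=True)
        out[leaf] = a @ v
    return out

pts = rng.normal(size=(64, 3))
leaves = build_leaves(np.arange(64), pts, leaf_size=8)
y = leaf_attention(pts, leaves)
```

Because the splits follow the data's own geometry, nearby points tend to share a leaf, so locality is preserved without any regular grid.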
Kirill Neklyudov (@k_neklyudov)'s Twitter Profile Photo

This work is exemplary! James and his coauthors took a direction that most researchers wouldn't call shiny. They gave the idea a full shot and came out with a beautiful study. They pushed the boundaries of our understanding of energy-based models miles further!