Max Zhdanov (@maxxxzdn)'s Twitter Profile
Max Zhdanov

@maxxxzdn

busy scaling on a single GPU at @amlabuva with @wellingmax and @jwvdm

ID: 1502348291052392455

Link: http://maxxxzdn.github.io · Joined: 11-03-2022 18:19:00

384 Tweets

1.1K Followers

324 Following

Ji-Ha (@ji_ha_kim):

Blog post: Rethinking Probability - Mass, Averages, and Granularity

Developing an intuition for probability using analogies from physics, exploring both the standard measure-theoretic and the expectation-first foundations.
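
A minimal sketch (mine, not from the blog post) of the expectation-first idea: take expectation as the primitive and recover the probability of an event A as E[1_A], the expected value of its indicator, estimated here by Monte Carlo.

```python
# Hedged illustration, not from the blog post: probability as the
# expectation of an indicator, P(A) = E[1_A], estimated by sampling.
import numpy as np

rng = np.random.default_rng(42)
samples = rng.normal(size=1_000_000)        # X ~ N(0, 1)

indicator = (samples > 1.0).astype(float)   # 1_A with A = {X > 1}
prob_as_expectation = indicator.mean()      # Monte Carlo estimate of E[1_A]

print(prob_as_expectation)                  # ~0.1587 = P(X > 1) under N(0, 1)
```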
Jean-Philip Piquemal (@jppiquem):

#compchem Second preprint linked to the FeNNix-Bio1 #machinelearning foundation model. FeNNix-Bio1's inference is already pretty fast with a few GPUs, but "what if" we were able to push it to the #Exascale? Let's have a glimpse into the future (1/3): "Pushing the Accuracy Limit

Neel Nanda (@neelnanda5):

After supervising 20+ papers, I have highly opinionated views on writing great ML papers. When I entered the field I found this all frustratingly opaque

So I wrote a guide on turning research into high-quality papers with scientific integrity! Hopefully still useful for NeurIPS
Max Zhdanov (@maxxxzdn):

LLMs are a classic example of emergence, and it is not the first time we have looked into a non-living system showing life-like properties; Prigogine got a Nobel for that. I totally see Anthropic going in this direction and getting another one.

Brandon Wood (@bwood_m):

We released the Open Molecules 2025 (OMol25) Dataset last week! 🚀🧪 OMol25 is a large (100M+) and diverse molecular DFT dataset for training machine learning models. It was a massive collaborative and interdisciplinary effort and I’m super proud of the whole team! 🙌

1/7
Nabil Iqbal (@nblqbl):

The arxiv preprint on our conformally equivariant neural network -- named AdS-GNN due to its secret origins in AdS/CFT -- is now out!

arxiv.org/abs/2505.12880 

🧵 explaining it below. Joint work with the amazing team of Max Zhdanov, Erik Bekkers and Patrick Forre.
Maurice Weiler (@maurice_weiler):

New preprint! We extend Taco Cohen's theory of equivariant CNNs on homogeneous spaces to the non-linear setting. Beyond convolutions, this covers equivariant attention, implicit kernel MLPs and more general message passing layers.

More details in Oscar Carlsson's thread 👇
Katie Everett (@_katieeverett):

1. We often observe power laws between loss and compute: loss = a * flops^b + c
2. Models are rapidly becoming more efficient, i.e. use less compute to reach the same loss.

But: which innovations actually change the exponent in the power law (b) vs change only the constant (a)?
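
A toy numerical sketch of the distinction being asked about, with made-up parameter values (not from the thread): an innovation that only rescales the constant a shifts the whole curve, while one that steepens the exponent b yields savings that grow with compute.

```python
# Hedged sketch: comparing the saturating power law  loss = a * flops**b + c
# under two hypothetical innovations. All numbers are invented for illustration.
import numpy as np

def loss(flops, a, b, c):
    return a * flops**b + c

flops = np.logspace(18, 24, 7)                    # hypothetical compute budgets (FLOPs)
base    = loss(flops, a=1e3,   b=-0.05, c=1.8)    # baseline curve
innov_a = loss(flops, a=0.5e3, b=-0.05, c=1.8)    # A: halves the constant -> uniform shift
innov_b = loss(flops, a=1e3,   b=-0.06, c=1.8)    # B: steeper exponent -> growing advantage

for f, l0, la, lb in zip(flops, base, innov_a, innov_b):
    print(f"{f:.0e} FLOPs  base={l0:.2f}  A(a/2)={la:.2f}  B(steeper b)={lb:.2f}")
```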

Gabriele Cesa (@_gabrielecesa_):

Excited to be giving a talk at the Cambridge Wednesday Seminar today at 3pm. Looking forward to sharing ideas and great discussion about equivariance and beyond. Thanks Pietro Lio' and Riccardo Ali for inviting me! cst.cam.ac.uk/seminars/list/…

Artem Moskalev @ ICLR2025 🇸🇬 (@artemmoskalev):

ICML Spotlight 🚨 Equivariance is too slow and expensive, especially when you need global context. It makes us wonder whether it is even worth the cost, especially in high-dimensional problems. We present Geometric Hyena Networks — a simple equivariant model orders of magnitude more
Marco Fumero @ ICLR25 (@marco_fumero):

Neural networks implicitly define a latent vector field on the data manifold, via autoencoding iterations🌀

This representation retains properties of the model, revealing memorization and generalization regimes, and characterizing distribution shifts

📜: arxiv.org/abs/2505.22785
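
A minimal sketch of one way to read that construction (my own interpretation, not the paper's code): an autoencoder f = decode ∘ encode assigns to each point the displacement f(x) - x, and iterating f follows this field toward its fixed points. The toy model below is an untrained linear autoencoder, used only to show the mechanics.

```python
# Hedged sketch, assuming the "latent vector field" is the displacement of one
# autoencoding pass, v(x) = f(x) - x, with f = decode(encode(.)).
import numpy as np

rng = np.random.default_rng(0)
W_enc = rng.normal(size=(2, 8)) / np.sqrt(8)    # toy linear encoder: 8D data -> 2D latent
W_dec = rng.normal(size=(8, 2)) / np.sqrt(2)    # toy linear decoder: 2D latent -> 8D data
# keep the toy map contractive so the iteration below converges (demo only)
W_dec /= 1.2 * np.abs(np.linalg.eigvals(W_enc @ W_dec)).max()

def f(x):
    """One autoencoding pass: decode(encode(x))."""
    return W_dec @ (W_enc @ x)

def vector_field(x):
    """Displacement induced by one pass; zero exactly at fixed points of f."""
    return f(x) - x

x = rng.normal(size=8)
for t in range(5):                              # iterating f follows the field
    print(t, round(float(np.linalg.norm(vector_field(x))), 4))
    x = f(x)                                    # field magnitude shrinks toward the attractor
```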
Erik Bekkers (@erikjbekkers):

Great discussion, Chaitanya K. Joshi! We also explored this with extensive experiments in our recent paper: arxiv.org/abs/2501.01999. We find, among other things, that equivariant models in a sense scale even better than non-equivariant ones. Going more or less completely against the vibes from your post 😅 1/5