Alireza Makhzani (@alimakhzani) 's Twitter Profile
Alireza Makhzani

@alimakhzani

Research Scientist at @GoogleDeepMind, Associate Professor (status-only) @UofT

ID: 1276295754

Link: http://alireza.ai · Joined: 17-03-2013 23:51:05

96 Tweets

2.2K Followers

963 Following

Hannes Stärk (@hannesstaerk) 's Twitter Profile Photo

Monday in the reading group - flow matching? neigh: "Action Matching: Learning Stochastic Dynamics from Samples" arxiv.org/abs/2210.06662 with Kirill Neklyudov and Alireza Makhzani! One of the most interesting ICML papers. 👌

On Zoom at 11am EDT / 3pm UTC: m2d2.io/talks/logg/abo
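
For readers skimming past the title: Action Matching fits a function s_t(x) whose gradient field transports samples along a given curve of distributions q_t. Schematically (my paraphrase, up to the paper's exact conventions and signs), the variational objective is

$$
\mathcal{L}_{\mathrm{AM}}(s) = \mathbb{E}_{q_0}[s_0(x)] - \mathbb{E}_{q_1}[s_1(x)] + \int_0^1 \mathbb{E}_{q_t}\!\left[\tfrac{1}{2}\|\nabla s_t(x)\|^2 + \partial_t s_t(x)\right] dt,
$$

after which the learned velocity field $\nabla s_t$ can be integrated to simulate the dynamics.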

Alireza Makhzani (@alimakhzani) 's Twitter Profile Photo

Introducing “Wasserstein Lagrangian Flows”: A novel computational approach for solving Optimal Transport and its variants.

Paper: arxiv.org/abs/2310.10649
Led by Kirill Neklyudov and Rob Brekelmans
With: Alex Tong, Lazar Atanackovic, Qiang Liu

The solution of Optimal Transport (OT) and…
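
For background (a standard identity, not a claim from the paper): in its dynamical Benamou-Brenier form, the squared 2-Wasserstein distance is the least kinetic energy of any density path joining the marginals,

$$
W_2^2(\mu,\nu) = \min_{(\rho_t, v_t)} \int_0^1 \mathbb{E}_{\rho_t}\!\left[\|v_t(x)\|^2\right] dt
\quad \text{s.t.} \quad \partial_t \rho_t = -\nabla \cdot (\rho_t v_t),\; \rho_0 = \mu,\; \rho_1 = \nu,
$$

and, as the title suggests, the paper's variants arise from swapping this kinetic-energy Lagrangian for other Lagrangians.
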
Alireza Makhzani (@alimakhzani) 's Twitter Profile Photo

Check out Rob Brekelmans's thread comparing Action Matching and its extension, Wasserstein Lagrangian Flows, with Flow / Bridge Matching and their extensions.

Alireza Makhzani (@alimakhzani) 's Twitter Profile Photo

Yibo interned with me last summer at Vector, and was exceptional! Don't miss the chance to meet and hire him at #NeurIPS2023!

Wu Lin (@linyorker) 's Twitter Profile Photo

For the first time, we (with Felix Dangel, Runa Eschenhagen, Kirill Neklyudov, Agustinus Kristiadi, Richard E. Turner, Alireza Makhzani) propose a sparse 2nd-order method for large NN training with BFloat16 and show its advantages over AdamW. Also at the @NeurIPS workshop on Opt for ML: arxiv.org/abs/2312.05705 /1
Agustinus Kristiadi (@akristiadi7) 's Twitter Profile Photo

Large NNs like transformers (i) need fp16 to train, so matrix inversion in 2nd-order methods is unstable, and (ii) are expensive to store the preconditioner for 😩 Our work solves both by exploiting the Riemannian geometry of preconditioning matrices---it's as efficient as AdamW! 🌐
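
On point (i), a generic illustration (not the paper's algorithm): an inverse can be approximated with matrix multiplies alone via the Newton-Schulz iteration, which stays usable in bfloat16 where explicit factorization-based inversion is numerically fragile.

```python
# Generic illustration, not the paper's method: Newton-Schulz approximates
# A^{-1} using only matrix multiplies, which remain well-behaved in bfloat16.
import torch

def newton_schulz_inverse(A: torch.Tensor, steps: int = 20) -> torch.Tensor:
    # X0 = A^T / (||A||_1 * ||A||_inf) guarantees the iteration converges.
    X = A.T / (A.abs().sum(dim=0).max() * A.abs().sum(dim=1).max())
    I = torch.eye(A.shape[0], dtype=A.dtype)
    for _ in range(steps):
        X = X @ (2 * I - A @ X)  # X_{k+1} = X_k (2I - A X_k)
    return X

# Well-conditioned SPD test matrix in bfloat16 (a stand-in for a preconditioner).
A = torch.randn(64, 64)
A = (A @ A.T + 64 * torch.eye(64)).to(torch.bfloat16)
A_inv = newton_schulz_inverse(A)
print((A @ A_inv - torch.eye(64, dtype=torch.bfloat16)).abs().max())
```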

Alireza Makhzani (@alimakhzani) 's Twitter Profile Photo

I'm excited to be in New Orleans for #NeurIPS2023! Looking forward to catching up with old friends and meeting new folks. My group will be presenting [Spotlight] Wasserstein Quantum Monte Carlo: A Novel Approach for Solving the Quantum Many-Body Schrödinger Equation

Alireza Makhzani (@alimakhzani) 's Twitter Profile Photo

It was very fun to present the "Wasserstein Quantum Monte Carlo" poster, next to Max Welling at #NeurIPS2023. This work was led by my exceptional postdoc Kirill Neklyudov, who unfortunately couldn't attend the conference.

Daniel Severo (@_dsevero) 's Twitter Profile Photo

In the next few weeks I'll be wrapping up my PhD and joining FAIR AI at Meta full-time in Montréal 🇨🇦! Looking forward to contributing to the AI space through open-source research. Very grateful to all who helped me get here. It truly does take a village to advise a PhD student!

Kirill Neklyudov (@k_neklyudov) 's Twitter Profile Photo

I'm going to Montréal! This June I'm starting a new position as an assistant professor at Université de Montréal and as a core academic member of Mila - Institut québécois d'IA. Drop me a line if you're interested in working together on problems in AI4Science, Optimal Transport, and Generative Modeling.

Alireza Makhzani (@alimakhzani) 's Twitter Profile Photo

Introducing “Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo”

Many capability and safety techniques of LLMs—such as RLHF, automated red-teaming, prompt engineering, and infilling—can be viewed from a probabilistic inference perspective, specifically…
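
As a toy illustration of the twisted SMC mechanics (everything below, the uniform "model", the hand-set twists, and the potential, is a stand-in I made up, not the paper's learned components or code): particles extend sequences via a twist-reweighted proposal, accumulate incremental weights, and get resampled toward high-potential continuations.

```python
# Toy twisted SMC targeting sigma(s) ∝ p(s) * phi(s) for an autoregressive p.
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)
V, T, N = 5, 8, 64  # vocab size, sequence length, number of particles

def lm_logprobs(prefix):
    # Stand-in "language model": uniform next-token distribution.
    return np.full(V, -np.log(V))

def log_phi(seq):
    # Terminal potential phi(s): here, reward sequences that avoid token 0.
    return 0.0 if 0 not in seq else -4.0

def log_twist(prefix, t):
    # Twist psi_t approximating the future potential; a hand-set heuristic
    # here (learned in the paper). At t = T it must coincide with phi.
    return log_phi(prefix) if t == T else (-4.0 if 0 in prefix else 0.0)

particles = [[] for _ in range(N)]
logw = np.zeros(N)
for t in range(1, T + 1):
    for i in range(N):
        s = particles[i]
        # Twisted proposal: q_t(x | s) ∝ p(x | s) * psi_t(s + [x]).
        lp = lm_logprobs(s) + np.array([log_twist(s + [x], t) for x in range(V)])
        p = np.exp(lp - logsumexp(lp))
        x = rng.choice(V, p=p / p.sum())
        # Incremental weight Z_t(s) / psi_{t-1}(s): the proposal's normalizer
        # over the previous twist.
        logw[i] += logsumexp(lp) - log_twist(s, t - 1)
        particles[i] = s + [x]
    # Multinomial resampling focuses particles on high-weight continuations.
    probs = np.exp(logw - logsumexp(logw))
    particles = [list(particles[j]) for j in rng.choice(N, N, p=probs / probs.sum())]
    logw[:] = logsumexp(logw) - np.log(N)

print(sum(0 not in s for s in particles), "of", N, "particles avoid token 0")
```
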
Alireza Makhzani (@alimakhzani) 's Twitter Profile Photo

Very cool to see OpenAI is using "k-Sparse Autoencoders" (my ICLR 2014 paper) to extract interpretable features from GPT-4, and showing that it outperforms other methods on the sparsity-reconstruction frontier: arxiv.org/abs/1312.5663 If you are interested in sparse autoencoders,…
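
The mechanism is simple enough to sketch (my own minimal PyTorch illustration, not the original code; details such as weight tying in the paper differ): a linear encoder, a top-k selection that zeroes all but the k largest hidden activations, and a linear decoder trained for reconstruction.

```python
import torch
import torch.nn as nn

class KSparseAutoencoder(nn.Module):
    def __init__(self, d_in: int, d_hidden: int, k: int):
        super().__init__()
        self.k = k
        self.enc = nn.Linear(d_in, d_hidden)
        self.dec = nn.Linear(d_hidden, d_in)

    def forward(self, x):
        z = self.enc(x)
        # Keep only the k largest activations per example; zero the rest.
        topk = torch.topk(z, self.k, dim=-1)
        z_sparse = torch.zeros_like(z).scatter(-1, topk.indices, topk.values)
        return self.dec(z_sparse), z_sparse

model = KSparseAutoencoder(d_in=784, d_hidden=1024, k=32)
x = torch.randn(8, 784)
x_hat, z = model(x)
loss = ((x_hat - x) ** 2).mean()  # reconstruction objective
```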

Kirill Neklyudov (@k_neklyudov) 's Twitter Profile Photo

Wasserstein Lagrangian Flows explain many different dynamics on the space of distributions from a single perspective. arxiv.org/abs/2310.10649 I made a video explaining our (with Rob Brekelmans) #icml2024 paper about WLF. Like, subscribe, share, lol. youtu.be/kkddiLegc3s?si


Wu Lin (@linyorker) 's Twitter Profile Photo

#ICML2024
Can We Remove the Square-Root in Adaptive Methods?
arxiv.org/abs/2402.03496

Root-free (RF) methods are better on CNNs and competitive on Transformers compared to root-based methods (AdamW)

Removing the root makes matrix methods faster: Root-free Shampoo in BFloat16 /1
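
In the diagonal case the contrast is one line (my own simplification to show what the title asks, not the paper's matrix, Shampoo-style methods; note the learning rate has different units with and without the root).

```python
import numpy as np

def adaptive_step(theta, grad, m, v, lr, b1=0.9, b2=0.999, eps=1e-8, root=True):
    # Exponential moving averages of the gradient and its square.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    # Root-based (Adam-style) divides by sqrt(v); root-free divides by v
    # itself, i.e. preconditions with the second moment directly.
    denom = (np.sqrt(v) if root else v) + eps
    return theta - lr * m / denom, m, v

theta, m, v = np.ones(3), np.zeros(3), np.zeros(3)
grad = np.array([0.1, -0.2, 0.3])
theta, m, v = adaptive_step(theta, grad, m, v, lr=1e-3, root=False)  # root-free
```
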
Hannes Stärk (@hannesstaerk) 's Twitter Profile Photo

Come discuss an ICML Conference best paper award with the author Rob Brekelmans in our reading group on Monday!
"Probabilistic Inference in LMs via Twisted Sequential Monte Carlo" arxiv.org/abs/2404.17546

On zoom Mon 9am PT / 12pm ET / 5pm CEST. Links: portal.valencelabs.com/logg
David Pfau (@pfau) 's Twitter Profile Photo

In some sense, there’s nothing in this paper that we couldn’t have done in 2018 (and I wish we had! I’d be famous!) But the inspiration for this paper actually came from the fantastic recent work on Wasserstein QMC by Kirill Neklyudov and others. Good research should be timeless.
