Lorenzo Noci (@lorenzo_noci) Twitter Tweets • TwiCopy

Gregor Bachmann

From stochastic parrot 🦜 to Clever Hans 🐴? In our work with Vaishnavh Nagarajan @ ICML we carefully analyse the debate surrounding next-token prediction and identify a new failure of LLMs due to teacher-forcing 👨🏻‍🎓! Check out our work arxiv.org/abs/2403.06963 and the linked thread!

Likes: 32 · Replies: 1 · Retweets: 5

Alex Atanasov

@abatanasov

a year ago

[1/n] Thrilled that this project with @jzavatoneveth and @cpehlevan is finally out! Our group has spent a lot of time studying high-dimensional regression and its connections to scaling laws. All our results follow easily from a single central theorem 🧵 arxiv.org/abs/2405.00592

Likes: 114 · Replies: 5 · Retweets: 28

Bobby

@bobby_he

a year ago

Outlier Features (OFs), aka “neurons with big features”, emerge in standard transformer training & prevent benefits of quantisation 🥲 But why do OFs appear & which design choices minimise them? Our new work (+ Lorenzo Noci, Daniele Paliotta, Imanol Schlag, T. Hofmann) takes a look 👀🧵

Likes: 182 · Replies: 4 · Retweets: 39

Aurelien Lucchi

@aurelienlucchi

a year ago

My group has multiple openings both for PhD and Post-doc positions to work in the area of optimization for ML, and deep learning theory. We are looking for people with a strong theoretical background (degree in math, theoretical physics or CS with strong theory emphasis).

Likes: 276 · Replies: 8 · Retweets: 63

Chris J. Maddison

@cjmaddison

a year ago

I'm also recruiting PhD/MSc students this coming cycle, with an eye towards applications in drug discovery. cs.toronto.edu/~cmaddis/ DM me or email me if you have any questions at all!

Likes: 30 · Replies: 0 · Retweets: 9

Bobby

@bobby_he

9 months ago

Updated camera-ready: arxiv.org/abs/2405.19279. New results include:
- non-diagonal preconditioners (SOAP/Shampoo) minimise OFs compared to diagonal (Adam/AdaFactor)
- scaling to 7B params
- showing our methods to reduce OFs translate to PTQ int8 quantisation ease.
Check it out!

Likes: 154 · Replies: 1 · Retweets: 31

Lorenzo Noci

@lorenzo_noci

8 months ago

Indeed very useful :)

Likes: 6 · Replies: 0 · Retweets: 0

Lorenzo Noci

@lorenzo_noci

8 months ago

Systematic empirical analysis of the role of feature learning in continual learning using scaling limits theory. Meet Jacopo in Vancouver :)

Likes: 5 · Replies: 0 · Retweets: 0

Bobby

@bobby_he

8 months ago

Come by poster #2402 East hall at NeurIPS from 11am-2pm Friday to chat about why outlier features emerge during training and how we can prevent them!

Likes: 45 · Replies: 0 · Retweets: 10

Blake Bordelon ☕️🧪👨‍💻

@blake__bordelon

8 months ago

Come by at NeurIPS to hear Hamza present interesting properties of various feature-learning infinite-parameter limits of transformer models! Poster in Hall A-C #4804 at 11 AM PST Friday. Paper: arxiv.org/abs/2405.15712. Work with Hamza Tahir Chaudhry and Cengiz Pehlevan.

Likes: 44 · Replies: 0 · Retweets: 7

Lénaïc Chizat

@lenaicchizat

4 months ago

Announcing: The 2nd International Summer School on Mathematical Aspects of Data Science
EPFL, Sept 1–5, 2025
Speakers: Bach (Francis Bach), Bandeira, Mallat, Montanari (Andrea Montanari), Peyré (Gabriel Peyré)
For PhD students & early-career researchers
Application deadline: May 15

Likes: 101 · Replies: 2 · Retweets: 23

Elvis Nava

@elvisnavah

4 months ago

Come build with us and OpenAI!!

Likes: 15 · Replies: 0 · Retweets: 2

Alberto Bietti

@albertobietti

3 months ago

Come hear about how transformers perform factual recall using associative memories, and how this emerges in phases during training! #ICLR2025 poster #602 at 3pm today. Led by Eshaan Nichani. Link: iclr.cc/virtual/2025/p… Paper: arxiv.org/abs/2412.06538

Likes: 47 · Replies: 1 · Retweets: 9

Aurelien Lucchi

@aurelienlucchi

3 months ago

Our research group in the department of Mathematics and CS at the University of Basel (Switzerland) is looking for several PhD candidates and one post-doc who have a theoretical background in optimization and machine learning or practical experience in reasoning. RT please.

Likes: 37 · Replies: 1 · Retweets: 7

Lorenzo Noci

@lorenzo_noci

23 days ago

Pass by if you want to know about scaling up your model under distribution shifts of the training data. Takeaway: muP needs to be tuned to the optimal amount of feature learning that optimizes the forgetting/plasticity trade-off.

Likes: 25 · Replies: 0 · Retweets: 4
