Maciej Kilian's (@kilian_maciej) Twitter Profile
Maciej Kilian

@kilian_maciej

intelligence flows ; founding team @perceptroninc

ID: 1348734194705428481

Link: http://github.com/iejMac · Joined: 11-01-2021 20:51:12

218 Tweets

668 Followers

151 Following

Akshat Shrivastava's (@akshats07) Twitter Profile Photo

Excited to see further studies into early fusion vs late fusion models, in particular a great analysis into multimodal MoE’s aligned with our findings in MoMa on designing parameter specialization in multimodal LLMs. A few key things that helped us on top of the results presented

Maciej Kilian's (@kilian_maciej) Twitter Profile Photo

very cool. we found similar results in diffusion model training where EMA on model weights & const LR is more common. section 5.3 arxiv.org/pdf/2405.13218

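The EMA-on-weights trick mentioned in the tweet above can be sketched as follows. This is a minimal illustration, not code from the linked paper; the decay value and the dict-of-weights representation are assumptions for the example.

```python
# Minimal sketch: exponential moving average (EMA) of model weights, as
# commonly maintained alongside the raw weights during diffusion training.
# decay=0.999 is a typical but hypothetical choice.

def ema_update(ema_weights, model_weights, decay=0.999):
    """In-place EMA step: ema <- decay * ema + (1 - decay) * model."""
    for name, w in model_weights.items():
        ema_weights[name] = decay * ema_weights[name] + (1.0 - decay) * w
    return ema_weights


# After each optimizer step, the EMA copy is updated; the EMA weights
# (not the raw weights) are typically used for sampling/evaluation.
ema = {"w": 0.0}
ema_update(ema, {"w": 1.0}, decay=0.9)
```

The EMA copy smooths over the noise of individual gradient steps, which is one reason it pairs naturally with a constant learning rate rather than a decaying schedule.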
Sam Altman's (@sama) Twitter Profile Photo

i think we should stop arguing about what year AGI will arrive and start arguing about what year the first self-replicating spaceship will take off

Jeremy Bernstein's (@jxbz) Twitter Profile Photo

Laker and I are presenting this work in an hour at ICML poster E-2103. It’s on a theoretical framework and language (modula) for optimizers that are fast (like Shampoo) and scalable (like muP). You can think of modula as Muon extended to general layer types and network topologies
