Vimal Thilak🦉🐒 (@aggieinca)'s Twitter Profile
Vimal Thilak🦉🐒

@aggieinca

Proverbs 17:28. I’m not learned. I’m AGI.

ID: 204454102

Joined: 18-10-2010 18:44:00

2.2K Tweets

514 Followers

493 Following

Mustafa Shukor (@mustafashukor1):

We release a large-scale study to answer the following:
- Is late fusion inherently better than early fusion for multimodal models?
- How do native multimodal models scale compared to LLMs?
- How can sparsity (MoEs) play a detrimental role in handling heterogeneous modalities? 🧵
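For readers new to the distinction the thread draws: early fusion feeds image and text tokens into one shared transformer from the first layer, while late fusion keeps modality-specific encoders and mixes only at the top. A minimal sketch under that reading; all module names and dimensions below are illustrative, not taken from the paper.

```python
# Minimal sketch of early vs. late fusion; names/dims are illustrative.
import torch
import torch.nn as nn

d_model, n_heads, n_layers = 256, 4, 2

def trunk(depth):
    layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
    return nn.TransformerEncoder(layer, depth)

class EarlyFusion(nn.Module):
    """One shared transformer sees image and text tokens jointly from layer 1."""
    def __init__(self):
        super().__init__()
        self.img_proj = nn.Linear(768, d_model)    # patch features -> shared space
        self.txt_embed = nn.Embedding(32000, d_model)
        self.body = trunk(n_layers)

    def forward(self, patches, txt_ids):
        tokens = torch.cat([self.img_proj(patches), self.txt_embed(txt_ids)], dim=1)
        return self.body(tokens)

class LateFusion(nn.Module):
    """Modality-specific encoders; cross-modal attention only in a top fusion block."""
    def __init__(self):
        super().__init__()
        self.img_proj = nn.Linear(768, d_model)
        self.txt_embed = nn.Embedding(32000, d_model)
        self.img_enc, self.txt_enc = trunk(n_layers), trunk(n_layers)
        self.fusion = trunk(1)

    def forward(self, patches, txt_ids):
        img = self.img_enc(self.img_proj(patches))
        txt = self.txt_enc(self.txt_embed(txt_ids))
        return self.fusion(torch.cat([img, txt], dim=1))

img, txt = torch.randn(2, 16, 768), torch.randint(0, 32000, (2, 12))
print(EarlyFusion()(img, txt).shape, LateFusion()(img, txt).shape)
```

Early fusion lets every layer attend across modalities at the cost of longer joint sequences; late fusion confines cross-modal interaction to the small fusion block.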
Alaa El-Nouby (@alaa_nouby):

We have been thinking a lot about how to train truly native multimodal models:

(1) what arch to use (early-fusion, late-fusion, MoEs)?
(2) the impact of data mixtures (interleaved, img-cap, text data)

We took a stab at answering these questions (and more) in this preprint ...
Vimal Thilak🦉🐒 (@aggieinca):

Check out this post with information about research from Apple being presented at ICLR 2025 in 🇸🇬 this week. I will be at ICLR, presenting some of our work (led by Samira Abnar) at the Sparsity in LLMs (SLLM) Workshop. Happy to chat about JEPAs as well!

Jason Ramapuram (@jramapuram):

Stop by poster #596 from 10AM to 12:30PM tomorrow (Fri 25 April) at #ICLR2025 to hear more about Sigmoid Attention!

We just pushed 8 trajectory checkpoints each for two 7B LLMs for Sigmoid Attention and a 1:1 Softmax Attention (trained with a deterministic dataloader for 1T tokens):

-
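For context, a hedged sketch of how sigmoid attention differs from standard softmax attention, following the Sigmoid Attention construction as I understand it: the row-wise softmax normalization is replaced by an elementwise sigmoid with a sequence-length-dependent bias b = -log(n). The bias choice and all shapes below are assumptions for illustration.

```python
# Softmax attention vs. sigmoid attention, in plain numpy for clarity.
# The -log(n) bias follows the paper's recommendation as I understand it.
import numpy as np

def softmax_attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)        # rows normalized to sum to 1
    return w @ V

def sigmoid_attention(Q, K, V):
    n = Q.shape[0]                            # sequence length
    scores = Q @ K.T / np.sqrt(Q.shape[-1]) - np.log(n)   # bias b = -log(n)
    w = 1.0 / (1.0 + np.exp(-scores))         # elementwise; no coupling across keys
    return w @ V

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((8, 16)) for _ in range(3))
print(softmax_attention(Q, K, V).shape, sigmoid_attention(Q, K, V).shape)
```

Because the sigmoid is applied per entry, each attention weight is independent of the other keys in the row, which removes the row-wise normalization bottleneck of softmax.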
Harshay Shah (@harshays_):

If you’re at #ICLR2025, go watch Vimal Thilak🦉🐒 give an oral presentation at the @SparseLLMs workshop on scaling laws for pretraining MoE LMs! Had a great time co-leading this project with Samira Abnar & Vimal Thilak🦉🐒 at Apple MLR last summer.

When: Sun Apr 27, 9:30am
Where: Hall 4-07
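For background, the object of study here is the sparse mixture-of-experts (MoE) layer. The sketch below is a generic top-k-routed MoE in PyTorch, not the paper's recipe; the expert count, k, and the dense dispatch loop are illustrative assumptions.

```python
# Generic top-k mixture-of-experts layer; illustrative only.
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, d_model=256, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts))

    def forward(self, x):                           # x: (n_tokens, d_model)
        gates = self.router(x).softmax(dim=-1)      # (n_tokens, n_experts)
        topv, topi = gates.topk(self.k, dim=-1)     # each token picks k experts
        out = torch.zeros_like(x)
        for slot in range(self.k):                  # dense loop; real kernels batch this
            for e, expert in enumerate(self.experts):
                sel = topi[:, slot] == e
                if sel.any():
                    out[sel] += topv[sel, slot, None] * expert(x[sel])
        return out

moe = TopKMoE()
print(moe(torch.randn(10, 256)).shape)              # torch.Size([10, 256])
```

The scaling-law question is how quality varies as total parameters grow while only k of n_experts are active per token, i.e. compute stays roughly fixed while capacity grows.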

Randall Balestriero (@randall_balestr):

In less than 24h! Zoom will be open to all! Program below:

9:00am  -> 9:10am   opening
9:10am  -> 9:55am   Phillip Isola
9:55am  -> 10:40am  Thomas Serre
11:00am -> 11:45am  Eero Simoncelli
11:45am -> 12:30pm  Yi Ma
1:30pm  -> 2:15pm   Yann LeCun
2:15pm  -> 3:00pm
Miguel Angel Bautista (@itsbautistam):

We will be presenting this work at ICML25 in Vancouver! Great work by Yuyang Wang leading this project! I’m curious what the diffusion/fm community would want to see this type of model do. (Besides getting better FID on ImageNet 😂)

Shuangfei Zhai (@zhaisf):

Proud to report that TarFlow is accepted to #ICML2025 as a Spotlight 🎉 I’m really looking forward to new ideas and applications enabled by powerful Normalizing Flow models 🚀
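This is not TarFlow itself, just a reminder of what makes normalizing flows attractive: an invertible map whose exact log-likelihood follows from the change-of-variables formula, log p(x) = log p_z(f(x)) + log|det df/dx|. A single affine coupling layer with illustrative dimensions, sketched under those assumptions.

```python
# Illustrative affine-coupling flow: invertible, with exact log-likelihood.
import math
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim=4):
        super().__init__()
        # Predicts a scale s and shift t for the second half from the first half.
        self.net = nn.Sequential(nn.Linear(dim // 2, 64), nn.Tanh(),
                                 nn.Linear(64, dim))

    def forward(self, x):                    # x -> z, plus log|det dz/dx|
        x1, x2 = x.chunk(2, dim=-1)
        s, t = self.net(x1).chunk(2, dim=-1)
        return torch.cat([x1, x2 * s.exp() + t], dim=-1), s.sum(dim=-1)

    def inverse(self, z):                    # exact inverse: z -> x
        z1, z2 = z.chunk(2, dim=-1)
        s, t = self.net(z1).chunk(2, dim=-1)
        return torch.cat([z1, (z2 - t) * (-s).exp()], dim=-1)

flow = AffineCoupling()
x = torch.randn(3, 4)
z, logdet = flow(x)
# Change of variables: log p(x) = log N(z; 0, I) + log|det dz/dx|
log_px = -0.5 * (z ** 2).sum(-1) - 0.5 * z.shape[-1] * math.log(2 * math.pi) + logdet
print(torch.allclose(flow.inverse(z), x, atol=1e-5), log_px.shape)
```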

Vimal Thilak🦉🐒 (@aggieinca):

Yep. What is meant by image-like here? 🤔 The problem, or rather the frustrating aspect, of empirical work is that we have no idea what is optimal: anytime I dare make a claim, hypertuner bros laugh at me and release a new SoTA :).

Pavankumar Vasu (@pavankumarvasu):

Excited to share code & models for FastVLM, our blazing-fast Vision-Language Model appearing at #CVPR2025. Run it on-device with inference code optimized for Apple Silicon using #mlx. Code: github.com/apple/ml-fastv… Updated paper & results coming soon. Stay tuned! 👀

Tianqi Chen (@tqchenml):

ML systems infrastructure (compilers, inference engines, GPU acceleration, and more) is at the heart of the AI revolution. One thing I love about #MLSys2025 is that it brings together a high density of talent and a shared mindset in these directions. Starting next Monday!

Vimal Thilak🦉🐒 (@aggieinca):

Ahmad started a very interesting discussion. I wish we had an OpenReview-like thing to archive these discussions :). TMLR is a good venue. Correctness over subjective criteria (novelty!?) has a better chance of being useful down the road. Also Ecclesiastes 1:9.

Mohammed Adnan (@adnan_ahmad1306):

1/10 🧵
🔍Can weight symmetry provide insights into sparse training and the Lottery Ticket Hypothesis?

🧐We dive deep into this question in our latest paper, "Sparse Training from Random Initialization: Aligning Lottery Ticket Masks using Weight Symmetry", accepted at #ICML2025
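For background, the standard Lottery Ticket pipeline the paper builds on: train, extract a sparse mask by magnitude pruning, then "rewind" to the original initialization and train the masked subnetwork. The sketch below shows only that baseline recipe; the paper's weight-symmetry alignment is not reproduced here, and all sizes are illustrative.

```python
# Baseline Lottery Ticket recipe: magnitude-prune a trained model for a mask,
# then rewind to the original init. Hypothetical sizes; training loop elided.
import torch
import torch.nn as nn

def magnitude_mask(model, sparsity=0.8):
    """Global magnitude pruning: keep the largest-|w| (1 - sparsity) fraction."""
    flat = torch.cat([p.detach().abs().flatten() for p in model.parameters()])
    threshold = flat.quantile(sparsity)
    return [(p.detach().abs() > threshold).float() for p in model.parameters()]

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
init_state = {k: v.clone() for k, v in model.state_dict().items()}

# ... train `model` here ...

mask = magnitude_mask(model, sparsity=0.8)

# Rewind: restore the random init, then apply the ticket mask.
model.load_state_dict(init_state)
with torch.no_grad():
    for p, m in zip(model.parameters(), mask):
        p.mul_(m)
```

The paper's question, as the thread frames it, is whether such a mask can be made to work from a *different* random initialization by exploiting weight-space symmetries, rather than requiring the exact original init.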