Dan Fu (@realdanfu) 's Twitter Profile
Dan Fu

@realdanfu

Incoming assistant professor at UCSD CSE in MLSys. Currently recruiting students! Also running the kernels team @togethercompute.

ID: 1173687463790829568

Link: http://danfu.org | Joined: 16-09-2019 19:58:03

710 Tweets

5.5K Followers

205 Following

Austin Silveria (@austinsilveria) 's Twitter Profile Photo

chipmunk is up on arxiv!

across HunyuanVideo and Flux.1-dev, 5-25% of the intermediate activation values in attention and MLPs account for 70-90% of the change in activations across steps

caching + sparsity speeds up generation by recomputing only the fast-changing activations
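
A minimal sketch of the caching + sparsity idea described above, assuming a per-token layer such as an MLP; the function name, the `recompute_fraction` knob, and the top-k selection rule are illustrative only, not the Chipmunk implementation (which relies on custom sparse kernels):

```python
import torch

def sparse_layer_step(layer, x, cache, recompute_fraction=0.1):
    """Recompute only the fastest-changing token activations; reuse the cached
    output from the previous diffusion step for everything else.
    Illustrative sketch only -- not the Chipmunk kernels."""
    if cache is None:                                   # first step: compute everything
        out = layer(x)
        return out, {"x": x, "out": out}

    # Rank tokens by how much their inputs moved since the previous step.
    delta = (x - cache["x"]).abs().sum(dim=-1)          # [batch, tokens]
    k = max(1, int(recompute_fraction * delta.shape[-1]))
    idx = delta.topk(k, dim=-1).indices                 # fastest-changing positions

    out = cache["out"].clone()
    for b in range(x.shape[0]):                         # recompute only the selected tokens
        out[b, idx[b]] = layer(x[b, idx[b]])
    return out, {"x": x, "out": out}
```

In practice the speedup comes from fused sparse attention/MLP kernels rather than Python-level indexing like this, but the control flow is the same: cache the previous step, measure what changed, and recompute only the small fast-changing slice.
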
Infini-AI-Lab (@infiniailab) 's Twitter Profile Photo

🥳 Happy to share our new work –  Kinetics: Rethinking Test-Time Scaling Laws

🤔How to effectively build a powerful reasoning agent?

Existing compute-optimal scaling laws suggest 64K thinking tokens + 1.7B model > 32B model.
But this only shows half of the picture!

🚨 The O(N²)
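
The thread cuts off at the O(N²) term, but the gist can be illustrated with a toy cost model (my own back-of-the-envelope assumption, not the paper's formula): each generated token pays a parameter term of roughly 2 * n_params FLOPs plus an attention term proportional to the tokens generated so far, so long chains of thought pay a quadratic penalty that parameter-only scaling laws ignore. The model shapes below are guesses.

```python
def generation_cost(n_params, n_tokens, d_model, n_layers):
    """Toy FLOP count: linear parameter term + quadratic attention term (illustrative only)."""
    param_flops = 2 * n_params * n_tokens
    # Each new token attends to all previous tokens in every layer (~4 * d_model FLOPs each).
    attn_flops = 4 * n_layers * d_model * n_tokens * (n_tokens - 1) / 2
    return param_flops + attn_flops

# Hypothetical configurations, loosely matching the 1.7B-vs-32B comparison in the tweet.
small = generation_cost(n_params=1.7e9, n_tokens=64_000, d_model=2048, n_layers=28)
large = generation_cost(n_params=32e9,  n_tokens=4_000,  d_model=5120, n_layers=64)
print(f"1.7B model, 64K thinking tokens: {small:.2e} FLOPs")
print(f"32B model,   4K thinking tokens: {large:.2e} FLOPs")
```

Under this toy accounting the quadratic attention term comes to dominate the small model's budget at 64K thinking tokens, which is the kind of effect the tweet is gesturing at; the paper's actual analysis may weigh costs differently (e.g. memory and KV-cache traffic).
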
Sabri Eyuboglu (@eyuboglusabri) 's Twitter Profile Photo

When we put lots of text (e.g. a code repo) into LLM context, cost soars b/c of the KV cache’s size.

What if we trained a smaller KV cache for our documents offline? Using a test-time training recipe we call self-study, we find that this can reduce cache memory by 39x on average.
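
A hedged sketch of the general idea as the tweet describes it: replace the document's full KV cache with a much smaller set of trainable key/value slots, trained offline so the model behaves as if the full document were in context. The class, the `past_kv` argument, and the query-matching training loop are assumptions for illustration, not the paper's self-study recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnedCache(nn.Module):
    """A small trainable stand-in for a long document's KV cache (illustrative)."""
    def __init__(self, n_layers, n_heads, n_slots, head_dim):
        super().__init__()
        # 2 = keys and values; n_slots << number of document tokens.
        self.kv = nn.Parameter(0.02 * torch.randn(n_layers, 2, n_heads, n_slots, head_dim))

def train_cache(model, cache, make_batch, steps=1_000, lr=1e-3):
    """Offline loop: pose queries about the document and train the small cache so the
    model's answers match its answers with the full document in context."""
    opt = torch.optim.Adam(cache.parameters(), lr=lr)
    for _ in range(steps):
        queries, teacher_logits = make_batch()              # teacher ran with the full context
        student_logits = model(queries, past_kv=cache.kv)   # hypothetical API
        loss = F.kl_div(student_logits.log_softmax(-1),
                        teacher_logits.softmax(-1), reduction="batchmean")
        opt.zero_grad(); loss.backward(); opt.step()
    return cache
```
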
Hermann (@kumbonghermann) 's Twitter Profile Photo

Excited to be presenting our new work–HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation– at #CVPR2025 this week.

VAR (Visual Autoregressive Modelling) introduced a very nice way to formulate autoregressive image generation as a next-scale prediction task (from
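
For readers unfamiliar with VAR's formulation, here is a rough sketch of next-scale prediction: the image is represented as token maps at increasing resolutions, and each scale is predicted as a whole, conditioned on all coarser scales. `model.predict_scale` is a hypothetical method used only to show the control flow, not an API from the VAR or HMAR codebases.

```python
import torch

def next_scale_generation(model, scales=(1, 2, 4, 8, 16)):
    """Coarse-to-fine generation: predict each s x s token map conditioned on all
    coarser maps (rough sketch of VAR-style next-scale prediction)."""
    generated = []                                       # token maps, coarse to fine
    for s in scales:
        context = (torch.cat([g.flatten(1) for g in generated], dim=1)
                   if generated else None)
        tokens = model.predict_scale(context, size=s)    # hypothetical API: [batch, s*s]
        generated.append(tokens.view(-1, s, s))
    return generated[-1]                                 # finest-scale token map
```
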
Dan Fu (@realdanfu) 's Twitter Profile Photo

Announcing HMAR - Efficient Hierarchical Masked Auto-Regressive Image Generation, led by Hermann! HMAR is hardware-efficient: it reformulates autoregressive image generation in a way that can take advantage of tensor cores. Hermann is presenting it at CVPR this week!

Keshigeyan Chandrasegaran (@keshigeyan) 's Twitter Profile Photo

1/ Model architectures have been mostly treated as fixed post-training. 🌱 Introducing Grafting: A new way to edit pretrained diffusion transformers, allowing us to customize architectural designs on a small compute budget. 🌎 grafting.stanford.edu Co-led with Michael Poli

Dan Fu (@realdanfu) 's Twitter Profile Photo

And to close out a trio of diffusion papers… Super excited to announce Grafting - a method for distilling pretrained diffusion transformers into *new architectures*, led by Keshigeyan Chandrasegaran! Swap attention for new primitives for 2% of pretraining cost; exciting for modeling research!
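
A minimal sketch of what "swap attention for new primitives and distill" could look like for a single block; the regression objective, step count, and `new_primitive` module are assumptions for illustration, not the actual grafting recipe (see grafting.stanford.edu).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def graft_block(teacher_block, new_primitive, activations, steps=2_000, lr=1e-4):
    """Replace one pretrained operator (e.g. softmax attention) with a new primitive
    and train the new module to match the old one's outputs. Illustrative sketch."""
    opt = torch.optim.Adam(new_primitive.parameters(), lr=lr)
    teacher_block.eval()
    for step, x in zip(range(steps), activations):   # x: inputs saved from the pretrained model
        with torch.no_grad():
            target = teacher_block(x)                # what the original operator produced
        loss = F.mse_loss(new_primitive(x), target)
        opt.zero_grad(); loss.backward(); opt.step()
    return new_primitive                             # drop-in replacement for the old block
```

Regressing each new operator onto activations the pretrained model already produces is far cheaper than training from scratch, which is plausibly how a figure like the ~2% of pretraining cost cited in the tweet becomes reachable.
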

Alex Ratner (@ajratner) 's Twitter Profile Photo

Scale alone is not enough for AI data. Quality and complexity are equally critical. Excited to support all of these for LLM developers with Snorkel AI Data-as-a-Service, and to share our new leaderboard! — Our decade-plus of research and work in AI data has a simple point:

soham (@sohamgovande) 's Twitter Profile Photo

Chipmunks can now hop across multiple GPU architectures (sm_80, sm_89, sm_90). You can get a 1.4-3x lossless speedup when generating videos on A100s, 4090s, and H100s!

Chipmunks also play with more open-source models: Mochi, Wan, & others (w/ tutorials for integration) 🐿️
Hermann (@kumbonghermann) 's Twitter Profile Photo

Happy to share that our HMAR code and pre-trained models are now publicly available. Please try them out here:
Code: github.com/NVlabs/HMAR
Checkpoints: huggingface.co/nvidia/HMAR