Yuki Mitsufuji (@mittu1204) Twitter Tweets • TwiCopy

A list of diffusion works & tutorial (at #ISMIR2024) from our lab! [ML] arxiv.org/abs/2405.17251 #NeurIPS24 (GenWarp: Novel View Synthesis) arxiv.org/abs/2405.14822 #NeurIPS24 (PaGoDA: Multi-Scale 1 Step Generator) arxiv.org/abs/2310.02279 #ICLR24 (CTM: Fast Image Gen.)

Likes: 62 | Replies: 1 | Reposts: 15


Yuki Mitsufuji

@mittu1204

8 months ago

🎶Large music models from our team🎶: 1. SoniDo🎼 for music mixing, demixing, transcription, etc. pdf: arxiv.org/abs/2411.01135 2. OpenMU🧙‍♂️ for music captioning, reasoning, lyric understanding, etc. pdf: arxiv.org/abs/2410.15573 code: mzhaojp22.github.io/open_music_und… demo ISMIR Conference :

Likes: 48 | Replies: 0 | Reposts: 8


Yuki Mitsufuji

@mittu1204

7 months ago

I'm very happy to see that MMAudio, which my talented colleagues (mi141, A. Hayakawa, Takashi Shibuya) and intern (Rex Cheng) at Sony AI have invested their time and effort into, is being tested by so many creative people on X. arXiv: arxiv.org/abs/2412.15322

Likes: 89 | Replies: 6 | Reposts: 12


Yuki Mitsufuji

@mittu1204

6 months ago

When you register for #ICASSP2025, don't forget to select our tutorial on diffusion models for audio: 🎶Transforming Chaos into Harmony: Diffusion Models in Audio Signal Processing🎶 See you in Hyderabad, India!🇮🇳

Likes: 47 | Replies: 1 | Reposts: 8


Yuki Mitsufuji

@mittu1204

6 months ago

Fast sampling methods for discrete diffusion from our lab 🏎️ Our sampling schedule optimization method (Jump Your Steps) has been accepted at #ICLR2025. Another (Di4C) is about distillation for discrete diffusion!

Likes: 11 | Replies: 0 | Reposts: 0


Yuki Mitsufuji

@mittu1204

5 months ago

We got 3 papers accepted at #CVPR2025 and 6 at #ICLR2025. Kudos to my team members and interns for their enormous effort to achieve this🪄 Paper details will be released soon✍️

Likes: 116 | Replies: 5 | Reposts: 1


Yuki Mitsufuji

@mittu1204

5 months ago

#CVPR2025 Accepted Papers cvpr.thecvf.com/Conferences/20…

Likes: 102 | Replies: 0 | Reposts: 9


Yuki Mitsufuji

@mittu1204

4 months ago

Here is MMAudio from Sony AI, one of our three papers accepted at #CVPR2025

Likes: 50 | Replies: 2 | Reposts: 5


Yuki Mitsufuji

@mittu1204

4 months ago

The schedule of our #ICLR2025 papers is available. Don't miss poster sessions 1, 3, and 5, held on April 23rd, 24th, and 25th, respectively

Likes: 21 | Replies: 0 | Reposts: 1


Yuki Mitsufuji

@mittu1204

4 months ago

🇮🇳Our tutorial on diffusion for audio signal processing was a great success, thanks to the amazing audience!! We're so grateful you chose our tutorial. We'll be uploading our materials soon!📃 sites.google.com/view/diffusion… #ICASSP2025

Likes: 87 | Replies: 1 | Reposts: 9


Yuki Mitsufuji

@mittu1204

3 months ago

The #ICLR2025 week is coming soon! Check out our intern info as well as 6 papers from our lab: [Intern Positions] - Intern for Audio-Visual & Motion: ai.sony/joinus/job-rol… - Intern for Audio-Visual: sonyglobal.wd1.myworkdayjobs.com/en-US/SonyJapa… - Intern for 3D: ai.sony/joinus/job-rol… [Main

Likes: 43 | Replies: 0 | Reposts: 10


Yuki Mitsufuji

@mittu1204

3 months ago

Our lab scored 3 papers at #ICML2025!🎉 1. Distillation of Discrete Diffusion through Dimensional Correlations (Di4C), joint work with いんそうさん x.com/kurowassann621… 2. Supervised Contrastive Learning from Weakly-labeled Audio Segments for Musical Version Matching (CLEWS)

Likes: 143 | Replies: 0 | Reposts: 8


Yuki Mitsufuji

@mittu1204

3 months ago

It’s fulfilling that our lab’s audio papers were accepted to conferences with high H5-index rankings, where audio research is less common. I’d love to see more audio research highlighted there: - SoundCTM #ICLR2025 - MMDisCo #ICLR2025 - MMAudio #CVPR2025 - CLEWS #ICML2025

Likes: 57 | Replies: 0 | Reposts: 1


Yuki Mitsufuji

@mittu1204

2 months ago

Gemini Diffusion looks great! Our recent work on optimizing sampling schedules (JYS, #ICLR2025) and a distillation method (Di4C, #ICML2025) for discrete diffusion introduces approaches that could potentially make dLLM inference even faster. If you're looking for literature on

Likes: 44 | Replies: 1 | Reposts: 0


Yuki Mitsufuji

@mittu1204

2 months ago

MMDisCo at #ICLR2025 — a framework for co-generation across any modality combination, assuming modality-specific pretrained models. While "Seeing & Hearing" achieves multimodal alignment via ImageBind, MMDisCo uses a discriminator trained on synced/unsynced pairs as

Likes: 18 | Replies: 0 | Reposts: 2
