Yuki Mitsufuji (@mittu1204)'s Twitter Profile
Yuki Mitsufuji

@mittu1204

PhD, Distinguished Engineer @Sony, Lead Research Scientist/VP of AI Research @SonyAI_global, Head of Creative AI Lab, Associate Prof. @tokyotech_jp

ID: 94236519

Link: https://www.yukimitsufuji.com/

Joined: 03-12-2009 02:39:47

3.3K Tweets

3.3K Followers

93 Following

Here's a sneak peek of a crowd-based competition on sounding video generation starting from Oct. 1st! #ECCV2024 aicrowd.com/challenges/sou…

A list of diffusion works & tutorial (at #ISMIR2024) from our lab! [ML] arxiv.org/abs/2405.17251 #NeurIPS24 (GenWarp: Novel View Synthesis) arxiv.org/abs/2405.14822 #NeurIPS24 (PaGoDA: Multi-Scale 1 Step Generator) arxiv.org/abs/2310.02279 #ICLR24 (CTM: Fast Image Gen.)

🎶Large music models from our team🎶: 1. SoniDo🎼 for music mixing, demixing, transcription, etc. pdf: arxiv.org/abs/2411.01135 2. OpenMU🧙‍♂️ for music captioning, reasoning, lyric understanding, etc. pdf: arxiv.org/abs/2410.15573 code: mzhaojp22.github.io/open_music_und… demo ISMIR Conference :

I'm very happy to see that MMAudio, which my talented colleagues (mi141, A. Hayakawa, Takashi Shibuya) and intern (Rex Cheng) at Sony AI have invested their time and effort into, is being tested by so many creative people on X. arXiv: arxiv.org/abs/2412.15322

When you register for #ICASSP2025, don't forget to select our tutorial on diffusion models for audio: 🎶Transforming Chaos into Harmony: Diffusion Models in Audio Signal Processing🎶 See you in Hyderabad, India!🇮🇳

Fast sampling methods for discrete diffusion from our lab 🏎️ Our sampling schedule optimization method (Jump Your Steps) is accepted at #ICLR2025. Another (Di4C) is about distillation for discrete diffusion!

We got 3 papers at #CVPR2025 and 6 at #ICLR2025. Kudos to my team members and interns for their enormous effort to achieve this🪄 Paper details will be released soon✍️

The schedule of our #ICLR2025 papers is available. Don't miss poster sessions 1, 3, and 5, held on April 23rd, 24th, and 25th, respectively

🇮🇳Our tutorial on diffusion for audio signal processing was a great success, thanks to the amazing audience!! We're so grateful you chose our tutorial. We'll be uploading our materials soon!📃 sites.google.com/view/diffusion… #ICASSP2025

The #ICLR2025 week is coming soon! Check out our intern info as well as 6 papers from our lab: [Intern Positions] - Intern for Audio-Visual & Motion: ai.sony/joinus/job-rol… - Intern for Audio-Visual: sonyglobal.wd1.myworkdayjobs.com/en-US/SonyJapa… - Intern for 3D: ai.sony/joinus/job-rol… [Main

Our lab scored 3 papers at #ICML2025!🎉 1. Distillation of Discrete Diffusion through Dimensional Correlations (Di4C), joint work with いんそうさん x.com/kurowassann621… 2. Supervised Contrastive Learning from Weakly-labeled Audio Segments for Musical Version Matching (CLEWS)

It’s fulfilling that our lab’s audio papers were accepted to conferences with high H5-index rankings, where audio research is less common. I’d love to see more audio research highlighted there: - SoundCTM #ICLR2025 - MMDisCo #ICLR2025 - MMAudio #CVPR2025 - CLEWS #ICML2025

Gemini Diffusion looks great! Our recent work on optimizing sampling schedules (JYS, #ICLR2025) and a distillation method (Di4C, #ICML2025) for discrete diffusion introduces approaches that could potentially make dLLM inference even faster. If you're looking for literature on

MMDisCo at #ICLR2025 — a framework for co-generation across any modality combination, assuming modality-specific pretrained models. While "Seeing & Hearing" achieves multimodal alignment via ImageBind, MMDisCo uses a discriminator trained on synced/unsynced pairs as
