MMLab@NTU (@mmlabntu)'s Twitter Profile
MMLab@NTU

@mmlabntu

Multimedia Laboratory @NTUsg, affiliated with S-Lab.
Computer Vision, Image Processing, Computer Graphics, Deep Learning

ID: 1394997810584428547

Link: http://www.mmlab-ntu.com
Joined: 19-05-2021 12:46:26

69 Tweets

1.1K Followers

18 Following

Chen Change Loy (@ccloy)'s Twitter Profile Photo

📽️📽️ The code of Rerender A Video is now available at github.com/williamyang199… #SIGGRAPHAsia2023 #SIGGRAPHAsia

Chen Change Loy (@ccloy)'s Twitter Profile Photo

AK Check out MMLab@NTU's concurrent work, titled "Interpret Vision Transformers as ConvNets with Dynamic Convolutions":
📄 Read the paper here: arxiv.org/abs/2309.10713
🧐 We explored replacing softmax in Transformers with constant scaling and ReLU (with optional BN/LN). Constant…
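The softmax-free attention idea above lends itself to a compact illustration. Below is a minimal PyTorch sketch, assuming one plausible form of the trick: the softmax over attention scores is replaced by a ReLU followed by a constant scale (here 1/sequence-length). The exact normalization and placement of BN/LN in the paper may differ.

```python
import torch
import torch.nn.functional as F

def softmax_attention(q, k, v):
    # Standard scaled dot-product attention, for comparison.
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return scores.softmax(dim=-1) @ v

def relu_attention(q, k, v):
    # Softmax replaced by ReLU plus a constant scale. Dividing by
    # the sequence length keeps output magnitude roughly independent
    # of n -- an assumed choice; the paper's constant may differ.
    n = k.shape[-2]
    scores = F.relu(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5)
    return (scores / n) @ v

q = k = v = torch.randn(2, 4, 16, 32)   # (batch, heads, tokens, dim)
print(relu_attention(q, k, v).shape)    # torch.Size([2, 4, 16, 32])
```

Because ReLU and the constant scale act elementwise, the attention map loses softmax's row-wise normalization, which is what lets it be read as a dynamic convolution.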

Chen Change Loy (@ccloy)'s Twitter Profile Photo

Chase Lean Try StableSR, a diffusion model-based upscaler. We took extra care to maintain fidelity. Code and model: github.com/IceClear/Stabl…
Ziwei Liu (@liuziwei7)'s Twitter Profile Photo

🔥🔥 We are excited to announce #Vchitect, an open-source project for video generative models, on Hugging Face.

📽️ LaVie (Text2Video model)
- Code: github.com/Vchitect/LaVie
- Demo: huggingface.co/spaces/Vchitec…

📽️ SEINE (Image2Video model)
- Code: github.com/Vchitect/SEINE
- Demo: huggingface.co/spaces/Vchitec…

MMLab@NTU (@mmlabntu)'s Twitter Profile Photo

EdgeSAM - Prompt-In-the-Loop Distillation for On-Device Deployment of SAM
🔗 Project page: mmlab-ntu.com/project/edgesa…
🔗 GitHub: github.com/chongzhou96/Ed…
🤗 Hugging Face: huggingface.co/spaces/chongzh…
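As a rough illustration of what "prompt-in-the-loop" distillation can mean, here is a toy PyTorch sketch: the student is fit to the teacher's masks, and at each iteration the next point prompt is sampled where the two disagree most. Every module, shape, and name here is a hypothetical placeholder, not EdgeSAM's actual code or API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-ins: a frozen "teacher" (SAM-like) and a small
# "student"; neither is EdgeSAM's real architecture.
teacher = nn.Conv2d(3, 1, 3, padding=1).eval()
student = nn.Conv2d(3, 1, 3, padding=1)
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

def predict_mask(model, image, prompt):
    # A real SAM conditions on the prompt; this toy keeps only the
    # interface and ignores it.
    return model(image)

image = torch.randn(1, 3, 64, 64)
prompt = (32, 32)  # initial (x, y) point prompt

for step in range(3):
    with torch.no_grad():
        t_mask = predict_mask(teacher, image, prompt)
    s_mask = predict_mask(student, image, prompt)
    loss = F.mse_loss(s_mask, t_mask)
    opt.zero_grad()
    loss.backward()
    opt.step()

    # "In the loop": place the next point prompt at the pixel where
    # student and teacher masks disagree the most.
    with torch.no_grad():
        diff = (s_mask - t_mask).abs()[0, 0]
        idx = int(diff.argmax())
        prompt = (idx % 64, idx // 64)
    print(f"step {step}: loss={loss.item():.4f}, next prompt={prompt}")
```

The point of the loop is that prompts concentrate supervision on the student's current failure regions, rather than being sampled uniformly.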

Chen Change Loy (@ccloy)'s Twitter Profile Photo

🔬 Our study introduces "Upscale-A-Video," a text-guided latent diffusion framework for video upscaling. It ensures temporal coherence locally & globally, balancing fidelity and quality.
🚀 Project page: shangchenzhou.com/projects/upsca…
💻 GitHub: github.com/sczhou/Upscale…
🎥 Video: …

Xingang Pan (@xingangp)'s Twitter Profile Photo

(1/2) We are actively seeking PhD candidates from various countries to foster diversity in our research group at Nanyang Technological University. Know someone interested in a PhD with us? Please refer them to our team. Thanks for supporting diversity in academia! 🌍🎓

The AI Talks (@theaitalksorg)'s Twitter Profile Photo

The upcoming AI talk:

🌋 LLaVA 🦙
A Vision-and-Language Approach to Computer Vision in the Wild, by Chunyuan Li

More info: mailchi.mp/1242f078b2b1/a…
Subscribe: mailchi.mp/4417dc2cde83/t…
Chen Change Loy (@ccloy)'s Twitter Profile Photo

📸🌟 Attention all photography and imaging enthusiasts! Join us at the Third MIPI Workshop at #CVPR2024!

📍 Location: Arch 213
⏰ Time: 08:30 AM - 12:10 PM
🌐 Website: mipi-challenge.org

Don't miss out on an exciting lineup of speakers:
🔹 Lei Zhang: How Far Are We From…
Chen Change Loy (@ccloy)'s Twitter Profile Photo

We turned our method, rejected by CVPR and ECCV, into the iOS app "Cutcha".

EdgeSAM, our fast Segment Anything Model, runs at over 30 FPS on an iPhone 14. Enjoy intuitive one-touch object selection and precise editing, all processed locally on your device. No cloud needed!
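On-device deployment like this typically means exporting the network to Core ML so iOS can run it on the Neural Engine or GPU. A generic sketch of that export path is below, using coremltools on a stand-in model; it is illustrative only, not EdgeSAM's actual export pipeline.

```python
import torch
import coremltools as ct

# Stand-in network; substitute the real image encoder/decoder here.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3, padding=1),
    torch.nn.ReLU(),
).eval()

example = torch.randn(1, 3, 256, 256)
traced = torch.jit.trace(model, example)   # TorchScript via tracing

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="image", shape=example.shape)],
    convert_to="mlprogram",            # modern Core ML format
    compute_units=ct.ComputeUnit.ALL,  # allow Neural Engine / GPU / CPU
)
mlmodel.save("Encoder.mlpackage")      # drop into an Xcode project
```

Running the exported model locally is what makes the "no cloud needed" claim possible: inference latency is bounded by the phone's accelerator, not by a network round trip.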
Chen Change Loy (@ccloy)'s Twitter Profile Photo

🚀 Meet Harmon – a unified model for both image generation and understanding! Trained with a shared masked autoregressive encoder, it sets new benchmarks on GenEval & MJHQ30K. 🖼️💬

Try the live demo now on Hugging Face: 👉 huggingface.co/spaces/wusize/…
Paper: arxiv.org/abs/2503.21979

AK (@_akhaliq)'s Twitter Profile Photo

Aero-1-Audio is out on Hugging Face:
- Trained in <24 h on just 16×H100
- Handles 15+ min audio seamlessly
- Outperforms bigger models like Whisper, Qwen-2-Audio & commercial services from ElevenLabs/Scribe
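The tweet doesn't show Aero-1-Audio's inference interface, but the usual way to keep 15+ minute audio within memory limits is chunked long-form transcription. A sketch of that pattern with the Hugging Face transformers pipeline is below, using Whisper purely as a stand-in model and a hypothetical local file.

```python
from transformers import pipeline

# Chunking splits long audio into fixed windows with overlap
# (stride), so memory use stays bounded regardless of length.
asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-small",  # stand-in, not Aero-1-Audio
    chunk_length_s=30,             # 30 s windows
    stride_length_s=5,             # 5 s overlap between windows
)

result = asr("talk_15min.wav")     # hypothetical 15-minute recording
print(result["text"])
```

The overlap region lets the pipeline stitch chunk boundaries without dropping or duplicating words.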

Ziqi Huang (@ziqi_huang_)'s Twitter Profile Photo

🎬 CVPR 2025 Tutorial
From Video Generation to World Model

🚀 Hosted by MMLab@NTU × Kuaishou, etc.
📅 June 11 | Nashville
🔗 world-model-tutorial.github.io
🧠 Video is just the start. World modeling is the goal.
#CVPR2025 #WorldModel
Shulin Tian (@shulin_tian)'s Twitter Profile Photo

🎥 Video is already a tough modality for reasoning. Egocentric video? Even tougher! It is longer, messier, and harder.

💡 How do we tackle these extremely long, information-dense sequences without exhausting GPU memory or hitting API limits? We introduce 👓 Ego-R1, a framework…