MMLab@NTU (@mmlabntu)'s Twitter Profile
MMLab@NTU

@mmlabntu

Multimedia Laboratory @NTUsg, affiliated with S-Lab.
Computer Vision, Image Processing, Computer Graphics, Deep Learning

ID: 1394997810584428547

Link: http://www.mmlab-ntu.com
Joined: 19-05-2021 12:46:26

69 Tweets

1.1K Followers

18 Following

Chen Change Loy (@ccloy)'s Twitter Profile Photo

📽️📽️ The code of Rerender A Video is now available at github.com/williamyang199… #SIGGRAPHAsia2023 #SIGGRAPHAsia

Chen Change Loy (@ccloy)'s Twitter Profile Photo

AK Check out MMLab@NTU's concurrent work, titled "Interpret Vision Transformers as ConvNets with Dynamic Convolutions":
📄 Read the paper here: arxiv.org/abs/2309.10713
🧐 We explored replacing softmax in Transformers with constant scaling and ReLU (with optional BN/LN). Constant…
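The softmax-free attention idea above lends itself to a compact illustration. Below is a minimal PyTorch sketch, assuming one plausible form of the trick: the softmax over attention scores is replaced by a ReLU followed by a constant scale (here 1/sequence-length). The exact normalization and placement of BN/LN in the paper may differ.

```python
import torch
import torch.nn.functional as F

def softmax_attention(q, k, v):
    # Standard scaled dot-product attention, for comparison.
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return scores.softmax(dim=-1) @ v

def relu_attention(q, k, v):
    # Softmax replaced by ReLU plus a constant scale. Dividing by
    # the sequence length keeps output magnitude roughly independent
    # of n -- an assumed choice; the paper's constant may differ.
    n = k.shape[-2]
    scores = F.relu(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5)
    return (scores / n) @ v

q = k = v = torch.randn(2, 4, 16, 32)   # (batch, heads, tokens, dim)
print(relu_attention(q, k, v).shape)    # torch.Size([2, 4, 16, 32])
```

Because ReLU and the constant scale act elementwise, the attention map loses softmax's row-wise normalization, which is what lets it be read as a dynamic convolution.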

Chen Change Loy (@ccloy)'s Twitter Profile Photo

Chase Lean Try StableSR, a diffusion model-based upscaler. We took extra care to maintain fidelity. Code and model: github.com/IceClear/Stabl…
Ziwei Liu (@liuziwei7)'s Twitter Profile Photo

🔥🔥 We are excited to announce #Vchitect, an open-source project for video generative models, on Hugging Face.

📽️ LaVie (Text2Video model)
- Code: github.com/Vchitect/LaVie
- Demo: huggingface.co/spaces/Vchitec…

📽️ SEINE (Image2Video model)
- Code: github.com/Vchitect/SEINE
- Demo: huggingface.co/spaces/Vchitec…

MMLab@NTU (@mmlabntu)'s Twitter Profile Photo

EdgeSAM - Prompt-In-the-Loop Distillation for On-Device Deployment of SAM
🔗 Project page: mmlab-ntu.com/project/edgesa…
🔗 GitHub: github.com/chongzhou96/Ed…
🤗 Hugging Face: huggingface.co/spaces/chongzh…
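As a rough illustration of what "prompt-in-the-loop" distillation can mean, here is a toy PyTorch sketch: the student is fit to the teacher's masks, and at each iteration the next point prompt is sampled where the two disagree most. Every module, shape, and name here is a hypothetical placeholder, not EdgeSAM's actual code or API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-ins: a frozen "teacher" (SAM-like) and a small
# "student"; neither is EdgeSAM's real architecture.
teacher = nn.Conv2d(3, 1, 3, padding=1).eval()
student = nn.Conv2d(3, 1, 3, padding=1)
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

def predict_mask(model, image, prompt):
    # A real SAM conditions on the prompt; this toy keeps only the
    # interface and ignores it.
    return model(image)

image = torch.randn(1, 3, 64, 64)
prompt = (32, 32)  # initial (x, y) point prompt

for step in range(3):
    with torch.no_grad():
        t_mask = predict_mask(teacher, image, prompt)
    s_mask = predict_mask(student, image, prompt)
    loss = F.mse_loss(s_mask, t_mask)
    opt.zero_grad()
    loss.backward()
    opt.step()

    # "In the loop": place the next point prompt at the pixel where
    # student and teacher masks disagree the most.
    with torch.no_grad():
        diff = (s_mask - t_mask).abs()[0, 0]
        idx = int(diff.argmax())
        prompt = (idx % 64, idx // 64)
    print(f"step {step}: loss={loss.item():.4f}, next prompt={prompt}")
```

The point of the loop is that prompts concentrate supervision on the student's current failure regions, rather than being sampled uniformly.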

Chen Change Loy (@ccloy)'s Twitter Profile Photo

🔬 Our study introduces "Upscale-A-Video," a text-guided latent diffusion framework for video upscaling. It ensures temporal coherence locally & globally, balancing fidelity and quality.
🚀 Project page: shangchenzhou.com/projects/upsca…
💻 GitHub: github.com/sczhou/Upscale…
🎥 Video: …

Xingang Pan (@xingangp)'s Twitter Profile Photo

(1/2) We are actively seeking PhD candidates from various countries to foster diversity in our research group at Nanyang Technological University. Know someone interested in a PhD with us? Please refer them to our team. Thanks for supporting diversity in academia! 🌍🎓

The AI Talks (@theaitalksorg)'s Twitter Profile Photo

The upcoming AI talk:

🌋 LLaVA 🦙
A Vision-and-Language Approach to Computer Vision in the Wild, by Chunyuan Li

More info: mailchi.mp/1242f078b2b1/a…
Subscribe: mailchi.mp/4417dc2cde83/t…
Chen Change Loy (@ccloy)'s Twitter Profile Photo

📸🌟 Attention all photography and imaging enthusiasts! Join us at the Third MIPI Workshop at #CVPR2024!

📍 Location: Arch 213
⏰ Time: 08:30 AM - 12:10 PM
🌐 Website: mipi-challenge.org

Don't miss out on an exciting lineup of speakers:
🔹 Lei Zhang: How Far Are We From…
Chen Change Loy (@ccloy)'s Twitter Profile Photo

We turned our method, rejected by CVPR and ECCV, into the iOS app "Cutcha".

EdgeSAM, our fast Segment Anything Model, runs at over 30 FPS on an iPhone 14. Enjoy intuitive one-touch object selection and precise editing, all processed locally on your device. No cloud needed!
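On-device deployment like this typically means exporting the network to Core ML so iOS can run it on the Neural Engine or GPU. A generic sketch of that export path is below, using coremltools on a stand-in model; it is illustrative only, not EdgeSAM's actual export pipeline.

```python
import torch
import coremltools as ct

# Stand-in network; substitute the real image encoder/decoder here.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, 3, padding=1),
    torch.nn.ReLU(),
).eval()

example = torch.randn(1, 3, 256, 256)
traced = torch.jit.trace(model, example)   # TorchScript via tracing

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="image", shape=example.shape)],
    convert_to="mlprogram",            # modern Core ML format
    compute_units=ct.ComputeUnit.ALL,  # allow Neural Engine / GPU / CPU
)
mlmodel.save("Encoder.mlpackage")      # drop into an Xcode project
```

Running the exported model locally is what makes the "no cloud needed" claim possible: inference latency is bounded by the phone's accelerator, not by a network round trip.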
Chen Change Loy (@ccloy)'s Twitter Profile Photo

🚀 Meet Harmon – a unified model for both image generation and understanding! Trained with a shared masked autoregressive encoder, it sets new benchmarks on GenEval & MJHQ30K. 🖼️💬

Try the live demo now on Hugging Face: 👉 huggingface.co/spaces/wusize/…
Paper: arxiv.org/abs/2503.21979

AK (@_akhaliq)'s Twitter Profile Photo

Aero-1-Audio is out on Hugging Face:
- Trained in <24 h on just 16×H100
- Handles 15+ min audio seamlessly
- Outperforms bigger models like Whisper, Qwen-2-Audio & commercial services from ElevenLabs/Scribe
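The tweet doesn't show Aero-1-Audio's inference interface, but the usual way to keep 15+ minute audio within memory limits is chunked long-form transcription. A sketch of that pattern with the Hugging Face transformers pipeline is below, using Whisper purely as a stand-in model and a hypothetical local file.

```python
from transformers import pipeline

# Chunking splits long audio into fixed windows with overlap
# (stride), so memory use stays bounded regardless of length.
asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-small",  # stand-in, not Aero-1-Audio
    chunk_length_s=30,             # 30 s windows
    stride_length_s=5,             # 5 s overlap between windows
)

result = asr("talk_15min.wav")     # hypothetical 15-minute recording
print(result["text"])
```

The overlap region lets the pipeline stitch chunk boundaries without dropping or duplicating words.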

Ziqi Huang (@ziqi_huang_)'s Twitter Profile Photo

🎬 CVPR 2025 Tutorial
From Video Generation to World Model

🚀 Hosted by MMLab@NTU × Kuaishou, etc.
📅 June 11 | Nashville
🔗 world-model-tutorial.github.io
🧠 Video is just the start. World modeling is the goal.
#CVPR2025 #WorldModel
Shulin Tian (@shulin_tian)'s Twitter Profile Photo

🎥 Video is already a tough modality for reasoning. Egocentric video? Even tougher! It is longer, messier, and harder.

💡 How do we tackle these extremely long, information-dense sequences without exhausting GPU memory or hitting API limits? We introduce 👓 Ego-R1, a framework…