camenduru (@camenduru)'s Twitter Profile

building 🥪 @tost_ai ❤ open source github.com/camenduru

ID: 36723

Joined: 02-12-2006 08:14:37

2.2K Tweets

20.2K Followers

4.4K Following

Toby Kim (@_doyeob_):

Two undergrads. One still in the military. Zero funding. One ridiculous goal: build a TTS model that rivals NotebookLM Podcast, ElevenLabs Studio, and Sesame CSM. Somehow… we pulled it off. Here’s how 👇

camenduru (@camenduru):

🎧 Dia is a 1.6B-parameter text-to-speech model created by Nari Labs. (Apache 2.0) 🔊 Jupyter Notebook 🥳 Thanks to Toby Kim ❤ Nari Labs ❤ 🌐page: tally.so/r/meokbo 🧬code: github.com/nari-labs/dia 🍊jupyter: github.com/camenduru/dia-…

Ostris (@ostrisai):

Flex.2-preview is here with text-to-image, universal control (line, pose, depth), and inpainting all baked into one model. Fine-tunable with AI-Toolkit, Apache 2.0 license, 8B parameters. Link in 🧵

Yin Cui (@yincuicv):

Introducing the Describe Anything Model (DAM), a powerful Multimodal LLM that generates detailed descriptions for user-specified regions in images or videos using points, boxes, scribbles, or masks. Open-source code, models, demo, data, and benchmark at: describe-anything.github.io

PyTorch (@pytorch):

Update from the PyTorch maintainers: 2.7 is out now.
🔹 Support for NVIDIA Blackwell (CUDA 12.8)
🔹 Mega Cache
🔹 torch.compile for Function Modes
🔹 FlexAttention updates
🔹 Intel GPU perf boost
🔗 Blog: hubs.la/Q03jBPSL0
📄 Release notes: hubs.la/Q03jBPlW0
#PyTorch
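The `torch.compile` entry above can be sketched in a few lines. This is a hedged illustration, not from the release notes: the `gelu_mlp` toy function is an assumption of mine, and the debug `backend="eager"` is used so the sketch runs without a C++ toolchain (the default backend is `"inductor"`).

```python
# Minimal torch.compile sketch (assumes PyTorch >= 2.0 installed).
import torch

def gelu_mlp(x, w1, w2):
    # Toy two-layer MLP in plain eager PyTorch.
    return torch.nn.functional.gelu(x @ w1) @ w2

# Same call signature as the original function; first call triggers tracing.
compiled_mlp = torch.compile(gelu_mlp, backend="eager")

x = torch.randn(8, 16)
w1 = torch.randn(16, 32)
w2 = torch.randn(32, 16)

eager_out = gelu_mlp(x, w1, w2)
compiled_out = compiled_mlp(x, w1, w2)
```

With the default `"inductor"` backend the compiled call can fuse kernels for a real speedup; `"eager"` only exercises the tracing path, which is why it is handy for smoke tests.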

ACE Studio (@acestudio_en):

We’re excited to release ACE-Step / ACE-Step-v1-3.5B, a fast, versatile DiT-based foundation model for music generation that runs on consumer-grade GPUs. With its simple architecture and low hardware requirements, it’s easy to fine-tune for various music tasks, empowering, not…

Lightricks (@lightricks):

Meet LTX-Video 13B, our latest and most capable open-source video generation model.
- 13B parameters
- Multiscale rendering for sharper detail
- Smarter motion + scene understanding
- Keyframes, character + camera movement, multi-shot support
- Still fast – and runs locally

camenduru (@camenduru):

🎸 🎙 🎵 ACE-Step: A Step Towards Music Generation Foundation Model 📻 🎶 Jupyter Notebook 🥳 Thanks to Gong Junmin ❤ Sean Zhao ❤ Sen Wang ❤ Shengyuan Xu ❤ Joe Guo ❤ 🌐page: ace-step.github.io 🧬code: github.com/ace-step/ACE-S… 🍊jupyter: github.com/camenduru/ACE-…

Hunyuan (@tencenthunyuan):

🚀 Introducing HunyuanCustom: An open-source, multimodal-driven architecture for customized video generation, powered by HunyuanVideo-13B. Outperforming existing open-source models, it rivals top closed-source solutions! 🎥 Highlights: ✅Subject Consistency: Maintains identity…

Hunyuan (@tencenthunyuan):

🚀You can use HunyuanCustom on ComfyUI. Special thanks to Kijai Jukka Seppänen again! HunyuanCustom has been integrated into [ComfyUI-HunyuanVideoWrapper](github.com/kijai/ComfyUI-…) by [Kijai](github.com/kijai). To use it, you need to: 1️⃣ Download the `fp8_scaled` model…

Wan (@alibaba_wan):

✨ All in One, Wan for All ✨ We are excited to introduce our latest model to our talented community creators: Wan2.1-VACE, an All-in-One Video Creation and Editing model. Model size: 1.3B, 14B. License: Apache-2.0. 📌 Wan2.1-VACE provides solutions for various tasks, including…

Jake Steinerman (@jasteinerman):

We built an entire VR game... and open-sourced the whole thing. Introducing "North Star" – play it today on Quest, and download the entire project on GitHub!

Zachary Novack @ICLR2025 🇸🇬 (@zacknovack):

Releasing Stable Audio Open Small! 75ms GPU latency! 7s *mobile* CPU latency! How? w/Adversarial Relativistic Contrastive (ARC) Post-Training! 📘:arxiv.org/abs/2505.08175 🥁:arc-text2audio.github.io/web/ 🤗:huggingface.co/stabilityai/st… Here’s how we made the fastest TTA out there🧵

Baku (@bk_sakurai):

*Video generation: I posted a note article on having an original song made with Suno sung by ComfyUI-FLOAT. #comfyui note.com/bakushu/n/n1f8…

Bin Lin (@linbin46984):

🚀UniWorld: a unified model that skips VAEs and uses semantic features from SigLIP! Using just 1% of BAGEL’s data, it outperforms on image editing and excels in understanding & generation. 🌟Now data, model, training & evaluation script are open-source! github.com/PKU-YuanGroup/…

Chenyang Si (@scy994):

⚡️DCM: High-Quality Video Generation Accelerator⚡️ 🚀DCM brings 10× faster inference to video diffusion models! 🚀From 1500s → 120s on models like HunyuanVideo13B. -Paper: huggingface.co/papers/2506.03… -Code: github.com/Vchitect/DCM -Project: vchitect.github.io/DCM/
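For scale, the wall-clock numbers quoted in the tweet pin down the speedup directly; this uses nothing beyond the announcement's own figures:

```python
# Numbers quoted above for HunyuanVideo13B: 1500 s baseline -> 120 s with DCM.
baseline_s = 1500
accelerated_s = 120
speedup = baseline_s / accelerated_s
print(f"{speedup:.1f}x")  # 12.5x, so the headline "10x faster" is conservative
```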

Matthias Niessner (@mattniessner):

📢Code Release of Pixel3DMM 📢 Looking for a robust and accurate face tracker? Our state-of-the-art tracker handles challenging in-the-wild settings, such as extreme lighting conditions, fast movements, and occlusions. 👨‍💻github.com/SimonGiebenhai… 🌍simongiebenhain.github.io/pixel3dmm/

Hao Zhang (@haozhang623):

🚨 New paper accepted to #ICCV2025! We introduce PhysRig – a differentiable physics-based rigging framework that brings realistic dynamics to characters 🔩🦖 💥 Soft tissues, tails, ears… now move like real flesh, not rigid plastic. #AI #Graphics #Animation #ComputerVision 👇