zach 🏔🎶 (@zqevans) Twitter Tweets • TwiCopy

RoyalCities

a year ago

I’ve officially released the first finetuned Stable Audio Open sample generator on HF. With this release I’ve also updated the Gradio to take full advantage of the model. So let’s dive in, see what I’ve done and figure out where we may be headed. [Long Thread🧵]

104 likes · 6 replies · 19 reposts

Jordi Pons

@jordiponsdotme

a year ago

ICML in Vienna is coming to a close! 🇦🇹 Here are the top-10 general (and audio) trends from ICML 2024. A thread 🧵 1. Open vs. Closed AI: The debate was very present, notable in Soumith Chintala's keynote or by the release of Llama 3.1 (among others). icml.cc/virtual/2024/p…

37 likes · 1 reply · 7 reposts

Yoach

@yoachlacombe

a year ago

🎵 Stable Audio Open 🎵 just landed into diffusers, be ready to get: -> a whole lot of fun 🎹🎺🎷🥁🎼 🎸 -> easy installation: `pip install diffusers` -> easy usage: 5 lines

166 likes · 4 replies · 31 reposts
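The "5 lines" from the post above are not reproduced in this feed. As a rough sketch of what such usage can look like, the function below uses the `StableAudioPipeline` class from diffusers. Several things here are assumptions rather than part of the post: the `stabilityai/stable-audio-open-1.0` checkpoint (which requires accepting its license on Hugging Face), a CUDA GPU, the extra `torch` and `soundfile` packages, and the helper name `generate_clip` with its defaults, which are hypothetical.

```python
def generate_clip(prompt, seconds=10.0, steps=200, seed=0,
                  repo_id="stabilityai/stable-audio-open-1.0",
                  out_path="clip.wav"):
    """Generate one audio clip from a text prompt and write it to a WAV file."""
    # Heavy imports live inside the function so defining it stays cheap.
    import torch
    import soundfile as sf
    from diffusers import StableAudioPipeline

    # Assumption: a CUDA device is available and the checkpoint is accessible.
    pipe = StableAudioPipeline.from_pretrained(repo_id, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")

    # A seeded generator makes the sampled clip reproducible.
    generator = torch.Generator("cuda").manual_seed(seed)
    audio = pipe(
        prompt,
        num_inference_steps=steps,
        audio_end_in_s=seconds,
        generator=generator,
    ).audios

    # audios is (batch, channels, samples); soundfile expects (samples, channels).
    waveform = audio[0].T.float().cpu().numpy()
    sf.write(out_path, waveform, pipe.vae.sampling_rate)
    return out_path
```

Calling `generate_clip("128 BPM tech house drum loop")` would download the weights on first use and write `clip.wav` to the working directory.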

RoyalCities

@royalcities

a year ago

Flipping a Stable Audio AI Sample into glitchy liquid texture.

18 likes · 2 replies · 1 repost

zach 🏔🎶

@zqevans

a year ago

This paper title really undersells the fact that this only works because the CLAP model *didn’t* use audio-only data.

23 likes · 1 reply · 1 repost

zach 🏔🎶

@zqevans

a year ago

call me Kermit the way I'm bout to muP it

2 likes · 0 replies · 0 reposts

zach 🏔🎶

@zqevans

a year ago

decrescendos >>>>>>>

2 likes · 0 replies · 0 reposts

Mat Dryhurst

@matdryhurst

a year ago

AI training can and should be beautiful

129 likes · 6 replies · 9 reposts

zach 🏔🎶

@zqevans

a year ago

Getting the Skrillex meal at McDonald’s (a nice Sprite)

10 likes · 1 reply · 0 reposts

zach 🏔🎶

@zqevans

10 months ago

I've been trying this out today, it's incredible. This is the Flash Attention moment for contrastive model training. Super easy to use the code, it handles all the distributed stuff for you. This should unlock a lot of new research! Great work!

18 likes · 0 replies · 2 reposts

dadabots

@dadabots

9 months ago

Prompt Jockey (n) - it's DJing but harder. With DJing you're playing usually other peoples' tracks. But with this, the tracks don't even exist yet. You're prompting them live with a neural network. lyra bubbles~ ♪❀ zach 🏔🎶 encanti Mr. Bill

159 likes · 12 replies · 32 reposts

RoyalCities

@royalcities

9 months ago

📢ATTN Producers & Musicians📢 Today is the release of the most capable FREE open source sample generator tailored for EDM production. It's the largest model yet with HIGH musicality & VERY robust AI Style transfer. ft. a link to get started with it RIGHT NOW😎 Let's dive in! 👇

66 likes · 7 replies · 11 reposts

zach 🏔🎶

@zqevans

8 months ago

Do you want to do cutting edge research on generative music production tools? Do you want to publish papers and also release open weights and code? Apply to be a research intern on our team!

19 likes · 2 replies · 4 reposts

Stability AI

@stabilityai

5 months ago

What if you could turn everyday sounds into songs? Our Audio Researcher CJ Carr shows you how with Stable Audio. You can now access Stable Audio via the Stability AI API — or, as always, at StableAudio.com. Learn more: bit.ly/4iI0V4q

85 likes · 7 replies · 96 reposts

zach 🏔🎶

@zqevans

4 months ago

Super excited to launch Stable Audio Open Small today! It was great working with Arm on this model to make sure it runs efficiently on Arm CPUs. I'm also now an Arm Ambassador! I'm looking forward to helping the Arm developer community integrate this new model.

31 likes · 0 replies · 6 reposts

Zachary Novack @ICLR2025 🇸🇬

@zacknovack

4 months ago

Releasing Stable Audio Open Small! 75ms GPU latency! 7s *mobile* CPU latency! How? w/Adversarial Relativistic Contrastive (ARC) Post-Training! 📘:arxiv.org/abs/2505.08175 🥁:arc-text2audio.github.io/web/ 🤗:huggingface.co/stabilityai/st… Here’s how we made the fastest TTA out there🧵

84 likes · 2 replies · 14 reposts

lyra bubbles~ ❀

@_lyraaaa_

4 months ago

got stable audio open small training in <12gb VRAM at batch size 8 & default sample size. everyone with 16-24gb cards who wanted to locally tune SAO 1.0 but couldn't (27.6gb vram) should be very happy now

28 likes · 2 replies · 3 reposts

lyra bubbles~ ❀

@_lyraaaa_

4 months ago

sample gen models =/= song gen models. you can't compare the two, they do very different things

10 likes · 0 replies · 2 reposts

Nate Raw

@_nateraw

3 months ago

Landed a feature in stable audio tools that should make it easier to fine-tune your own custom text to music models - especially if you're GPU poor. Pre-encoding latents ahead of time reduces GPU memory + helps keep your GPU hot🔥 Documentation here: github.com/Stability-AI/s…

53 likes · 3 replies · 6 reposts
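The idea in the post above, encoding audio to latents once and training from the cached latents, can be illustrated with a toy sketch. This is not the stable-audio-tools implementation or its API: `toy_encode`, `pre_encode`, and `load_latents` are hypothetical names, and the "encoder" is a stand-in that averages every few samples. The point is only that the encoder runs once per clip, so it does not need to sit in GPU memory during training.

```python
import json
from pathlib import Path


def toy_encode(samples, hop=4):
    """Hypothetical stand-in for a frozen autoencoder: mean of each hop-sized chunk."""
    chunks = [samples[i:i + hop] for i in range(0, len(samples), hop)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


def pre_encode(dataset, cache_dir, encode=toy_encode):
    """Encode every clip once and cache the latents on disk."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, samples in dataset.items():
        path = cache_dir / f"{name}.json"
        if not path.exists():  # skip clips that were already encoded
            path.write_text(json.dumps(encode(samples)))
        paths.append(path)
    return paths


def load_latents(paths):
    """Training-time loader: reads cached latents, never calls the encoder."""
    for path in paths:
        yield json.loads(path.read_text())
```

A training loop would then iterate over `load_latents(paths)` instead of raw audio, which is where the memory saving described in the post would come from.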

Nate Raw

@_nateraw

3 months ago

“Y’all can have your vibe coding, I’m doing vibe science” - zach 🏔🎶

3 likes · 0 replies · 2 reposts