Brandon Yang (@bclyang) Twitter Tweets • TwiCopy

Cartesia

6 months ago

We’re building for developers scaling voice AI. Introducing two features to make collaboration and visibility easier: 🤝 Organizations and 📊 Dashboards. 🤝 The Organizations feature gives teams shared access to API keys, custom voices, and billing–all under one account.

Likes: 12 · Replies: 1 · Reposts: 3

Cartesia

@cartesia_ai

6 months ago

Headed to Customer Contact Week in Vegas next week? Come find us on the floor! We’re building the next generation of real-time Voice AI–faster, more flexible, and ready for the enterprise–and we’d love to meet you. Swing by our booth to see what we’ve been working on, chat with two of

Likes: 7 · Replies: 1 · Reposts: 1

Sabri Eyuboglu

@eyuboglusabri

6 months ago

When we put lots of text (eg a code repo) into LLM context, cost soars b/c of the KV cache’s size. What if we trained a smaller KV cache for our documents offline? Using a test-time training recipe we call self-study, we find that this can reduce cache memory on avg 39x

Likes: 287 · Replies: 12 · Reposts: 66

Cartesia

@cartesia_ai

6 months ago

Building voice agents? Meet Ink-Whisper: the fastest, most affordable streaming speech-to-text model. 🌎 Optimized for accuracy in real-world conditions 👯 Pair with our Sonic text-to-speech → fastest duo in voice AI 🔌 Plugs into Vapi, Pipecat AI, LiveKit. Read more:

Likes: 85 · Replies: 3 · Reposts: 23

kwindla

@kwindla

6 months ago

Talk to Cartesia speech-to-text about Cartesia speech-to-text. Cartesia launched a streaming STT model today, called Ink-Whisper, that's optimized for realtime voice AI. Pipecat AI has launch-day support for this new model, so I figured I'd talk to the model about itself.

Likes: 212 · Replies: 6 · Reposts: 31

Shayne

@shayneparlo

5 months ago

Ink-Whisper is fast! Cartesia released a new STT model yesterday, and it's as fast as you'd expect. Streamed transcription finishes in <100ms—before you can say the next sentence. I used it to build a live teleprompter that follows along with what you're saying. Code in 🧵

Likes: 43 · Replies: 6 · Reposts: 14

Cartesia

@cartesia_ai

5 months ago

👑 We’re #1! Sonic-2 leads @Labelbox’s Speech Generation Leaderboard topping out in speech quality, word error rate, and naturalness. Build your real-time voice apps with the 🥇 best voice AI model. ➡️ labelbox.com/leaderboards/s…

Likes: 31 · Replies: 0 · Reposts: 8

Cartesia

@cartesia_ai

5 months ago

Thanks to Sapphire Ventures for hosting our co-founder Brandon Yang at the Hypergrowth Engineering Summit. He shared that voice is the next UX frontier and building voice agents requires excellence at every layer. Great to see so many technical leaders leaned in! #SapphireHypergrowthEng

Likes: 15 · Replies: 0 · Reposts: 3

Ricardo Buitrago

@rbuit_

5 months ago

Despite theoretically handling long contexts, existing recurrent models still fall short: they may fail to generalize past the training length. We show a simple and general fix which enables length generalization in up to 256k sequences, with no need to change the architectures!

Likes: 183 · Replies: 4 · Reposts: 32

Albert Gu

@_albertgu

5 months ago

I converted one of my favorite talks I've given over the past year into a blog post. "On the Tradeoffs of SSMs and Transformers" (or: tokens are bullshit) In a few days, we'll release what I believe is the next major advance for architectures.

Likes: 516 · Replies: 19 · Reposts: 72

Sukjun (June) Hwang

@sukjun_hwang

5 months ago

Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data

Likes: 2.2K · Replies: 58 · Reposts: 355

Cartesia

@cartesia_ai

5 months ago

We're excited to announce a new research release from the Cartesia team, as part of a long-term collaboration to advance deep learning architectures. We've always believed that model architectures remain a fundamental bottleneck in building truly intelligent systems. H-Nets are

Likes: 324 · Replies: 5 · Reposts: 43

micro1

@micro1_ai

3 months ago

The micro1 research team analyzed more than 300,000 AI-led interviews to assess the performance of different combinations of speech-to-text engines, large language models, and text-to-speech systems, now including Cartesia’s latest TTS engine. Each stack was evaluated for

Likes: 25 · Replies: 1 · Reposts: 4

Brandon Yang

@bclyang

3 months ago

We designed Line with everything we learned about how best-in-class AI agents are built.
- Use code. The best AI experiences are too complex to be built any other way.
- Build iteratively with evals. Ultimately evals define the capabilities of any AI product (agents or models)
-

Likes: 35 · Replies: 2 · Reposts: 4

Cartesia

@cartesia_ai

2 months ago

Sales reps are constantly on the move. How can they get critical deal insights without breaking their stride? The answer is voice AI. We've partnered with Rox to bring our real-time voice AI to their Command AI chat assistant. With it, reps can get pre-meeting briefs and

Likes: 18 · Replies: 2 · Reposts: 4

Cartesia

@cartesia_ai

2 months ago

Earlier this week, we hosted our YC Voice Agents event with an incredible lineup of panelists:
Hassaan Raza at Tavus (YC21)
Anthony Krivonos at Toma (YCW24)
Arkadiy Telegin at Leaping AI (YCW25)
Max Child 🌐 at Volley (YC18)

YC companies were among Sonic’s earliest

Likes: 22 · Replies: 2 · Reposts: 7

Sarah Chieng

@sarahchieng

2 months ago

Build a real-time AI Voice Agent that can actually sell. In less than 30 minutes, you'll learn how to build a sophisticated real-time voice sales agent that can have natural conversations with potential customers. Includes a full code notebook, recorded tutorial, and docs The

Likes: 433 · Replies: 20 · Reposts: 51

Brandon Yang

@bclyang

2 months ago

I'll be speaking on a panel on Frontier Speech Models at Vapicon and will be hanging out this afternoon! Come say hi!

Likes: 6 · Replies: 0 · Reposts: 0