Aleix Conchillo Flaqué (@aconchillo) Twitter Tweets • TwiCopy

anant

@anant_world

3 months ago

introducing vibe editing: a voice agent built into your document editor, making docs feel like convos. voice as the new keyboard


We wrote down everything we've learned building voice AI agents over the past two years. Core technology choices, minimizing latency, managing multimodal context, interruption handling, turn detection, evals, state machines, guardrails, memory, async and realtime function


kwindla

@kwindla

3 months ago

Vibe editing! (Vibe writing?) (Vibe essaying?) I love this demo of voice-driven interactive writing.
➡️ Gemini Multimodal Live API for the speech-to-speech interactions.
➡️ Pipecat Cloud for the voice AI infrastructure, agent orchestration, and WebRTC audio transport.


kwindla

@kwindla

3 months ago

[ Hoisting this out of another thread here on X ... ] I'm having a lot of conversations lately about voice agent components and, relatedly, cost. Lots of people are new to voice AI and are exploring the options! You can build voice agents using a "full stack" platform. Vapi


kwindla

@kwindla

3 months ago

Serverless WebRTC for Voice AI Introducing the new `SmallWebRTCTransport` in Pipecat 0.0.63 ... 🧵


kwindla

@kwindla

3 months ago

Can you beat my 1-929-LLM-GAME high score? We've been exploring what you can do with speech-to-speech models. Here's a word guessing game, built with the Gemini Multimodal Live API, Vercel, and Twilio, that has a bunch of interesting features ... 🧵


👩‍💻 Paige Bailey

@dynamicwebpaige

3 months ago

😍 I've been so impressed with how Pipecat AI is pushing the limits with Gemini's multimodal live mode!


kwindla

@kwindla

3 months ago

Announcing: Voice AI course and online community ...
swyx and I are hosting a month-long technical deep dive into Voice AI and Voice Agents. Our goals are to:
➡️ cover all the lessons we've learned over the last two years building realtime, conversational AI,
➡️ host fun


kwindla

@kwindla

3 months ago

Voice agents + MCP ... When I watched this code walk-through from Laserdisc Librarian, I thought "wait, why didn't she edit out the LLM making those mistakes at the beginning ... oh I get it, good demo!" Vanessa shows how to use multiple MCP servers via the new `MCPClient` class in


Aleix Conchillo Flaqué

@aconchillo

3 months ago

Did you know that the "cat" in Pipecat doesn't actually refer to a cat? I think it's a very easy one... but does anyone know what it could be referring to? Pipecat AI


Pipecat AI

@pipecat_ai

3 months ago

ᓚᘏᗢ // 0.0.67


kwindla

@kwindla

2 months ago

Looking forward to doing a workshop at AI Engineer World's Fair on Tuesday: Building Voice Agents with Gemini and Pipecat. 10:40am in the workshop salon. Shrestha Basu Mallick and Philipp Schmid from Google, and Mark Backman and Aleix Conchillo Flaqué who work on Pipecat, will be there with voice


Daily

@trydaily

2 months ago

10:40 workshop AI Engineer Building voice agents with Gemini + Pipecat AI 🛠️ Come code with Mark, VP Product and Pipecat maintainers Aleix Conchillo Flaqué and Varun Singh


kwindla

@kwindla

2 months ago

Full house for the Gemini x Pipecat hands-on workshop at AI Engineer World’s Fair. Link to repo Mark created as a starter kit in 🧵


kwindla

@kwindla

2 months ago

This is my periodic appreciation post about the amazing Krisp noise reduction models. Voice AI working perfectly in a very noisy environment. You can use the Krisp models free in Daily’s voice AI hosting platform, Pipecat Cloud.


kwindla

@kwindla

a month ago

Talk to Cartesia speech-to-text about Cartesia speech-to-text. Cartesia launched a streaming STT model today, called Ink-Whisper, that's optimized for realtime voice AI. Pipecat AI has launch-day support for this new model, so I figured I'd talk to the model about itself.


Aleix Conchillo Flaqué

@aconchillo

a month ago

During my college years, my friends and I started a demoscene group (Anaconda). Our second demo was called The Requiem (youtube.com/watch?v=eQLp4V…). The other day I woke up to a surprise in our group chat, an AI-glitched version (sound on)... goosebumps. We miss you, chochiwig!


Aleix Conchillo Flaqué

@aconchillo

23 days ago

This is too much fun! Here's an improved version with initial Pipecat (RTVI) messaging support.


kwindla

@kwindla

7 days ago

Smart Turn v2: open source, native audio turn detection in 14 languages. New checkpoint of the open source, open data, open training code, semantic VAD model on Hugging Face, fal, and Pipecat AI.
- 3x faster inference (12ms on an L40)
- 14 languages (13 more than v1, which


kwindla

@kwindla

3 days ago

You don't need a WebRTC server for voice agents. If you're deploying your own voice AI infrastructure, you should almost certainly be using the new(†) serverless WebRTC approach. Serverless is much simpler, which translates to faster development, better scaling, and higher
