Tuhin Srivastava (@tuhinone) 's Twitter Profile
Tuhin Srivastava

@tuhinone

Building @basetenco.

ID: 745924843

Joined: 08-08-2012 20:52:39

418 Tweets

1.1K Followers

445 Following

zhyncs (@zhyncs42) 's Twitter Profile Photo

I’ll be joining my Baseten colleague Philip Kiely at the AI Engineer World’s Fair in San Francisco, June 3–5, to introduce LLM serving with SGLang from LMSYS Org. We’d love for you to stop by and exchange ideas in person!🤗

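For a concrete taste of the talk's topic, here is a minimal sketch of LLM serving with SGLang, assuming a locally launched server on its default port and a placeholder model (Qwen/Qwen2.5-7B-Instruct is an assumption, not a model from the talk). SGLang exposes an OpenAI-compatible API, so the standard openai client can query it:

```python
# Launch the server first (shell):
#   python -m sglang.launch_server --model-path Qwen/Qwen2.5-7B-Instruct --port 30000

from openai import OpenAI

# Point the client at the local SGLang server; the API key is unused locally.
client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",  # must match the served --model-path
    messages=[{"role": "user", "content": "In one sentence, what is continuous batching?"}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```
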
Amir Haghighat (@amiruci) 's Twitter Profile Photo

Product launch with the backstory: Internally we had always said let's do *1 thing* but do it well. For us that was inference. And we said at some point we'll earn the rights to expand the surface area beyond that. That some point is today. The vast majority of our revenue…

Baseten (@basetenco) 's Twitter Profile Photo

🚀 Our "technical" marketer might not be looped in, but today is our biggest launch day yet. We're introducing two new products to serve the inference lifecycle: Model APIs and Training. Model APIs are frontier models running on the Baseten Inference Stack, purpose-built for…

Baseten (@basetenco) 's Twitter Profile Photo

We’re working with oxen.ai (moo!) to power model training with robust data management tooling. Learn more here: oxen.ai/entry/fine-tun… We're also partnering with Mixedbread to support their frontier embedding models, and Elias and Amu at Canopy Labs to deliver…

Baseten (@basetenco) 's Twitter Profile Photo

Our secret sauce? The Baseten Inference Stack. It consists of two core layers: the Inference Runtime and Inference-optimized Infrastructure. Our engineers break down all the levers we pull to optimize each layer in our new white paper.

Captions (@getcaptionsapp) 's Twitter Profile Photo

Introducing Mirage Studio. Powered by our proprietary omni-modal foundation model. Generate expressive videos at scale, with actors that actually look and feel alive. Our actors laugh, flinch, sing, rap — all of course, per your direction. Just upload an audio, describe the…

Baseten (@basetenco) 's Twitter Profile Photo

Impressed by these ultra-realistic, multilingual AI actors — a huge unlock for creative teams scaling content. Congrats to our friends at Captions on launch day!

Baseten (@basetenco) 's Twitter Profile Photo

We’re excited to partner with oxen.ai on their fine-tuning launch. It’s almost too easy — zero-code fine-tuning, from dataset to custom model in a few clicks.

Bland (@usebland) 's Twitter Profile Photo

Today we’re excited to introduce Bland TTS, the first voice AI to cross the uncanny valley. Several months ago, our team solved one-shot style transfer of human speech. That means, from a single, brief MP3, you can clone any voice or remix another clone’s style (tone, cadence,…

Baseten (@basetenco) 's Twitter Profile Photo

Our customers run AI products where every millisecond and request matter. Over the years, we found fundamental limitations in traditional deployment approaches — single points of failure, regional and cloud-specific capacity constraints, and the operational headache of managing…

Baseten (@basetenco) 's Twitter Profile Photo

So in early 2024, we launched our multi-cloud capacity management (MCM) system to address those challenges head-on. Today, it powers production workloads at companies like Writer, Abridge, Patreon, and many more. Our MCM system unlocks: ⏫Active-active routing across 10+…

Baseten (@basetenco) 's Twitter Profile Photo

Best of all, you can choose exactly where to run workloads—Baseten Cloud, Self-hosted, or Hybrid—without changing a line of code. Read our post to learn how MCM makes multi-cloud function as one elastic GPU pool → baseten.co/blog/how-baset…
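
Since the thread above describes active-active routing only at a high level, here is a deliberately simplified sketch of the idea (a toy in plain Python with hypothetical endpoints, not Baseten's MCM implementation): every region serves live traffic at once, and each request is weighted toward healthy, lightly loaded endpoints rather than failing over from a single primary.

```python
import random

# Hypothetical regional endpoints; in a real system, health and load would
# come from live health checks and telemetry, not static values.
ENDPOINTS = [
    {"url": "https://us-east.example.com", "healthy": True, "load": 0.42},
    {"url": "https://eu-west.example.com", "healthy": True, "load": 0.18},
    {"url": "https://ap-south.example.com", "healthy": False, "load": 0.00},
]

def pick_endpoint(endpoints: list[dict]) -> dict:
    """Route a request: all healthy regions are active, weighted by spare capacity."""
    healthy = [e for e in endpoints if e["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy endpoints in any region")
    spare_capacity = [1.0 - e["load"] for e in healthy]
    return random.choices(healthy, weights=spare_capacity, k=1)[0]

print(pick_endpoint(ENDPOINTS)["url"])  # usually eu-west, the least loaded region
```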

Redpoint (@redpoint) 's Twitter Profile Photo

The Redpoint InfraRed 100 is now live! This list honors 100 infrastructure innovators who are transforming how businesses scale, secure, and succeed. Check out this year's honorees and dive deeper with the dynamic list and our complete InfraRed Report linked below.

Amir Haghighat (@amiruci) 's Twitter Profile Photo

"Where are your GPUs?" I get this question on sales calls. The answer is 10 different public clouds in 40+ regions. The hard part wasn't acquiring compute; it was using them dynamically to scale a single model across the world. It took us time to build, but the gains are worth

"Where are your GPUs?"

I get this question on sales calls. The answer is 10 different public clouds in 40+ regions. The hard part wasn't acquiring compute; it was using them dynamically to scale a single model across the world. It took us time to build, but the gains are worth
Baseten (@basetenco) 's Twitter Profile Photo

We're excited to introduce the Baseten Performance Client, a new open-source Python library for up to 12x higher throughput for high-volume embedding tasks! Stand up a new vector database, preprocess text, and run massive workloads in <2 minutes (vs. 15+ with AsyncOpenAI).

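As a rough sketch of the kind of high-volume embedding workload being sped up, here is the AsyncOpenAI baseline from that comparison (the endpoint URL, API key, and model name are placeholders; the Performance Client's own API may differ from this):

```python
import asyncio

from openai import AsyncOpenAI

# Placeholder Baseten-style endpoint and model; substitute real values.
client = AsyncOpenAI(
    base_url="https://model-xxxxxx.api.baseten.co/v1",
    api_key="YOUR_API_KEY",
)

async def embed_chunk(texts: list[str]) -> list[list[float]]:
    """Embed one batch of documents."""
    response = await client.embeddings.create(model="my-embedding-model", input=texts)
    return [item.embedding for item in response.data]

async def main() -> None:
    docs = [f"document {i}" for i in range(1_000)]
    chunks = [docs[i : i + 100] for i in range(0, len(docs), 100)]
    # Fan the batches out concurrently; throughput here is bounded by the
    # client's connection handling, which is the bottleneck the Performance
    # Client is built to remove.
    results = await asyncio.gather(*(embed_chunk(chunk) for chunk in chunks))
    embeddings = [e for chunk in results for e in chunk]
    print(f"embedded {len(embeddings)} documents")

asyncio.run(main())
```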