Quentin Anthony (@quentinanthon15) Twitter Tweets • TwiCopy

Quentin Anthony

@quentinanthon15

I make models more efficient.
Google Scholar: scholar.google.com/citations?user…

ID: 1141487623803830272

Link: https://quentin-anthony.github.io/

Joined: 19-06-2019 23:27:12

253 Tweets

1.1K Followers

240 Following

Quentin Anthony

@quentinanthon15

10 months ago

Zyphra speaks!
- Two 1.6B TTS Models (Transformer and SSM Hybrid)
- Voice cloning
- Optimized and cheap API
- Apache 2.0, open-weights

I could wax poetic about why I think Zonos is great, but listen for yourself.

Likes: 42, Replies: 2, Retweets: 7

Quentin Anthony

@quentinanthon15

10 months ago

Solid work

Likes: 9, Replies: 0, Retweets: 0

Daniel Vega-Myhre

@vega_myhre

9 months ago

For any ML folks who want to deepen their understanding of ML scalability & performance techniques, I wrote an illustrated deep-dive into Megatron-style tensor parallelism: danielvegamyhre.github.io/ml/performance… any feedback is welcome!

Likes: 32, Replies: 1, Retweets: 9

Zyphra

@zyphraai

9 months ago

Today we’re releasing a highly requested feature in our Playground — Multi Voice. Powered by Zonos, it lets you assign different voices to different parts of text, generating one seamless audio clip. Bring expressive Voice AI to all of your content or just have some fun.

Likes: 26, Replies: 1, Retweets: 8

Zyphra

@zyphraai

8 months ago

Zyphra is releasing our first reasoning model, ZR1-1.5B. This small but powerful reasoning model excels at both math and code, making it one of the best models in these categories for its size. It also uses 60% fewer reasoning tokens than comparable models. 🆓 Apache 2.0 license.

Likes: 501, Replies: 15, Retweets: 65

Daria Soboleva

@dmsobol

8 months ago

Original MoE vision (Jacobs, 1991): experts should COMPETE, not cooperate. Yet modern LLMs ignore this, treating experts as interchangeable compute chunks. With hundreds of experts in trillion-parameter model scales, are we just creating massive redundancy? 1/n

Likes: 29, Replies: 3, Retweets: 4

Daniel Vega-Myhre

@vega_myhre

7 months ago

Just wrote an illustrated deep-dive into overlapping the compute and comms in TP+SP using Async TP. My eyeballs hurt now so hopefully somebody finds it useful :) danielvegamyhre.github.io/ml/performance…

Likes: 151, Replies: 3, Retweets: 25

Zyphra

@zyphraai

6 months ago

Zyphra is expanding! Join our growing team in Palo Alto. We have multiple roles open across multimodal foundation models, RL, product, and infrastructure. Check them out here: jobs.ashbyhq.com/zyphra

Likes: 19, Replies: 3, Retweets: 5