sayshrey (@sayshrey) 's Twitter Profile
sayshrey

@sayshrey

ID: 1759127811833393153

Joined: 18-02-2024 08:09:08

78 Tweets

79 Followers

266 Following

Trung Vu (@trungthvu) 's Twitter Profile Photo

Was fun discussing our recent work at Bespoke Labs on reasoning distillation on Latent.Space! swyx and Alessio Fanelli asked very insightful questions that allowed us to dive into the technical details. youtu.be/jrf76uNs77k?si…

Mahesh Sathiamoorthy (@madiator) 's Twitter Profile Photo

We are announcing Open Thoughts, our large-scale open-source effort to curate the best open reasoning datasets!

DeepSeek-R1 is amazing but we still don't have access to high-quality open reasoning datasets. These datasets are crucial if you want to build your reasoning models!
Maxime Labonne (@maximelabonne) 's Twitter Profile Photo

OpenThoughts-114k is another great one distilled from R1, with data generation code and evals. 

We get a bit more diversity and samples. Verification based on categories is perfect, but it could be even more robust with

It might be the best open-source reasoning dataset to
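
For anyone who wants to poke at the dataset, a minimal loading sketch (the Hugging Face hub ID is an assumption inferred from the dataset's name, not stated in the post):

```python
from datasets import load_dataset

# Assumed hub ID for the OpenThoughts-114k dataset discussed above.
ds = load_dataset("open-thoughts/OpenThoughts-114k", split="train")

print(ds)            # row count and column names
print(ds[0].keys())  # inspect the instruction / reasoning-trace fields
```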
Bespoke Labs (@bespokelabsai) 's Twitter Profile Photo

We are rolling out an easy batch mode with Curator so you can get your tokens now at 50% of the cost, just by adding `batch=True`.

No need for complex logic that involves creating and uploading files, polling, mapping responses back to requests etc!

As a promo, please check out
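
A minimal sketch of what the single-flag batch mode looks like in practice (model name and prompt are illustrative; the `curator.LLM` call shape is assumed from Curator's documented usage, not from this post):

```python
from bespokelabs import curator

# batch=True routes requests through the provider's batch API at roughly
# half the per-token cost, processed asynchronously. Curator handles file
# creation/upload, polling, and mapping responses back to requests.
llm = curator.LLM(model_name="gpt-4o-mini", batch=True)

response = llm("Summarize reasoning distillation in two sentences.")
print(response.to_pandas())  # inspect the generated rows
```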
Negin Raoof (@neginraoof_) 's Twitter Profile Photo

Announcing OpenThinker-32B: the best open-data reasoning model distilled from DeepSeek-R1.
Our results show that large, carefully curated datasets with verified R1 annotations produce SoTA reasoning models. Our 32B model outperforms all 32B models including
Dimitris Papailiopoulos (@dimitrispapail) 's Twitter Profile Photo

OK THIS IS REALLY COOL

Claude 3.7 Sonnet please draw in tikz 
- a human that is inside a house
- the house is inscribed in a sphere
- the sphere is inscribed in a cube
- the cube is inscribed in a cylinder
- the cylinder is inscribed in a pyramid

0 vs 10k vs 30k vs 64k
Mahesh Sathiamoorthy (@madiator) 's Twitter Profile Photo

What if you can ask an LLM to plan the script for a 3Blue1Brown-like video and actually generate it? Or execute complex agentic tasks when generating data?

We are launching code execution capability in Curator! This is quite useful in many scenarios:
1. Creating synthetic code

Alex Dimakis (@alexgdimakis) 's Twitter Profile Photo

gpt-4.5, just announced, is 15 times more expensive than 4o and 250 times more expensive than 4o-mini.
This is a good reason why companies need to train their own bespoke models.
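
A quick sanity check of those multipliers against per-million-output-token list prices (the dollar figures below are an assumption, not taken from the post):

```python
# Assumed per-1M-output-token list prices at announcement time:
# GPT-4.5 $150, GPT-4o $10, GPT-4o-mini $0.60.
gpt45, gpt4o, gpt4o_mini = 150.0, 10.0, 0.60

print(gpt45 / gpt4o)       # 15.0  -> "15 times more expensive than 4o"
print(gpt45 / gpt4o_mini)  # 250.0 -> "250 times more expensive than 4o-mini"
```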
Mahesh Sathiamoorthy (@madiator) 's Twitter Profile Photo

I wrote an exhaustive article on what DeepSeek and reasoning mean for X, where X is just about anything I could think of. Covers a lot of ground!

Link below!
Bespoke Labs (@bespokelabsai) 's Twitter Profile Photo

LLM API providers, including Google, offer ~50% discounts through batch mode, which processes large requests asynchronously. However, the Gemini batch API is notoriously tricky due to the many steps involved and scattered documentation.
Curator makes it easy. No manual polling, no file
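
The same single-flag pattern as the earlier batch example, pointed at a Gemini model (the model identifier and call shape are assumptions for illustration):

```python
from bespokelabs import curator

# batch=True sends requests through the discounted batch API; Curator
# abstracts the Gemini-specific file handling and polling steps.
# The model ID below is illustrative.
llm = curator.LLM(model_name="gemini-1.5-flash", batch=True)

response = llm("Label the sentiment of this review: 'great product, slow shipping'")
print(response.to_pandas())
```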
Etash Guha @ ICLR (@etash_guha) 's Twitter Profile Photo

Turns out, it’s possible to outperform DeepSeekR1-32B with only SFT on open data and no RL: Announcing OpenThinker2-32B and OpenThinker2-7B. We also release the data, OpenThoughts2-1M, curated by selecting quality instructions from diverse sources. 🧵 (1/n)
Rohan Jha (@robro612) 's Twitter Profile Photo

Luca Soldaini: Haven't personally used it, but I saw Alex Dimakis announce Curator. Has a lot of features one would want to scale as well. github.com/bespokelabsai/…

Bespoke Labs (@bespokelabsai) 's Twitter Profile Photo

Announcing Reasoning Datasets Competition 📢 in collaboration with Hugging Face and Together AI
Since the launch of DeepSeek-R1 this January, we’ve seen an explosion of reasoning-focused datasets: OpenThoughts-114k, OpenCodeReasoning, codeforces-cot, and more.
Bespoke Labs (@bespokelabsai) 's Twitter Profile Photo

OpenAI’s o4 just showed that multi-turn tool use is a huge deal for AI agents.
Today, we show how to do the same with your own agents, using RL and open-source models.

We used GRPO on only 100 high quality questions from the BFCL benchmark, and post-trained a 7B Qwen model to
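
For a sense of the recipe, a generic GRPO post-training sketch using Hugging Face TRL's GRPOTrainer; this is not Bespoke Labs' actual pipeline, and the dataset file, reward function, and hyperparameters are illustrative assumptions:

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Assumption: ~100 high-quality tool-use prompts (e.g. derived from BFCL),
# stored locally with a "prompt" column; the filename is hypothetical.
train_dataset = load_dataset("json", data_files="bfcl_subset.jsonl", split="train")

def tool_call_reward(completions, **kwargs):
    # Toy reward: +1 if the completion contains a well-formed tool-call
    # marker, else 0. A real setup would parse and execute the calls.
    return [1.0 if "<tool_call>" in c else 0.0 for c in completions]

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-7B-Instruct",  # a 7B Qwen model, as in the post
    reward_funcs=tool_call_reward,
    args=GRPOConfig(output_dir="qwen7b-grpo-tooluse", num_generations=8),
    train_dataset=train_dataset,
)
trainer.train()
```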
Bespoke Labs (@bespokelabsai) 's Twitter Profile Photo

Announcing Bespoke-MiniChart-7B, a new SOTA in chart understanding for models of comparable size on seven benchmarks, on par with Gemini-1.5-Pro and Claude-3.5! 🚀 Beyond its real-world applications, chart understanding is a good challenging problem for VLMs, since it requires

Liyan Tang (@liyantang4) 's Twitter Profile Photo

Check out my work at Bespoke Labs!

We release Bespoke-MiniChart-7B, a new SOTA in chart understanding of its size.

Chart understanding is really fun and challenging, and requires reasoning skills beyond math reasoning. It's a great starting point for open chart model development!