sayshrey (@sayshrey) 's Twitter Profile
sayshrey

@sayshrey

ID: 1759127811833393153

Joined: 18-02-2024 08:09:08

78 Tweets

79 Followers

266 Following

Trung Vu (@trungthvu) 's Twitter Profile Photo

Was fun discussing our recent work at Bespoke Labs on reasoning distillation on Latent.Space! swyx and Alessio Fanelli asked very insightful questions that allowed us to dive into the technical details. youtu.be/jrf76uNs77k?si…

Mahesh Sathiamoorthy (@madiator) 's Twitter Profile Photo

We are announcing Open Thoughts, our large-scale open-source effort to curate the best open reasoning datasets!

DeepSeek-R1 is amazing but we still don't have access to high-quality open reasoning datasets. These datasets are crucial if you want to build your reasoning models!
Maxime Labonne (@maximelabonne) 's Twitter Profile Photo

OpenThoughts-114k is another great one distilled from R1, with data generation code and evals. 

We get a bit more diversity and samples. Verification based on categories is perfect, but it could be even more robust with

It might be the best open-source reasoning dataset to
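
For anyone who wants to poke at the dataset, a minimal loading sketch (the Hugging Face hub ID is an assumption inferred from the dataset's name, not stated in the post):

```python
from datasets import load_dataset

# Assumed hub ID for the OpenThoughts-114k dataset discussed above.
ds = load_dataset("open-thoughts/OpenThoughts-114k", split="train")

print(ds)            # row count and column names
print(ds[0].keys())  # inspect the instruction / reasoning-trace fields
```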
Bespoke Labs (@bespokelabsai) 's Twitter Profile Photo

We are rolling out an easy batch mode with Curator so you can get your tokens now at 50% of the cost, just by adding `batch=True`.

No need for complex logic that involves creating and uploading files, polling, mapping responses back to requests etc!

As a promo, please check out
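
A minimal sketch of what the single-flag batch mode looks like in practice (model name and prompt are illustrative; the `curator.LLM` call shape is assumed from Curator's documented usage, not from this post):

```python
from bespokelabs import curator

# batch=True routes requests through the provider's batch API at roughly
# half the per-token cost, processed asynchronously. Curator handles file
# creation/upload, polling, and mapping responses back to requests.
llm = curator.LLM(model_name="gpt-4o-mini", batch=True)

response = llm("Summarize reasoning distillation in two sentences.")
print(response.to_pandas())  # inspect the generated rows
```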
Negin Raoof (@neginraoof_) 's Twitter Profile Photo

Announcing OpenThinker-32B: the best open-data reasoning model distilled from DeepSeek-R1.
Our results show that large, carefully curated datasets with verified R1 annotations produce SoTA reasoning models. Our 32B model outperforms all 32B models including
Dimitris Papailiopoulos (@dimitrispapail) 's Twitter Profile Photo

OK THIS IS REALLY COOL

Claude 3.7 Sonnet please draw in tikz 
- a human that is inside a house
- the house is inscribed in a sphere
- the sphere is inscribed in a cube
- the cube is inscribed in a cylinder
- the cylinder is inscribed in a pyramid

0 vs 10k vs 30k vs 64k
Mahesh Sathiamoorthy (@madiator) 's Twitter Profile Photo

What if you can ask an LLM to plan the script for a 3Blue1Brown-like video and actually generate it? Or execute complex agentic tasks when generating data?

We are launching code execution capability in Curator! This is quite useful in many scenarios:
1. Creating synthetic code

Alex Dimakis (@alexgdimakis) 's Twitter Profile Photo

gpt-4.5, just announced, is 15 times more expensive than 4o and 250 times more expensive than 4o-mini.
This is a good reason why companies need to train their own bespoke models.
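
A quick sanity check of those multipliers against per-million-output-token list prices (the dollar figures below are an assumption, not taken from the post):

```python
# Assumed per-1M-output-token list prices at announcement time:
# GPT-4.5 $150, GPT-4o $10, GPT-4o-mini $0.60.
gpt45, gpt4o, gpt4o_mini = 150.0, 10.0, 0.60

print(gpt45 / gpt4o)       # 15.0  -> "15 times more expensive than 4o"
print(gpt45 / gpt4o_mini)  # 250.0 -> "250 times more expensive than 4o-mini"
```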
Mahesh Sathiamoorthy (@madiator) 's Twitter Profile Photo

I wrote an exhaustive article on what DeepSeek and reasoning mean for X, where X is just about anything I could think of. Covers a lot of ground!

Link below!
Bespoke Labs (@bespokelabsai) 's Twitter Profile Photo

LLM API providers, including Google, offer ~50% discounts through batch mode, which processes large requests asynchronously. However, the Gemini batch API is notoriously tricky due to the many steps involved and scattered documentation.
Curator makes it easy. No manual polling, no file
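
The same single-flag pattern as the earlier batch example, pointed at a Gemini model (the model identifier and call shape are assumptions for illustration):

```python
from bespokelabs import curator

# batch=True sends requests through the discounted batch API; Curator
# abstracts the Gemini-specific file handling and polling steps.
# The model ID below is illustrative.
llm = curator.LLM(model_name="gemini-1.5-flash", batch=True)

response = llm("Label the sentiment of this review: 'great product, slow shipping'")
print(response.to_pandas())
```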
Etash Guha @ ICLR (@etash_guha) 's Twitter Profile Photo

Turns out, it’s possible to outperform DeepSeekR1-32B with only SFT on open data and no RL: Announcing OpenThinker2-32B and OpenThinker2-7B. We also release the data, OpenThoughts2-1M, curated by selecting quality instructions from diverse sources. 🧵 (1/n)
Rohan Jha (@robro612) 's Twitter Profile Photo

Luca Soldaini: Haven't personally used it, but I saw Alex Dimakis announce Curator. Has a lot of features one would want to scale as well. github.com/bespokelabsai/…

Bespoke Labs (@bespokelabsai) 's Twitter Profile Photo

Announcing Reasoning Datasets Competition 📢 in collaboration with Hugging Face and Together AI
Since the launch of DeepSeek-R1 this January, we’ve seen an explosion of reasoning-focused datasets: OpenThoughts-114k, OpenCodeReasoning, codeforces-cot, and more.
Bespoke Labs (@bespokelabsai) 's Twitter Profile Photo

OpenAI’s o4 just showed that multi-turn tool use is a huge deal for AI agents.
Today, we show how to do the same with your own agents, using RL and open-source models.

We used GRPO on only 100 high quality questions from the BFCL benchmark, and post-trained a 7B Qwen model to
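
For a sense of the recipe, a generic GRPO post-training sketch using Hugging Face TRL's GRPOTrainer; this is not Bespoke Labs' actual pipeline, and the dataset file, reward function, and hyperparameters are illustrative assumptions:

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Assumption: ~100 high-quality tool-use prompts (e.g. derived from BFCL),
# stored locally with a "prompt" column; the filename is hypothetical.
train_dataset = load_dataset("json", data_files="bfcl_subset.jsonl", split="train")

def tool_call_reward(completions, **kwargs):
    # Toy reward: +1 if the completion contains a well-formed tool-call
    # marker, else 0. A real setup would parse and execute the calls.
    return [1.0 if "<tool_call>" in c else 0.0 for c in completions]

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-7B-Instruct",  # a 7B Qwen model, as in the post
    reward_funcs=tool_call_reward,
    args=GRPOConfig(output_dir="qwen7b-grpo-tooluse", num_generations=8),
    train_dataset=train_dataset,
)
trainer.train()
```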
Bespoke Labs (@bespokelabsai) 's Twitter Profile Photo

Announcing Bespoke-MiniChart-7B, a new SOTA in chart understanding for models of comparable size on seven benchmarks, on par with Gemini-1.5-Pro and Claude-3.5! 🚀 Beyond its real-world applications, chart understanding is a good challenging problem for VLMs, since it requires

Liyan Tang (@liyantang4) 's Twitter Profile Photo

Check out my work at Bespoke Labs!

We release Bespoke-MiniChart-7B, a new SOTA in chart understanding of its size.

Chart understanding is really fun and challenging, and requires reasoning skills beyond math reasoning. It's a great starting point for open chart model development!