Sanyam Bhutani (@bhutanisanyam1)'s Twitter Profile
Sanyam Bhutani

@bhutanisanyam1

👨‍💻 Working on llama models @AIatMeta | Previously: @h2oai, @weights_biases 🎙 Podcast @ctdsshow 👨‍🎓 Fellow @fastdotai 🎲 Grandmaster @Kaggle

ID: 784597005825871876

Link: https://www.youtube.com/c/chaitimedatascience

Joined: 08-10-2016 03:31:18

8.8K Tweets

39.39K Followers

989 Following

Sanyam Bhutani (@bhutanisanyam1)

We are hiring on the PyTorch team! 🙏

Partner Engineer is a mix of building applications with real use cases, applied research, and software engineering.

I work on the llama wing but learn so much every time I speak with the PyTorch org. They are some of the smartest and most humble
Suhail (@suhail)

Playbook to defeat frontier ai labs without billions of dollars initially:

- build an app on top of their models
- solve an important, large problem for humanity
- resume training on top OSS models to reduce dependency, lower costs for certain tasks, increase performance
-

Sanyam Bhutani (@bhutanisanyam1)

Llama 4 supports 10M Context length! 🙏

Reading AN ENTIRE GitHub repo of 900k tokens and writing a guide on it takes under 3 minutes! 

We are launching two new models Scout and Maverick:

- Up to 10M context length
- Scout fits on a single H100 with int4 quant
- Up to 5 images
-
Artificial Analysis (@artificialanlys)

Llama 4 Intelligence Index Update: We have now replicated Meta’s claimed values for MMLU Pro and GPQA Diamond, pushing our Intelligence Index scores for both Scout and Maverick higher

Key update details:
➤ We noted in our first post 48 hours ago that we noticed discrepancies
Sanyam Bhutani (@bhutanisanyam1)

1.5M tokens to website in 5 minutes 🙏

- Upload an entire repo of apps
- Upload multiple sketches website
- Use the repo content to populate the template

Llama 4 supports 10M context + up to 10 images in a session:

Daniel Han (@danielhanchen)

Also note if you're not getting good Llama 4 results, there are a few bugs:

1. QK Norm eps should be 1e-5 - collabed with HF on the fix! github.com/huggingface/tr…
2. RoPE scaling for Scout changed: github.com/ggml-org/llama…
3. vLLM +2% acc shared QK norm fix: github.com/vllm-project/v…

Unsloth AI (@unslothai)

We’re excited to showcase all the amazing ways you’ve been building with Llama + Unsloth at LlamaCon 2025! 🦥🦙

Get ready for surprises and exciting announcements from Meta and us on Apr 29 in SF. 👀

Also big thanks to AI at Meta for the support and awesome merch!
Sanyam Bhutani (@bhutanisanyam1)

Super excited to launch Synthetic-Data-Kit! 🙏

Fine-tuning LLMs is easy, there are many packages to get started, Unsloth AI is my absolute favorite.

However, there is still a BIG HURDLE when working on fine-tuning: Data preparation

Today I’m super grateful to be launching a
Unsloth AI (@unslothai)

We partnered with AI at Meta on a free notebook that turns your documents into high-quality synthetic datasets using Llama!

Features:
• Parses PDFs, websites, videos
• Use Llama to generate QA pairs + auto-filter data
• Fine-tunes dataset with Llama

🔗colab.research.google.com/github/unsloth…
Sanyam Bhutani (@bhutanisanyam1)

Llama Synthetic Data Fine-Tuning Guide! 🙏

My favourite thing about this tutorial: everything is powered by a 3B model.

It covers a step overlooked everywhere: data preparation and generation for fine-tuning.

Thanks Unsloth AI team for this gem:

colab.research.google.com/github/unsloth…
Unsloth AI (@unslothai)

You can now fine-tune TTS models with Unsloth! Train, run and save models like Sesame-CSM and OpenAI's Whisper locally with our free notebooks. Unsloth makes TTS training 1.5x faster with 50% less VRAM.

GitHub: github.com/unslothai/unsl…
Docs & Notebooks: docs.unsloth.ai/basics/text-to…

Daniel Han (@danielhanchen)

We're bringing the Unsloth magic to TTS and audio models! There are multiple free Colab notebooks with free GPUs for Whisper, Sesame, Orpheus, Spark, Llasa & Oute on our docs! docs.unsloth.ai/basics/text-to…

Sanyam Bhutani (@bhutanisanyam1)

Every birthday, I recap my favourite learnings 🙏

This time, I’ve been laser focussed on simple ideas for fine-tuning and data generation.

A roadmap of 8 ideas from papers that I’ve enjoyed the most, along with our implementations:

1. How much data do you need to fine-tune?