Arnav Garg (@grg_arnav)'s Twitter Profile
Arnav Garg

@grg_arnav

Leading ML @Predibase | Previously @Atlassian @Tesla @UCLA | Co-Founder of @DataresUcla

ID: 819976407941926912

Joined: 13-01-2017 18:36:24

162 Tweets

128 Followers

262 Following

Predibase (@predibase)

šŸš€ #RFT vs. #SFT: When to Use Each for Maximum Impact

#DeepSeek-R1 made #Reinforcement #FineTuning (RFT) the hot new thing—but is it better than #Supervised Fine-Tuning (SFT)? šŸ¤”

Here’s when RFT wins:
āœ… No labeled data? If you can verify correctness, RFT works.
āœ… <100
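
For illustration, here is a minimal sketch of the kind of programmatic reward function that makes RFT viable without labels: correctness is verified by code rather than by a labeled target. The arithmetic prompt format and the `reward` signature are assumptions for this example, not Predibase's actual API.

```python
import re

def reward(prompt: str, completion: str) -> float:
    """Score a sampled completion without labeled data.

    Hypothetical example: prompts look like "What is 17 + 25?", so
    correctness can be verified programmatically instead of via labels.
    """
    a, b = map(int, re.findall(r"-?\d+", prompt)[:2])
    numbers = re.findall(r"-?\d+", completion)
    return 1.0 if numbers and int(numbers[-1]) == a + b else 0.0

# A correct completion earns full reward; an incorrect one earns none.
print(reward("What is 17 + 25?", "17 + 25 = 42"))      # 1.0
print(reward("What is 17 + 25?", "The answer is 41"))  # 0.0
```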
Predibase (@predibase)

Today we're thrilled to announce the first end-to-end platform for Reinforcement Fine-Tuning. With just a dozen labeled data points, you can outperform #OpenAI o1 and #DeepSeekR1 on complex tasks. Built on the #GRPO methodology that DeepSeek-R1 popularized, our platform delivers

Arnav Garg (@grg_arnav)

šŸš€ Launching Reinforcement Fine-Tuning (RFT) at Predibase - the first platform to fine-tune LLMs with just a few prompts & reward functions. No massive datasets needed: GPT-4o- and o1-beating performance, made simple.

Saam Motamedi (@saammotamedi)

Huge release from @Predibase today -- the first end-to-end platform for Reinforcement Fine-Tuning. Bringing the techniques that power DeepSeek-R1 to any open source model and data

Piero Molino (@w4nderlus7)

Fine-tuning is great for adapting LLMs to specific tasks, but what if you don’t have much data? Starting today, you can use the world’s first end-to-end Reinforcement Fine-Tuning (RFT) Platform within Predibase and train models with zero data!

We’ve enhanced GRPO, the
Sebastian Raschka (@rasbt)

As we all know by now, reasoning models often generate longer responses, which raises compute costs. Now, this new paper (arxiv.org/abs/2504.05185) shows that this behavior comes from the RL training process, not from an actual need for long answers for better accuracy. The RL
Predibase (@predibase)

🐳 AI teams are testing DeepSeek—but nobody agrees on when to use it

In our recent survey of 500+ AI professionals, DeepSeek-R1 is getting serious attention—but it's far from mainstream. Here’s what we uncovered:

šŸ“Š 57% of teams have experimented with DeepSeek-R1
āš ļø Only 3%
Predibase (@predibase)

šŸš€ Serve and fine-tune #Qwen3 — in your cloud or ours with blazing fast #inference speeds! No need to share your data. šŸš€

Qwen 3 is the latest #opensource LLM dominating the leaderboards. Don't get left behind! 

Now you can serve and customize the latest Qwen models instantly
Andrew Ng (@andrewyng)

New Course: Reinforcement Fine-Tuning LLMs with GRPO! Learn to use reinforcement learning to improve your LLM performance in this short course, built in collaboration with @Predibase, and taught by Travis Addair, its Co-Founder and CTO, and Arnav Garg, its Senior Engineer and

Travis Addair (@travisaddair)

It was an honor getting to work together with the DeepLearning.ai team and my colleague Arnav Garg on this course covering all things Reinforcement Fine-Tuning and GRPO. Similar to our last course on efficient LLM inference, we wanted to really drill into the intuition

Arnav Garg (@grg_arnav)

I had a blast working with the DeepLearning.AI team and my colleague Travis Addair over the last few months to put this course together on Reinforcement Fine-Tuning with GRPO! We’ve tried to make this course as practical as possible and help you build intuition. Hope you enjoy!
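
For readers new to GRPO, here is a minimal sketch of its central idea, group-relative advantages: the rewards for a group of completions sampled from the same prompt are normalized against that group's own mean and standard deviation, so no learned value model (critic) is needed. This NumPy snippet is an illustrative assumption, not code from the course or from Predibase.

```python
import numpy as np

def group_relative_advantages(rewards: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Advantage of each completion relative to its own sampling group.

    GRPO scores several completions per prompt with a reward function, then
    centers and scales those rewards within the group; no critic is trained.
    """
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Example: four completions for one prompt, scored 1.0 (correct) or 0.0 (wrong).
rewards = np.array([1.0, 0.0, 0.0, 1.0])
print(group_relative_advantages(rewards))  # correct answers get positive advantage
```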

Predibase (@predibase)

šŸš€ Fresh off our hit DeepLearning.AI course on RFT + #GRPO, we’re going live!

šŸŽ™ļø Let’s Talk Tokens: Live #AMA on Reinforcement Fine-Tuning with the Experts Who Built the Definitive Course!

#RFT isn’t just research any more—it’s driving real-world GenAI with tighter feedback
Predibase (@predibase)

Big news! We will be joining @RubrikInc to accelerate agentic AI adoption from pilot to production at scale! āš”ļø Together, we can deliver radical simplicity in models and data. This is an exciting next step in our journey. More from Dev Rishi here: pbase.ai/45yUL2O

Predibase (@predibase)

🧠 Join the 10k developers supercharging their #LLM skills with Reinforcement Fine-tuning—and it's free! 🧠

Reinforcement Fine-Tuning (#RFT) and #GRPO are fast becoming popular techniques to teach LLMs how to reason. 

We teamed up with DeepLearning.AI to build the definitive