Daniel Han (@danielhanchen) Twitter Tweets • TwiCopy

Daniel Han

@danielhanchen


Building @UnslothAI. Finetune train LLMs faster. LLMs bug hunter. OSS package github.com/unslothai/unsl…. YC S24. Prev ML at NVIDIA. Hyperlearn used by NASA.

ID: 717359704226172928

Link: https://unsloth.ai/ · Joined: 05-04-2016 14:34:16

2,2K Tweet

23,23K Followers

1,1K Following

Daniel Han

@danielhanchen

5 months ago

The Mistral team at it again with Magistral! GRPO with edits:
1. Removed KL Divergence
2. Normalize by total length (Dr. GRPO style)
3. Minibatch normalization for advantages
4. Relaxing trust region
Paper: mistral.ai/static/researc…
Docs to run Magistral: docs.unsloth.ai/basics/magistr…

Likes: 679 · Replies: 9 · Reposts: 99
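The four GRPO edits listed in the Magistral tweet above can be sketched in a few lines of plain Python. This is an illustrative reading of the list, not Mistral's or Unsloth's implementation: the function names, the clip bounds `eps_low=0.2` and `eps_high=0.28`, and the toy inputs are all assumptions made for the example.

```python
import math

def group_centered_advantages(rewards):
    """Advantage of each completion = its reward minus the mean of its group."""
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]

def minibatch_normalize(advantages, eps=1e-8):
    """Edit 3: rescale advantages by the mean and std of the whole minibatch."""
    mean = sum(advantages) / len(advantages)
    var = sum((a - mean) ** 2 for a in advantages) / len(advantages)
    return [(a - mean) / (math.sqrt(var) + eps) for a in advantages]

def clipped_token_loss(ratio, advantage, eps_low=0.2, eps_high=0.28):
    """Edit 4: PPO-style clipping with a looser upper bound (values are hypothetical)."""
    clipped = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
    return -min(ratio * advantage, clipped * advantage)

def grpo_loss(token_ratios, advantages, eps_low=0.2, eps_high=0.28):
    """Edit 2: divide the summed token losses by the total token count.

    Edit 1: no KL penalty term is added anywhere in this loss.
    token_ratios holds one list of per-token probability ratios per completion.
    """
    total_tokens = sum(len(seq) for seq in token_ratios)
    total = 0.0
    for seq, adv in zip(token_ratios, advantages):
        for ratio in seq:
            total += clipped_token_loss(ratio, adv, eps_low, eps_high)
    return total / total_tokens

# Toy usage: four completions for one prompt, with made-up rewards and ratios.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = minibatch_normalize(group_centered_advantages(rewards))
ratios = [[1.0, 1.1], [0.9], [1.5, 1.0, 1.0], [1.3]]
print(grpo_loss(ratios, advantages))
```

Dividing by the total token count, rather than averaging per sequence first, gives every token the same weight regardless of how long its completion is.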

Vaibhav (VB) Srivastav

@reach_vb

5 months ago

Unsloth released optimised GGUFs for llama.cpp, LMStudio and Ollama as well 💥 Love the sheer execution speed of the community! 🤗 huggingface.co/unsloth/Magist…

Likes: 236 · Replies: 2 · Reposts: 36

Daniel Han

@danielhanchen

5 months ago

Wow! Unsloth AI on the Nasdaq tower! 🦥 Thank you Redpoint for naming Unsloth one of the top 100 most impactful and fastest-growing infra companies in their 2025 report.

And it’s all thanks to you - the community! We truly appreciate it and couldn’t have done it without you all 🥰

Likes: 244 · Replies: 16 · Reposts: 9

Daniel Han

@danielhanchen

5 months ago

Get 2x faster for reward model serving and sequence classification inference through Unsloth AI! Nice benchmarks Kyle!

Likes: 82 · Replies: 1 · Reposts: 11

Daniel Han

@danielhanchen

5 months ago

I'll be giving a talk on the 'Future of Reinforcement Learning and Training' at AMD's 2025 Advancing AI event today! 👋

See you all at 2:25pm PT in Room 230A-C. Excited to chat and meet!

Likes: 83 · Replies: 2 · Reposts: 5

Chris Lattner

@clattner_llvm

5 months ago

This is just me unapologetically nerd crushing on the Unsloth AI duo, legendary developers with a shared goal of democratizing AI compute:

Likes: 690 · Replies: 17 · Reposts: 23

Daniel Vila Suero

@dvilasuero

4 months ago

New tutorial: how to build a synthetic dataset with recent information and use it to fine-tune with Unsloth AI

Check out the Colab:
colab.research.google.com/drive/1JK04IBE…

Steps in the 🧵

Likes: 30 · Replies: 3 · Reposts: 6

Daniel Han

@danielhanchen

4 months ago

Managed to mostly fix Mistral 3.2 tool calling for GGUF / transformers!
1. 3.2 tool calling is different from 3.1
2. timedelta(days=1) (yesterday) changed with an if-else - supports 2024 to 2028 dates - so now word for word same sys prompt!
3. Made experimental FP8 quant as well!

Likes: 76 · Replies: 6 · Reposts: 6
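The tweet above does not show the actual template change, so the following is only a guess at the idea: computing yesterday's date with plain if-else branches instead of `timedelta(days=1)`, restricted to the 2024 to 2028 range it mentions. It is written in Python for illustration; the function name `yesterday_if_else` is hypothetical, and the real fix may look quite different.

```python
from datetime import date, timedelta

def yesterday_if_else(year, month, day):
    """Return (year, month, day) of the previous day using only branches.

    Assumption: inputs fall in 2024..2028, so the leap-year test can be
    the simple `year % 4 == 0` (no century years occur in that range).
    """
    if not 2024 <= year <= 2028:
        raise ValueError("only 2024 to 2028 are supported in this sketch")
    if day > 1:
        return year, month, day - 1
    if month == 1:
        # January 1st rolls back to December 31st of the previous year.
        return year - 1, 12, 31
    prev_month = month - 1
    if prev_month == 2:
        last_day = 29 if year % 4 == 0 else 28
    elif prev_month in (4, 6, 9, 11):
        last_day = 30
    else:
        last_day = 31
    return year, prev_month, last_day

# Cross-check against timedelta for one date in the supported range.
today = date(2024, 3, 1)
reference = today - timedelta(days=1)
print(yesterday_if_else(today.year, today.month, today.day),
      (reference.year, reference.month, reference.day))
```

Branching like this is useful when the prompt has to be rendered somewhere that date arithmetic is not available, which is one plausible reason for the change.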

Daniel Han

@danielhanchen

4 months ago

We need r/LocalLlama back :( Hopefully a good neutral moderator takes the reins asap!

Likes: 187 · Replies: 25 · Reposts: 8

Daniel Han

@danielhanchen

4 months ago

r/LocalLlama is back!! reddit.com/r/LocalLLaMA/c…

Likes: 84 · Replies: 5 · Reposts: 5

Daniel Han

@danielhanchen

4 months ago

Excited to see you all tomorrow for our Google Gemma & Unsloth developer meetup! 🦥 We'll be having @Grmcameron from Artificial Analysis and 👩‍💻 Paige Bailey & more amazing talks! Location has been updated so please check & if you need help please DM me! lu.ma/gemma-unsloth

Likes: 25 · Replies: 0 · Reposts: 3

👩‍💻 Paige Bailey

@dynamicwebpaige

4 months ago

💎 Celebrating the official release of Gemma 3n with the inaugural Gemma Community meetup at Google San Francisco, cohosted with @Unsloth!

Great presentations from the Unsloth founders on agents, the Gemma team on architectural internals, and how to craft effective evals.

Likes: 71 · Replies: 8 · Reposts: 3

Daniel Han

@danielhanchen

4 months ago

Huge thanks to everyone who attended our Google & Unsloth AI Gemma developer meetup yesterday! 🦥 Was amazing meeting you all & thank you to Taka Shinagawa for hosting the event with us.

Thank you to the Google speakers: 👩‍💻 Paige Bailey, Doug Reid, Mayank Chaturvedi, @GrmCameron and of

Likes: 84 · Replies: 3 · Reposts: 6

Daniel Han

@danielhanchen

4 months ago

Gemma 3N quirks!
1. Vision NaNs on float16
2. Conv2D weights are large: FP16 overflows to infinity
3. Large activations fixed vs Gemma 3
4. 6-7 training losses: normal for multimodal?
5. Large nums in msfa_ffn_pw_proj
6. NaNs fixed in Unsloth AI
Details: docs.unsloth.ai/basics/gemma-3…

Likes: 297 · Replies: 9 · Reposts: 33
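The float16 overflow behind the first two quirks in the tweet above can be reproduced with the standard library alone. This is a minimal sketch, not Gemma 3n or Unsloth code: `to_fp16` and `fp16_dot` are hypothetical helpers that emulate float16 rounding, and the toy weights are chosen only so that each product exceeds the float16 maximum of 65504.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value

def to_fp16(x):
    """Round a Python float to float16, saturating to +/-inf on overflow."""
    if math.isnan(x) or math.isinf(x):
        return x
    try:
        # The "e" format is IEEE 754 binary16; pack then unpack rounds the value.
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses values that are too large; hardware casts give infinity.
        return math.copysign(math.inf, x)

def fp16_dot(weights, activations):
    """Dot product where every product and partial sum is rounded to float16."""
    acc = 0.0
    for w, x in zip(weights, activations):
        prod = to_fp16(to_fp16(w) * to_fp16(x))
        acc = to_fp16(acc + prod)
    return acc

weights = [300.0, -300.0]
activations = [300.0, 300.0]

# In higher precision the two terms cancel exactly.
print(sum(w * x for w, x in zip(weights, activations)))
# In emulated float16 each product overflows, and inf + (-inf) is NaN.
print(fp16_dot(weights, activations))
```

Real kernels often accumulate in float32 even when inputs are float16, so this pure-float16 accumulation is a simplifying assumption; the point is only that one overflowed intermediate is enough to turn a finite result into NaN.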

👩‍💻 Paige Bailey

@dynamicwebpaige

4 months ago

🦥 Fine-tuning with Unsloth AI now supports Gemma 3n! ✨

Friendly reminder: the Gemma 3n models can understand not just text and code, but also images, audio, video, and a whole lot more.

Likes: 95 · Replies: 3 · Reposts: 15

Daniel Han

@danielhanchen

4 months ago

You can utilize our Gemma 3n multimodal and fine-tuning Kaggle notebook for any submission to the $150,000 challenge! The $10,000 is specifically for the Unsloth track - but you can submit it for the main track as well! Kaggle notebook: kaggle.com/code/danielhan…

Likes: 85 · Replies: 2 · Reposts: 13