Daniel Han (@danielhanchen)'s Twitter Profile
Daniel Han

@danielhanchen

Building @UnslothAI. Finetune train LLMs faster. LLMs bug hunter. OSS package github.com/unslothai/unsl…. YC S24. Prev ML at NVIDIA. Hyperlearn used by NASA.

ID: 717359704226172928

Link: https://unsloth.ai/ · Joined: 05-04-2016 14:34:16

2.2K Tweets

23.23K Followers

1.1K Following

Daniel Han (@danielhanchen)

The Mistral team at it again with Magistral!

GRPO with edits:
1. Removed KL Divergence
2. Normalize by total length (Dr. GRPO style)
3. Minibatch normalization for advantages
4. Relaxing trust region

Paper: mistral.ai/static/researc…
Docs to run Magistral: docs.unsloth.ai/basics/magistr…
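
The four changes listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Mistral's or Unsloth's implementation: the function name `grpo_loss_sketch`, the nested-list input layout, and the clipping values `eps_low=0.2` and `eps_high=0.28` are hypothetical choices made for the example. It also assumes that "minibatch normalization" means subtracting each group's mean reward and then dividing by the standard deviation taken over the whole minibatch.

```python
import math
from statistics import mean, pstdev


def grpo_loss_sketch(logp_new, logp_old, rewards, group_ids,
                     eps_low=0.2, eps_high=0.28):
    """Clipped policy-gradient loss with the four edits from the tweet.

    logp_new, logp_old: one list of per-token log-probs per completion.
    rewards:            one scalar reward per completion.
    group_ids:          which prompt (group) each completion belongs to.
    eps_low, eps_high:  hypothetical clip bounds; eps_high > eps_low
                        relaxes the upper side of the trust region.
    """
    # Group baseline: subtract the mean reward of each prompt's group.
    group_means = {
        g: mean(r for r, gid in zip(rewards, group_ids) if gid == g)
        for g in set(group_ids)
    }
    centered = [r - group_means[g] for r, g in zip(rewards, group_ids)]

    # Edit 3: scale advantages by the std over the whole minibatch.
    scale = pstdev(centered) + 1e-8
    advantages = [c / scale for c in centered]

    total, n_tokens = 0.0, 0
    for new, old, adv in zip(logp_new, logp_old, advantages):
        for lp_new, lp_old in zip(new, old):
            ratio = math.exp(lp_new - lp_old)
            # Edit 4: asymmetric clipping range [1 - eps_low, 1 + eps_high].
            clipped = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
            total += min(ratio * adv, clipped * adv)
            n_tokens += 1

    # Edit 1: no KL penalty term is added.
    # Edit 2: divide by the total token count across all completions.
    return -total / n_tokens
```

Because the loss is summed over every token and divided once by the total token count, longer completions contribute proportionally more tokens instead of being averaged per sequence first.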
Vaibhav (VB) Srivastav (@reach_vb)

Unsloth released optimised GGUFs for llama.cpp, LMStudio and Ollama as well 💥 Love the sheer execution speed of the community! 🤗 huggingface.co/unsloth/Magist…

Daniel Han (@danielhanchen)

Wow! Unsloth AI on the Nasdaq tower! 🦥 Thank you Redpoint for naming Unsloth one of the top 100 most impactful and fastest-growing infra companies in their 2025 report

And it’s all thanks to you - the community! We truly appreciate it and couldn’t have done it without you all🥰
Daniel Han (@danielhanchen)

I'll be giving a talk on the 'Future of Reinforcement Learning and Training' at AMD's 2025 Advancing AI event today! 👋

See you all at 2:25pm PT in Room 230A-C. Excited to chat and meet!
Daniel Vila Suero (@dvilasuero)

New tutorial: how to build a synthetic dataset with recent information and use it to fine tune with Unsloth AI

Check out the collab:
colab.research.google.com/drive/1JK04IBE…

Steps in the 🧵
Daniel Han (@danielhanchen)

Managed to mostly fix Mistral 3.2 tool calling for GGUF / transformers! 1. 3.2 tool calling is different from 3.1 2. timedelta(days=1) (yesterday) changed with a if-else - supports 2024 to 2028 dates - so now word for word same sys prompt! 3. Made experimental FP8 quant as well!
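
One possible reading of point 2 is that a date computed with `timedelta(days=1)` was replaced by branching logic that needs no date arithmetic. The tweet does not show the actual template, so the sketch below is only an illustration of that idea: the function names are hypothetical, and the leap-year handling is hard-coded for the 2024 to 2028 range the tweet mentions.

```python
from datetime import date, timedelta


def yesterday_timedelta(today: date) -> date:
    """Reference version: previous day via date arithmetic."""
    return today - timedelta(days=1)


def yesterday_if_else(year: int, month: int, day: int) -> tuple:
    """Previous calendar day using only comparisons.

    Valid for dates in 2024..2028 only: the leap years in that
    range (2024 and 2028) are listed explicitly.
    """
    if day > 1:
        return (year, month, day - 1)
    if month == 1:
        # January 1st rolls back to December 31st of the prior year.
        return (year - 1, 12, 31)
    prev_month = month - 1
    if prev_month == 2:
        last_day = 29 if year in (2024, 2028) else 28
    elif prev_month in (4, 6, 9, 11):
        last_day = 30
    else:
        last_day = 31
    return (year, prev_month, last_day)
```

Checking the branching version against `yesterday_timedelta` for every day in the supported range is a cheap way to confirm the two agree.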

Daniel Han (@danielhanchen)

Excited to see you all tomorrow for our Google Gemma & Unsloth developer meetup! 🦥 We'll be having @Grmcameron from Artificial Analysis and 👩‍💻 Paige Bailey & more amazing talks! Location has been updated so please check & if you need help please DM me! lu.ma/gemma-unsloth

👩‍💻 Paige Bailey (@dynamicwebpaige)

💎 Celebrating the official release of Gemma 3n with the inaugural Gemma Community meetup at Google San Francisco, cohosted with @Unsloth!

Great presentations from the Unsloth founders on agents, the Gemma team on architectural internals, and how to craft effective evals.
Daniel Han (@danielhanchen)

Huge thanks to everyone who attended our Google & Unsloth AI Gemma developer meetup yesterday! 🦥 Was amazing meeting you all & thank you to Taka Shinagawa for hosting the event with us.

Thank you to the Google speakers: 👩‍💻 Paige Bailey, Doug Reid, Mayank Chaturvedi, @GrmCameron and of
Daniel Han (@danielhanchen)

Gemma 3N quirks!

1. Vision NaNs on float16
2. Conv2D weights are large FP16 overflows to infinity
3. Large activations fixed vs Gemma 3
4. 6-7 training losses: normal for multimodal?
5. Large nums in msfa_ffn_pw_proj
6. NaNs fixed in Unsloth AI

Details: docs.unsloth.ai/basics/gemma-3…
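
Points 1 and 2 follow from the range of IEEE half precision: the largest finite float16 value is 65504, so larger magnitudes become infinity when cast, and later operations such as inf - inf produce NaN. The sketch below illustrates only that mechanism with the Python standard library. It is not the fix used in Unsloth, and the helper name `to_fp16` is hypothetical.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round-trip a Python float through IEEE half precision.

    struct refuses to pack values that overflow float16, so the
    overflow is mapped to +/-inf here to mimic a saturating-to-inf
    hardware cast (an assumption of this sketch).
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        return math.copysign(math.inf, x)


small = to_fp16(1000.0)     # representable, stays finite
big = to_fp16(70000.0)      # above 65504, overflows to infinity
poisoned = big - big        # inf - inf is NaN
```

Once a single weight or activation overflows this way, the NaN tends to spread through every later computation that touches it, which is consistent with NaNs showing up only when running in float16.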
👩‍💻 Paige Bailey (@dynamicwebpaige)

🦥 Fine-tuning with Unsloth AI now supports Gemma 3n! ✨

Friendly reminder: the Gemma 3n models can understand not just text and code, but also images, audio, video, and a whole lot more.
Daniel Han (@danielhanchen)

You can utilize our Gemma 3n multimodal and fine-tuning Kaggle notebook for any submission to the $150,000 challenge! The $10,000 is specifically for the Unsloth track - but you can submit it for the main track as well! Kaggle notebook: kaggle.com/code/danielhan…