Fine-tune & Run Qwen3

Apr 28, 2025 • By Daniel & Michael

Qwen's new Qwen3 models deliver state-of-the-art advancements in reasoning, instruction-following, agent capabilities, and multilingual support.

All Qwen3 uploads use our new Unsloth Dynamic 2.0 methodology, delivering the best performance on 5-shot MMLU and KL Divergence benchmarks. This means you can run and fine-tune quantized Qwen3 LLMs with minimal accuracy loss!
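If you want to try one of the Dynamic 2.0 GGUFs locally, a minimal sketch with llama-cpp-python might look like the following. The repo id and quant filename pattern are assumptions, so check the exact names on our Hugging Face page:

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Minimal sketch: pull a Dynamic 2.0 GGUF from Hugging Face and chat with it.
# The repo id and filename pattern are assumptions; check the actual upload
# names on the Hugging Face page.
llm = Llama.from_pretrained(
    repo_id="unsloth/Qwen3-8B-GGUF",
    filename="*UD-Q4_K_XL*",  # dynamic 4-bit quant
    n_ctx=8192,               # context window for this session
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what YaRN does in one sentence."}]
)
print(out["choices"][0]["message"]["content"])
```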

We also uploaded Qwen3 with native 128K context length. Qwen achieves this by using YaRN to extend its original 40K window to 128K.
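For checkpoints that don't already ship with the extended window, enabling YaRN in a transformers config looks roughly like the sketch below. The scaling factor and base window follow Qwen's published YaRN recipe, but treat the exact values and the model id as assumptions and match them to your checkpoint:

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Rough sketch of enabling YaRN rope scaling for long-context inference.
# The model id and scaling values are assumptions; choose a factor so that
# factor * original_max_position_embeddings covers your target context length.
config = AutoConfig.from_pretrained("unsloth/Qwen3-8B")
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,                              # ~4x the base window
    "original_max_position_embeddings": 32768,  # base window assumed by the recipe
}
model = AutoModelForCausalLM.from_pretrained("unsloth/Qwen3-8B", config=config)
```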
  • Fine-tune Qwen3 (8B) for free using our Colab notebook (a minimal setup sketch follows this list)
  • Unsloth makes Qwen3 (8B) fine-tuning 4x faster, uses 70% less VRAM, and supports context lengths 10x longer than all environments with Flash Attention 2.
  • We uploaded all versions of Qwen3, including Dynamic 2.0 GGUFs, dynamic 4-bit and more on Hugging Face here.
  • Read our guide on How to correctly Run Qwen3 here.
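To make the free-notebook bullet concrete, here is a minimal sketch of a 4-bit QLoRA setup with Unsloth; the model id and hyperparameters are illustrative rather than the exact notebook settings:

```python
from unsloth import FastLanguageModel

# Sketch of a 4-bit QLoRA setup for Qwen3 (8B); model id and hyperparameters
# are illustrative, not the exact notebook recipe.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-8B",
    max_seq_length=2048,
    load_in_4bit=True,   # QLoRA: 4-bit base weights + LoRA adapters
)
model = FastLanguageModel.get_peft_model(
    model,
    r=32,                                   # LoRA rank (matches the benchmark below)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=32,
    use_gradient_checkpointing="unsloth",   # helps fit longer contexts in VRAM
)
```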
Unsloth now also supports EVERYTHING* including: full fine-tuning, 8-bit, pretraining, ALL transformer-style models (Mixtral, MoE, Cohere, etc.) and ANY training algorithm, such as GRPO with VLMs.

Also, a big thank you to the Qwen team for collaborating with and supporting us!
✨Qwen3 Details

Performance benchmarks

Model     | VRAM | 🦥 Unsloth speed | 🦥 VRAM reduction | 🦥 Longer context | 🤗 Hugging Face + FA2
Qwen3-14B | 24GB | 3x               | >70%              | 10x longer        | 1x
We tested using the Alpaca Dataset, a batch size of 2, gradient accumulation steps of 4, rank = 32, and applied QLoRA on all linear layers (q, k, v, o, gate, up, down).
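For reference, the training loop itself was a standard TRL SFT run; a sketch of those settings, assuming the model and tokenizer from the setup above plus an Alpaca dataset already formatted into a "text" column, looks like this:

```python
from trl import SFTTrainer, SFTConfig

# Sketch of a benchmark-style run; assumes `model`, `tokenizer`, and a
# `dataset` with an Alpaca-formatted "text" column already exist.
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,   # batch size of 2
        gradient_accumulation_steps=4,   # gradient accumulation steps of 4
        max_steps=60,                    # illustrative, not the benchmark length
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```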
💕 Thank you! 
A huge thank you to the Qwen team for their support and everyone for using & sharing Unsloth - we really appreciate it. 🙏

As always, be sure to join our Reddit page and Discord server for help or just to show your support! You can also follow us on Twitter and join our newsletter.
Thank you for reading!
Daniel & Michael Han 🦥
28 Apr 2025

Fine-tune Qwen for free now!

Join Our Discord