Fine-tuning TTS models

May 2, 2025 • By Daniel & Michael

You can now train Text-to-Speech (TTS) models in Unsloth! Fine-tuning TTS models enables them to adapt to your specific dataset, use case, or desired vocal style and tone. The goal is to customize these models for tasks such as voice cloning, speaking style adaptation, tone modulation, multilingual support, and other specialized applications.

This support also covers Speech-to-Text (STT) models like OpenAI's Whisper, TTS models like Sesame's CSM and Orpheus, and other models supported by transformers (e.g. CrisperWhisper, Spark, OuteTTS and more).
Training is ~1.5x faster and uses 50% less VRAM than other setups with Flash Attention 2 (FA2).
We’ve made notebooks to train, run, and save these models for free on Google Colab. Some models aren’t supported by llama.cpp and will be saved only as safetensors, but others should work. See all our TTS (including CSM-1B, Whisper etc.) notebooks here.
Read our detailed guide on How to Fine-tune TTS models here.

🔊TTS Fine-tuning

The training process is similar to normal SFT fine-tuning, but the dataset includes audio clips with transcripts. We use a dataset called ‘Elise’ that embeds emotion tags like <sigh> or <laughs> into transcripts, triggering expressive audio that matches the emotion.
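To make the transcript format concrete, here is a minimal sketch of how emotion tags could be pulled out of a transcript line. It assumes tags are single lowercase words in angle brackets, as in <sigh> and <laughs>; the row layout, the file path, and the helper names are hypothetical and not taken from the Elise dataset itself.

```python
import re

# Assumption: an emotion tag is one lowercase word in angle brackets, e.g. <sigh>.
TAG_PATTERN = re.compile(r"<([a-z]+)>")


def extract_emotion_tags(transcript: str) -> list[str]:
    """Return the emotion tags in the order they appear in the transcript."""
    return TAG_PATTERN.findall(transcript)


def strip_emotion_tags(transcript: str) -> str:
    """Return the transcript with tags removed and whitespace collapsed."""
    without_tags = TAG_PATTERN.sub(" ", transcript)
    return " ".join(without_tags.split())


# Hypothetical dataset row: one audio clip paired with its tagged transcript.
example_row = {
    "audio_path": "clips/0001.wav",  # illustrative path, not a real file
    "text": "I can't believe it <sigh> but fine, you win <laughs>",
}

tags = extract_emotion_tags(example_row["text"])
plain_text = strip_emotion_tags(example_row["text"])
print(tags)
print(plain_text)
```

During training the tags stay inside the transcript so the model can learn to associate them with the matching sounds. Stripping them, as in the second helper, is only useful for inspecting or validating the plain text.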

We recommend starting with Orpheus-TTS-3B because it uses a Llama-based architecture, which makes it easy to train and compatible with many existing tools. Since TTS models are usually small, you can train them with 16-bit LoRA or go with full fine-tuning (FFT). Loading a 16-bit LoRA model is simple.
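As a rough sketch of what loading for 16-bit LoRA can look like with Unsloth's FastLanguageModel: the checkpoint name, sequence length, and LoRA hyperparameters below are illustrative assumptions, not recommended settings, so check the notebooks for the values that match your model. The import sits inside the function so the settings can be inspected on a machine without a GPU.

```python
# Assumed checkpoint name on Hugging Face; swap in the TTS model you want to train.
MODEL_NAME = "unsloth/orpheus-3b-0.1-ft"

# Arguments for loading the base model (values are illustrative assumptions).
LOAD_KWARGS = {
    "model_name": MODEL_NAME,
    "max_seq_length": 2048,  # illustrative; pick what fits your audio clips
    "dtype": None,           # None lets Unsloth choose a dtype for the GPU
    "load_in_4bit": False,   # False keeps 16-bit weights (LoRA); True would be 4-bit QLoRA
}

# Arguments for attaching LoRA adapters (illustrative hyperparameters).
LORA_KWARGS = {
    "r": 64,
    "lora_alpha": 64,
    "lora_dropout": 0,
    "bias": "none",
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}


def load_lora_model():
    """Load the base model in 16-bit and wrap it with LoRA adapters."""
    # Imported lazily: unsloth needs a supported GPU environment to import.
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(**LOAD_KWARGS)
    model = FastLanguageModel.get_peft_model(model, **LORA_KWARGS)
    return model, tokenizer
```

Calling load_lora_model() on a GPU machine returns the adapter-wrapped model and its tokenizer. Setting load_in_4bit to True would trade some precision for lower memory use.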

We've uploaded most of the TTS models (quantized and original) to Hugging Face here.

Ensure you use the latest pip version of Unsloth. To update, use:
pip install --upgrade --force-reinstall --no-cache-dir unsloth unsloth_zoo
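After updating, one way to confirm which versions are installed is to query the package metadata from Python. This is a small sketch using only the standard library; the helper name is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the versions of the two packages named in the update command.
for name in ("unsloth", "unsloth_zoo"):
    print(name, installed_version(name) or "not installed")
```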

💕 Thank you!

A huge thank you to Etherll for helping us out with the TTS implementation. And of course, thank you to everyone for using & sharing Unsloth - we really appreciate it. 🙏

As always, be sure to join our Reddit page and Discord server for help or just to show your support! You can also follow us on Twitter and join our newsletter.

Thank you for reading!

Daniel & Michael Han 🦥
15 May 2025
