Spark-TTS
Free
LLM-based TTS with efficient single-stream tokens.
Open-Source, Voice Cloning, LLM-Based
Overview
Spark-TTS pairs a Qwen-based LLM with BiCodec, a single-stream speech token representation, to synthesize speech efficiently and at high quality. It supports zero-shot voice cloning as well as creating brand-new voices by setting attributes such as gender, pitch, and speed. Because the LLM works on a single stream of tokens, the pipeline stays simple.
What makes Spark-TTS special?
- Focus: LLM token-based synthesis.
- Availability: Free & Open-Source.
- Use cases: Zero-shot cloning and controllable voice creation.
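The two use cases above (cloning from a reference versus creating a voice from attributes) can be pictured as two mutually exclusive request shapes. The following is a minimal sketch of that idea only. It is not the Spark-TTS API: `VoiceRequest`, its fields, and the allowed gender and level labels are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical label sets; the real model's accepted values may differ.
GENDERS = {"male", "female"}
LEVELS = {"very_low", "low", "moderate", "high", "very_high"}


@dataclass(frozen=True)
class VoiceRequest:
    """Illustrative request: either clone a reference clip or describe a new voice."""

    text: str
    reference_audio: Optional[str] = None  # path to a reference clip (cloning)
    gender: Optional[str] = None           # attributes below describe a new voice
    pitch: Optional[str] = None
    speed: Optional[str] = None

    def mode(self) -> str:
        """Return "clone" or "create", rejecting mixed or incomplete requests."""
        attrs = (self.gender, self.pitch, self.speed)
        if self.reference_audio is not None:
            if any(a is not None for a in attrs):
                raise ValueError("use a reference clip or attributes, not both")
            return "clone"
        if any(a is None for a in attrs):
            raise ValueError("voice creation needs gender, pitch, and speed")
        if self.gender not in GENDERS:
            raise ValueError(f"unknown gender: {self.gender!r}")
        if self.pitch not in LEVELS or self.speed not in LEVELS:
            raise ValueError("pitch and speed must be one of the level labels")
        return "create"


if __name__ == "__main__":
    clone = VoiceRequest(text="Hello there.", reference_audio="speaker.wav")
    create = VoiceRequest(text="Hello there.", gender="female",
                          pitch="high", speed="moderate")
    print(clone.mode(), create.mode())
```

Keeping the two modes in one validated type makes the distinction explicit: a cloning request carries only a reference clip, while a creation request carries only the attribute settings.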