The world’s fastest open-source TTS: Supertonic

Posted by ANLGBOY@reddit | LocalLLaMA

Demo https://huggingface.co/spaces/Supertone/supertonic#interactive-demo

Code https://github.com/supertone-inc/supertonic

Hello!

I want to share Supertonic, a newly open-sourced TTS engine that focuses on extreme speed, lightweight deployment, and real-world text understanding.

It’s available in 8+ programming languages, including C++, C#, Java, JavaScript, Rust, Go, Swift, and Python, so you can plug it in almost anywhere: native apps, browsers, and embedded/edge devices.

Technical highlights:

(1) Lightning-speed — Real-time factor:

0.001 on RTX4090

0.006 on M4 Pro

(2) Ultra lightweight — 66M parameters

(3) On-device TTS — Complete privacy and zero network latency

(4) Advanced text understanding — Handles complex, real-world inputs naturally

(5) Flexible deployment — Works in browsers, mobile apps, and small edge devices
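
To put the numbers in (1) in perspective, here is a small sketch of what a real-time factor means in practice. It assumes the common definition (synthesis time divided by the duration of the generated audio), which the post itself does not spell out. The function names are just illustrative and are not part of the Supertonic API.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF under the assumed definition: time spent synthesizing / audio length."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def synthesis_seconds_for(audio_seconds: float, rtf: float) -> float:
    """Estimated compute time needed to generate audio_seconds of speech."""
    return audio_seconds * rtf


# RTF values quoted in the post; the 60-second clip length is an arbitrary example.
quoted_rtf = {"RTX4090": 0.001, "M4 Pro": 0.006}

for device, rtf in quoted_rtf.items():
    ms = synthesis_seconds_for(60.0, rtf) * 1000
    print(f"{device}: about {ms:.0f} ms to synthesize 60 s of audio")
```

Under that definition, an RTF below 1 means faster than real time, so an RTF of 0.001 works out to roughly 1 ms of compute per second of speech.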

Regarding (4), one of my favorite test sentences is: 

He spent 10,000 JPY to buy tickets for a JYP concert.

Here, “JPY” refers to Japanese yen, while “JYP” refers to a name — Supertonic handles the difference seamlessly.

Hope it's useful for you!