Best open source realtime tts?
Posted by Sudonymously@reddit | LocalLLaMA | 39 comments
Hey y'all, what is the best open source TTS that is super fast? I'm looking to replace ElevenLabs in my workflow because it's too expensive.
Individual_Math_8254@reddit
have you tried kitten tts?
gijdillaxfason@reddit
Dubvoice.ai stable and best voices all languages.
Any_Cut9536@reddit
What TTS or real-time model is similar to the 2022 anon voice videos?
GenAI-Evangelist@reddit
Best leaderboard
https://huggingface.co/spaces/Pendrokar/TTS-Spaces-Arena
bsenftner@reddit
Why are there so many "leaderboards"? This entire space is getting overrun with scam artists extremely fast.
g14loops@reddit
kokoro
Osama_Saba@reddit
How VRAM it much?
pigeon57434@reddit
kokoro is like 82M parameters, you could run it on your toaster
BasicBelch@reddit
challenge accepted
pingwin@reddit
I run https://github.com/remsky/Kokoro-FastAPI at home, it usually eats around 2.5G VRAM
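For anyone wanting to try this setup: Kokoro-FastAPI advertises an OpenAI-compatible speech endpoint. A minimal sketch of calling it is below; the `/v1/audio/speech` path, the default port 8880, and the `af_bella` voice name are assumptions based on the repo's README, so check the docs for your version.

```python
# Hypothetical sketch of calling a locally running Kokoro-FastAPI server.
# Endpoint path, port, and voice name are assumptions; verify against the repo.
import json
import urllib.request

def build_payload(text, voice="af_bella"):
    """Build the OpenAI-style speech request body."""
    return {
        "model": "kokoro",
        "input": text,
        "voice": voice,
        "response_format": "wav",
    }

def synthesize(text, url="http://localhost:8880/v1/audio/speech"):
    """POST text to the server and return the raw WAV bytes."""
    data = json.dumps(build_payload(text)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# audio = synthesize("Hello from local TTS")  # requires the server running
```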
Osama_Saba@reddit
Nooooooooo really????? So it doesn't fit with qwen 14 ffs iguana at your face
CommunityTough1@reddit
There's actually a version that runs 100% locally, in your browser. It even works on mobile. The model is tiny (only 82 million parameters), so running it entirely in the browser isn't a big deal.
sherlockAI@reddit
Here's a batch implementation of Kokoro for interested folks. We wanted to run it on-device, but it should help in any deployment. It takes about 400MB of RAM with the int8 quantized version. Honestly, I don't see much difference between fp32 and int8.
https://www.nimbleedge.com/blog/how-to-run-kokoro-tts-model-on-device
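For context on why the numbers above are so small, here's a back-of-envelope estimate of the weight memory alone for an 82M-parameter model at different precisions. Runtime buffers, activations, and the inference runtime itself account for the gap up to the ~400MB RAM figure quoted above; the parameter count is from the thread, everything else is rough arithmetic.

```python
# Rough weight-memory estimate for Kokoro's ~82M parameters.
# Actual process RAM is higher due to activations and runtime overhead.
PARAMS = 82_000_000

def weight_mb(params, bytes_per_param):
    """Weight storage in megabytes at a given precision."""
    return params * bytes_per_param / 1e6

fp32_mb = weight_mb(PARAMS, 4)  # 4 bytes/param at fp32
int8_mb = weight_mb(PARAMS, 1)  # 1 byte/param when int8-quantized
```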
GrayPsyche@reddit
can you train voices for it
g14loops@reddit
No, they didn't publish their training code.
plurch@reddit
Here are some other repos in the same neighborhood as kokoro
Osama_Saba@reddit
How does it vrams?
NAKOOT@reddit
IndexTTS, even works with 6GB VRAM and it's really easy to use.
Original_Finding2212@reddit
We ported KokoroTTS to jetson-containers and it takes a few hundred MB of RAM... I think 300-600?
But you need one that supports streaming or working in small chunks. There are other, bigger models with better voices.
YearnMar10@reddit
On Jetson it takes me 3 GB once everything is loaded… which container are you using?
Original_Finding2212@reddit
Use the jetson-containers repo (disclaimer: I joined as a maintainer there). It completely changes how we work on Jetson.
It supports old models as well!
YearnMar10@reddit
I started up the PyTorch container and loaded Kokoro in there. Docker stats show the container using 250 MB, but with top I see about 3 GB more RAM in use as soon as it's fired up and being used. I'll investigate a bit more.
nrkishere@reddit
Kokoro
Osama_Saba@reddit
Describe the VRAM of it
LewisTheScot@reddit
Bro's been talking to so many LLMs that he's replying in prompts
MINIMAN10001@reddit
When LLMs came out, it was clear that the way I talked to people when trying to get help was the same way I'd talk to an LLM.
Horrible for getting help, because it lacks context. I ended up with too much back and forth because I wouldn't just tell them everything that needed to be said.
MindOrbits@reddit
Jst w8 4 txting proms
Rectangularbox23@reddit
I'd say GptSoVits-4, though not entirely sure if it's real time tbh
atypicalbit@reddit
Smallest.ai tts models
n1c39uy@reddit
I've used mozilla tts with success for this
mythicinfinity@reddit
If you were looking at closed source alternatives, what kind of target price would you be looking for?
alew3@reddit
Any recommendations on open source Speech-to-Speech models?
Ok_Nail7177@reddit
https://huggingface.co/nari-labs/Dia-1.6B is also good.
woadwarrior@reddit
Only if you're fine with occasional hallucinations; Kokoro, by contrast, is deterministic.
paranoidray@reddit
https://github.com/KoljaB/RealtimeVoiceChat
brahh85@reddit
kokoro with this https://github.com/remsky/Kokoro-FastAPI
Fair-Spring9113@reddit
https://huggingface.co/nari-labs/Dia-1.6B
or https://huggingface.co/hexgrad/Kokoro-82M
markeus101@reddit
Check out Orpheus, mainly the Q4 and Q2 quants.