GTX 1650, 4 GB VRAM: I want a decent local TTS.
Posted by Remote-Ad-8129@reddit | LocalLLaMA | 5 comments
I'm broke at the moment, so please don't laugh at my specs. I'm making videos right now and I want a deep male voice. I tried ElevenLabs, but it's too costly. Then I tried Qwen TTS, but it was slow as heck. Does anybody know a lighter TTS model? I don't need emotions at present.
macboller@reddit
Kokoro should work on your card
https://github.com/remsky/Kokoro-FastAPI
goldenjm@reddit
Agreed. Kokoro, at 82M parameters, is a much smaller and faster model than Qwen, and its output quality is high. It doesn't have a deep male voice, but you could try pitch shifting to get one.
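For the pitch-shifting idea, here is a rough sketch using only the Python standard library. It is the crudest possible approach: it rewrites the WAV header so the same samples play back at a lower sample rate, which lowers the pitch but also slows the speech by the same factor. The function name `shift_pitch_by_rate` is made up for this example, and it assumes the TTS output is an uncompressed PCM WAV.

```python
import io
import wave


def shift_pitch_by_rate(wav_bytes: bytes, semitones: float) -> bytes:
    """Return a WAV that plays `semitones` lower (negative) or higher (positive).

    Only the sample rate in the header changes, so duration changes too:
    lowering the pitch makes the clip longer.
    """
    factor = 2 ** (semitones / 12)  # one semitone is a factor of 2^(1/12)

    # Read the original parameters and raw sample data.
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    # Write the same samples back with a scaled sample rate.
    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setnchannels(channels)
        dst.setsampwidth(sample_width)
        dst.setframerate(max(1, round(frame_rate * factor)))
        dst.writeframes(frames)
    return out.getvalue()


# Example (file names are placeholders):
# with open("narration.wav", "rb") as f:
#     deeper = shift_pitch_by_rate(f.read(), -2)
# with open("narration_deep.wav", "wb") as f:
#     f.write(deeper)
```

Start with a small shift and listen, since large shifts sound unnatural and noticeably slow. If the slowdown is a problem, a proper pitch shifter such as `librosa.effects.pitch_shift` changes pitch while keeping the duration, at the cost of an extra dependency.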
macboller@reddit
https://huggingface.co/hexgrad/Kokoro-82M/blob/main/VOICES.md
"am_michael" is pretty deep
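If you run the Kokoro-FastAPI server linked above, something like this should get you a file with that voice. Treat it as a sketch with assumptions: that the server is on `localhost:8880`, that it exposes the OpenAI-style `/v1/audio/speech` route, and that it accepts `"model": "kokoro"`. Check the repo README for your version. The helper names are made up for the example.

```python
import json
import urllib.request

# Assumption: Kokoro-FastAPI is running locally on port 8880.
BASE_URL = "http://localhost:8880"


def build_speech_request(text: str, voice: str = "am_michael") -> urllib.request.Request:
    """Build (but do not send) a POST request for the speech endpoint."""
    payload = {
        "model": "kokoro",          # assumed model name
        "input": text,
        "voice": voice,             # voice ID from VOICES.md
        "response_format": "wav",
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, out_path: str, voice: str = "am_michael") -> None:
    """Send the request and save the returned audio bytes to out_path."""
    with urllib.request.urlopen(build_speech_request(text, voice)) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)


# Example (needs the server running):
# synthesize("Testing a deeper narration voice.", "narration.wav")
```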
Remote-Ad-8129@reddit (OP)
Bro, is there any way to add voices to Kokoro? I did get one, but it's not suitable for me. By the way, thank you so, so much. I can't express my relief.
macboller@reddit
Yeah, you can pick from loads of them.
https://huggingface.co/hexgrad/Kokoro-82M/blob/main/VOICES.md
You can make your own too, and mix voices.
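For mixing, as far as I know Kokoro-FastAPI takes a combined voice string in the `voice` field, with names joined by `+` and optional weights in parentheses. That syntax is an assumption here, so check it against the README for your version. A tiny helper (hypothetical name `mix_voices`) to build the string; `am_onyx` is just an example second voice:

```python
def mix_voices(weights: dict[str, int]) -> str:
    """Build a weighted voice string such as 'am_michael(2)+am_onyx(1)'.

    Assumption: the server accepts 'name(weight)+name(weight)' as a voice.
    """
    if not weights:
        raise ValueError("need at least one voice")
    parts = []
    for name, weight in weights.items():
        if weight <= 0:
            raise ValueError(f"weight for {name} must be positive")
        parts.append(f"{name}({weight})")
    return "+".join(parts)


# Two parts am_michael to one part am_onyx.
voice = mix_voices({"am_michael": 2, "am_onyx": 1})
print(voice)  # am_michael(2)+am_onyx(1)
```

Pass the resulting string wherever you would normally pass a single voice name.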