Improved Text to Speech model: Parler TTS v1 by Hugging Face
Posted by vaibhavs10@reddit | LocalLLaMA | View on Reddit | 73 comments
Hi everyone, I'm VB, the GPU poor in residence (focus on open source audio and on-device ML) at Hugging Face! 🤗
Quite pleased to introduce you to Parler TTS v1 🔉 - 885M (Mini) & 2.2B (Large) - fully open-source Text-to-Speech models! 🤙
Some interesting things about it:
1. Trained on 45,000 hours of open speech (datasets released as well)
2. Up to 4x faster generation thanks to torch.compile & a static KV cache (compared to the previous v0.1 release)
3. Mini is trained with a larger text encoder; Large is trained with both a larger text encoder & a larger decoder
4. Also supports SDPA & Flash Attention 2 for an added speed boost
5. Built-in streaming: we provide a dedicated streaming class optimised for time to first audio
6. Better speaker consistency: choose from more than a dozen speakers, or write your own speaker description prompt and use that
7. Not convinced by a speaker? You can fine-tune the model on your own dataset (only a couple of hours would do)
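To make the speaker description prompt idea concrete, here is a minimal sketch of how generation might look, based on the usage pattern in the GitHub repo. Treat the details as assumptions rather than the definitive API: the checkpoint id `parler-tts/parler-tts-mini-v1`, the speaker name "Jon", and the helper names `build_description` and `synthesize` are illustrative and not taken from this post. It also assumes `parler_tts`, `transformers`, `torch` and `soundfile` are installed.

```python
def build_description(speaker=None, pace="moderate",
                      expressivity="slightly expressive",
                      quality="very clear audio"):
    """Compose a plain-English voice description (hypothetical helper)."""
    # Use a named speaker if given, otherwise describe a generic voice.
    who = f"{speaker}'s voice" if speaker else "A speaker's voice"
    return f"{who} is {expressivity}, delivered at a {pace} pace, with {quality}."


def synthesize(text, description,
               checkpoint="parler-tts/parler-tts-mini-v1",  # assumed checkpoint id
               out_path="parler_tts_out.wav"):
    """Generate speech for `text` in the voice described by `description`."""
    # Heavy imports stay local so build_description works without them.
    import torch
    import soundfile as sf
    from parler_tts import ParlerTTSForConditionalGeneration
    from transformers import AutoTokenizer

    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    model = ParlerTTSForConditionalGeneration.from_pretrained(checkpoint).to(device)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)

    # The description conditions the voice; the prompt is the text to speak.
    input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
    prompt_input_ids = tokenizer(text, return_tensors="pt").input_ids.to(device)

    generation = model.generate(input_ids=input_ids,
                                prompt_input_ids=prompt_input_ids)
    audio = generation.cpu().numpy().squeeze()
    sf.write(out_path, audio, model.config.sampling_rate)
    return out_path


if __name__ == "__main__":
    # "Jon" is an assumed speaker name; check the model card for the real list.
    voice = build_description(speaker="Jon", pace="slightly fast")
    synthesize("Hey, how are you doing today?", voice)
```

Swapping the named speaker for a free-form description (for example, leaving `speaker` unset) should be enough to try a different voice without touching the rest of the call.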
Apache 2.0 licensed codebase, weights and datasets! 🤗
Can't wait to see what y'all would build with this!🫡
Quick links:
Model checkpoints: https://huggingface.co/collections/parler-tts/parler-tts-fully-open-source-high-quality-tts-66164ad285ba03e8ffde214c
Space: https://huggingface.co/spaces/parler-tts/parler_tts
GitHub Repo: https://github.com/huggingface/parler-tts