Generate Realistic Podcast Sessions Programmatically

Hey everyone! 👋

I just released podcast_tts, a Python library that generates realistic podcasts and dialogues with multi-speaker audio, background music, and professional-quality mixing—all running 100% locally.

What My Project Does

podcast_tts allows you to programmatically create high-quality audio sessions with multiple speakers, dynamic or premade voice profiles, and customizable background music. You can save the output as MP3 or WAV files and even assign playback to specific audio channels for spatial separation.
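To make the channel-assignment idea concrete, here is a minimal sketch that uses only the Python standard library. It is not the podcast_tts API: the helper names (`tone`, `place`, `write_stereo_wav`, `build_session`) are hypothetical, and sine tones stand in for synthesized speech. It only illustrates how mono speaker segments could be placed on the left or right channel of a stereo WAV file.

```python
import math
import struct
import wave

SAMPLE_RATE = 22050  # Hz; an arbitrary choice for this sketch


def tone(freq_hz, seconds, amplitude=0.3):
    """Return mono float samples in [-1, 1]; a stand-in for synthesized speech."""
    n = int(SAMPLE_RATE * seconds)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        for i in range(n)
    ]


def place(samples, channel):
    """Turn mono samples into (left, right) pairs for 'left', 'right', or 'both'."""
    if channel == "left":
        return [(s, 0.0) for s in samples]
    if channel == "right":
        return [(0.0, s) for s in samples]
    if channel == "both":
        return [(s, s) for s in samples]
    raise ValueError(f"unknown channel: {channel!r}")


def write_stereo_wav(path, frames):
    """Write (left, right) float pairs as 16-bit stereo PCM."""
    data = bytearray()
    for left, right in frames:
        # Clamp to [-1, 1] before scaling to the signed 16-bit range.
        l = int(max(-1.0, min(1.0, left)) * 32767)
        r = int(max(-1.0, min(1.0, right)) * 32767)
        data += struct.pack("<hh", l, r)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(data))


def build_session():
    """Two hypothetical speakers, one per channel, played one after the other."""
    host = place(tone(220, 0.5), "left")
    guest = place(tone(330, 0.5), "right")
    return host + guest


if __name__ == "__main__":
    write_stereo_wav("session.wav", build_session())
```

Running the script writes a one-second `session.wav` in which the first half plays only on the left channel and the second half only on the right. A real pipeline would replace `tone` with the output of a TTS model.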

It’s designed to be flexible, whether you’re building an API with FastAPI or experimenting with personal projects.

Target Audience

This library is perfect for:

Developers needing a local TTS solution for privacy or offline use.
Engineers building backend systems for audio generation (e.g., podcasts or virtual assistants).
Anyone looking for an all-in-one tool for dialogue generation with professional audio quality.

Comparison to Alternatives

Unlike many TTS libraries that rely on cloud services, podcast_tts is fully offline, ensuring privacy and reducing latency. It also integrates features like multi-speaker support, background music mixing, and text normalization, which are often missing or require multiple tools to achieve.
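For anyone unfamiliar with text normalization in a TTS context, the idea is to rewrite things like digits and abbreviations into words before synthesis. The sketch below is an assumption-laden illustration, not how podcast_tts implements it: the `ABBREVIATIONS` table, `spell_number`, and `normalize` are hypothetical names, and it only handles whole numbers from 0 to 99.

```python
import re

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]

# Hypothetical lookup table; a real normalizer would cover far more cases.
ABBREVIATIONS = {"e.g.": "for example", "Dr.": "Doctor", "%": " percent"}


def spell_number(n):
    """Spell out an integer from 0 to 99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def normalize(text):
    """Expand known abbreviations, spell out 1-2 digit numbers, tidy whitespace."""
    for short, full in ABBREVIATIONS.items():
        text = text.replace(short, full)
    # Only standalone one- or two-digit numbers are rewritten; longer numbers
    # are left untouched in this sketch.
    text = re.sub(r"\b\d{1,2}\b", lambda m: spell_number(int(m.group())), text)
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(normalize("Dr. Lee has 42  guests, e.g. 7 hosts."))
```

Decimals, dates, currencies, and larger numbers all need their own rules, which is part of why having normalization built into the library is convenient.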

The project is open source, and you can find it on GitHub here: GitHub Repo.
It’s also available on PyPI for easy installation: pip install podcast_tts.

I’ve shared more details in a blog post on LinkedIn and would love to hear your feedback! Let me know if you try it out or have ideas for improvement. 😊


ekbravo@reddit

It takes me less than a second to unsubscribe from AI generated speech.

Love the project as a developer.

Popular_Being7765@reddit (OP)

Me too! That's why I wanted a way to create realistic audio sessions. This is also akin to a well-designed document for someone who is visually impaired, so I believe these tools also help us as developers create more high-quality, accessible content.