(Partly) Open Video Overview – Generate narrated videos from text with AI (requires Gemini API)

Posted by arbayi@reddit | LocalLLaMA

I loved NotebookLM's Video Overview but ran into three issues: it puts its own logo on the videos, the voices are not as good as ElevenLabs, and I want music and sound effects (I'll add those later). I also wanted to create a YouTube channel called "Science Anime Hub" to automate educational content, so I built this as an alternative.

It takes text and generates MP4s with AI narration and images. It uses Nano Banana Pro for images, ElevenLabs for voice, and ffmpeg for assembly. It currently supports 25 visual styles (watercolor, anime, retro-style, etc.) and 16 languages.
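
For anyone curious what the ffmpeg assembly step can look like, here is a minimal sketch in Python. It is not the code from the repo: the function names and file names are made up for illustration, and it assumes each scene already has one image file and one narration audio file on disk. It only builds the ffmpeg command lines (pair a still image with an audio clip, then join the clips with the concat demuxer) so you can inspect them before running anything.

```python
import shlex


def scene_command(image: str, audio: str, output: str) -> list[str]:
    """Build an ffmpeg command that shows one still image for the length of one audio clip."""
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-i", image,   # repeat the still image as video frames
        "-i", audio,                 # narration track for this scene
        "-c:v", "libx264", "-tune", "stillimage",
        "-pix_fmt", "yuv420p",       # pixel format most players accept
        "-c:a", "aac",
        "-shortest",                 # stop encoding when the audio ends
        output,
    ]


def concat_list(scene_files: list[str]) -> str:
    """Return the contents of a list file for ffmpeg's concat demuxer."""
    return "".join(f"file '{name}'\n" for name in scene_files)


def concat_command(list_file: str, output: str) -> list[str]:
    """Build an ffmpeg command that joins the scene clips without re-encoding."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0", "-i", list_file,
        "-c", "copy",
        output,
    ]


if __name__ == "__main__":
    # Hypothetical scene assets; in a real pipeline these would come from
    # the image and voice generation steps.
    scenes = [("scene1.png", "scene1.mp3"), ("scene2.png", "scene2.mp3")]
    clips = []
    for index, (image, audio) in enumerate(scenes, start=1):
        clip = f"scene{index}.mp4"
        clips.append(clip)
        print(shlex.join(scene_command(image, audio, clip)))
    print(concat_list(clips), end="")
    print(shlex.join(concat_command("scenes.txt", "final.mp4")))
```

To actually render, you would write the list text to `scenes.txt` and pass each command to `subprocess.run(cmd, check=True)`. Joining with `-c copy` only works cleanly when every clip was encoded with the same settings, which is why each scene uses the same codec and pixel format.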

It's rough but works for my use case. Sharing in case others want something similar or want to help add more styles and improve it.

I’m hoping it will improve over time, and I think the next step must be making this fully open by using open alternatives for image and voice generation.

https://github.com/baturyilmaz/open-video-overview
https://www.youtube.com/watch?v=jy_Z54TKGTw