EasySubber: Automatic subtitles for your videos
Posted by Straight_Tone_8059@reddit | Python | 5 comments
Hey,
I’d like to showcase EasySubber, a tool I developed to automatically generate subtitles from video files. If you’ve ever spent hours manually creating subtitles, this project could save you time.
What My Project Does:
EasySubber uses Whisper (OpenAI's speech recognition model) for transcription and FFmpeg for audio processing. It supports video files like .mkv, .mp4, and .avi, and automatically generates .srt subtitle files. The program includes a simple GUI (built with Tkinter) to ensure accessibility for users who may not be familiar with the command line.
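To give a rough idea of the pipeline described above, here is a minimal sketch (not EasySubber's actual code). It assumes the audio is first extracted to a 16 kHz mono WAV with FFmpeg, and that the transcription step yields segments as dicts with "start", "end" (in seconds) and "text" keys, which is the shape Whisper's transcribe result uses for its segments. The function names are made up for illustration.

```python
from typing import Iterable


def ffmpeg_extract_cmd(video_path: str, wav_path: str) -> list[str]:
    """Build an FFmpeg command that extracts the audio track as 16 kHz mono WAV."""
    return [
        "ffmpeg", "-y",      # overwrite the output file if it already exists
        "-i", video_path,    # input video (.mkv, .mp4, .avi, ...)
        "-vn",               # drop the video stream
        "-ac", "1",          # downmix to mono
        "-ar", "16000",      # resample to 16 kHz
        wav_path,
    ]


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[dict]) -> str:
    """Turn transcription segments into the text of an .srt file."""
    blocks = []
    for index, seg in enumerate(segments, start=1):  # SRT cues are numbered from 1
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)  # a blank line separates consecutive cues
```

The command list from ffmpeg_extract_cmd can be passed to subprocess.run, and the string returned by segments_to_srt can be written straight to a .srt file next to the video.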
Target Audience:
EasySubber is primarily aimed at video creators and content developers who need to generate subtitles quickly and easily. However, it’s also suitable for hobbyists or anyone working with video/audio who wants to automate the transcription process. It is not yet intended for production use, but it is stable and functional enough for anyone to try out.
Comparison with Existing Alternatives:
Compared to existing alternatives like Aegisub or commercial subtitle tools, EasySubber focuses on automating the subtitle generation process. It uses Whisper’s advanced speech recognition for accuracy and simplicity. While other tools require manual intervention or editing, EasySubber minimizes the need for human input, especially for straightforward transcription tasks.
Demo Video:
If you're interested in seeing how it works, here's a demo video: EasySubber demo
Source Code and GitHub:
Check out the source code here: Source code
Feel free to follow my work on GitHub: Ignabelitzky
Let me know if you have any feedback or suggestions on improving EasySubber!
sMASS_@reddit
I made a similar project, except it doesn't output .srt files. Instead, it burns the subtitles into the original video after the user has checked them (a more niche use case than your approach, but the two might be complementary). Have you considered using faster-whisper?
Straight_Tone_8059@reddit (OP)
I didn't consider faster-whisper, but I will check it out for sure. And that's a cool idea about the option of burning the subtitles into the video. But first, I need to improve the transcription a lot.
sMASS_@reddit
faster-whisper could actually help you with that indirectly: you could run larger models in less time, which would improve the transcription. Could you try running your example videos through my program to (kind of) benchmark that?
gpahul@reddit
Great project. I think the main challenges here are:
- Getting word-level timing, so that each word pops up when the person speaks it.
- As a next step, showing the complete line, with each word styled as soon as it's spoken!
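A rough sketch of that idea, assuming word-level timestamps are already available as dicts with "word", "start" and "end" keys (the shape Whisper produces when word timestamps are enabled). It emits one cue per word, showing the full line with the word being spoken wrapped in a bold tag. The function names are hypothetical, and whether tags like <b> are rendered depends on the player.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def highlight_cues(words: list[dict]) -> str:
    """Build SRT text with one cue per word, highlighting the current word."""
    tokens = [w["word"].strip() for w in words]
    blocks = []
    for i, w in enumerate(words):
        # Show the whole line, with only the word being spoken in bold.
        line = " ".join(
            f"<b>{tok}</b>" if j == i else tok for j, tok in enumerate(tokens)
        )
        blocks.append(
            f"{i + 1}\n"
            f"{srt_timestamp(w['start'])} --> {srt_timestamp(w['end'])}\n"
            f"{line}\n"
        )
    return "\n".join(blocks)
```

Each cue lasts only as long as its word, so the highlight moves along the line in step with the speech. Gaps between words would leave the screen empty in this version; extending each cue to the start of the next word is one way to avoid that.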
Straight_Tone_8059@reddit (OP)
Thank you for the feedback, I will try to improve on that.