We Just Fine-Tuned a Japanese Manga OCR Model with PaddleOCR-VL!

Posted by erinr1122@reddit | LocalLLaMA

Hi all! 👋
Hope you don’t mind a little self-promo, but we just finished fine-tuning PaddleOCR-VL to build a model specialized in Japanese manga text recognition — and it works surprisingly well! 🎉

Model: PaddleOCR-VL-For-Manga

Dataset: Manga109-s + 1.5 million synthetic samples

Accuracy: 70% full-sentence accuracy (vs. 27% for the original PaddleOCR-VL model)
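
For reference, here is a minimal sketch of one way a full-sentence score can be computed, assuming it means an exact string match per text region. This is an illustration only, not necessarily the evaluation script behind the numbers above; the function name and sample strings are made up.

```python
def full_sentence_accuracy(predictions, references):
    """Fraction of samples whose prediction equals the reference exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # Assumption: only surrounding whitespace is ignored; every other
    # character (including punctuation) has to match.
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


# Illustrative strings only, not taken from the actual evaluation set.
preds = ["おはよう！", "なんだって？", "行くぞ"]
refs = ["おはよう！", "なんだって!?", "行くぞ"]
score = full_sentence_accuracy(preds, refs)  # 2 of 3 sentences match exactly
```

A metric like this is strict: a single wrong character makes the whole sentence count as a miss, which is why it tends to come out lower than character-level accuracy.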

It handles manga speech bubbles and stylized fonts really nicely. There are still challenges with full-width vs. half-width characters, but overall it’s a big step forward for domain-specific OCR.
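
If width mismatches matter for your use case, one possible workaround is to fold both forms together before comparing strings. The sketch below uses Unicode NFKC normalization from Python's standard library; it is an assumption about what might help, not part of the model or its training pipeline, and the helper names are hypothetical.

```python
import unicodedata


def fold_width(text):
    """Fold full-width and half-width variants to one form via Unicode NFKC."""
    return unicodedata.normalize("NFKC", text)


def matches_ignoring_width(prediction, reference):
    """Compare two strings while treating width variants as equal."""
    return fold_width(prediction) == fold_width(reference)


# Full-width Latin letters, digits, and punctuation fold to ASCII.
latin = fold_width("ＡＢＣ１２３！")
# Half-width katakana (with separate voicing marks) folds to full-width kana.
kana = fold_width("ｶﾞﾝﾊﾞﾚ")
same = matches_ignoring_width("ＯＫ！", "OK!")
```

Note that NFKC also rewrites other compatibility characters, so it is probably better suited to lenient scoring than to cleaning up the final OCR output.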

How to use
You can use this model with Transformers, PaddleOCR, or any library that supports PaddleOCR-VL to recognize manga text.
For structured documents, try pairing it with PP-DocLayoutV2 for layout analysis — though manga layouts are a bit different.
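
One way manga differs is reading order: Japanese pages are usually read right to left, top to bottom. As a rough sketch, assuming your layout step gives you text boxes as `(x, y, width, height)` tuples (a hypothetical format, not PP-DocLayoutV2's actual output), you could order the regions like this before passing each crop to the OCR model:

```python
def manga_reading_order(boxes, row_tolerance=50):
    """Order (x, y, w, h) boxes top to bottom, then right to left within a row.

    row_tolerance is a hypothetical pixel threshold: boxes whose top edges lie
    within it of the first box in a row are treated as part of that row.
    """
    rows = []
    for box in sorted(boxes, key=lambda b: b[1]):  # sort by top edge
        if rows and abs(box[1] - rows[-1][0][1]) <= row_tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])

    ordered = []
    for row in rows:
        # The rightmost box comes first, judged by its right edge (x + w).
        ordered.extend(sorted(row, key=lambda b: b[0] + b[2], reverse=True))
    return ordered


# Made-up coordinates: two bubbles near the top, one lower down.
boxes = [(10, 20, 100, 80), (300, 10, 100, 80), (200, 400, 100, 80)]
ordered = manga_reading_order(boxes)
```

Real pages are organized by panels, so a simple row heuristic like this will get some pages wrong; treat it as a starting point.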

We’d love to hear your thoughts or see your own fine-tuned versions!
Really excited to see how we can push OCR models even further. 🚀