PaddlePaddle/PaddleOCR-VL-1.6

Posted by SarcasticBaka@reddit | LocalLLaMA | View on Reddit | 10 comments

[-]

ehpehp@reddit

I'm impressed with GLM OCR's handwriting recognition. It's a bit better than Gemini 3 Pro. Agree with some others that an easy to deploy method of Paddle would be helpful.

[-]

Rude_Marzipan6107@reddit

Is there an easy way to deploy this? Last time I tried I went into dependency hell and would like to keep my cuda more up to date.

[-]

Beginning-Window-115@reddit

docker

[-]

thisissuchanoriginal@reddit

I wish it was that simple.

I went through that same dep nightmare. Per their official docs, there are like references to 5 different outdated images locked behind a closed baidu registry...

Its absolute hell for non chinese users. They also dont compare to us or eu ocr models or even recent qwen 3.x releases. This makes a fair open weight ocr model comparision practically impossible.

[-]

DevilaN82@reddit

I use llama-swap in Docker with no problem at all.
My config:
```
"PaddleOCR":

proxy: "http://127.0.0.1:9999"

ttl: 600

cmd: >

/app/llama-server

-m /root/.cache/PaddleOCR/PaddleOCR-VL-1.5.gguf

--mmproj /root/.cache/PaddleOCR/PaddleOCR-VL-1.5-mmproj.gguf

--temp 0

--port 9999
```

You need download gguf and mmproj.gguf files first and place them in properly bind mounted directory. I hope that it is the same with 1.6. Good luck!

[-]

ortsevlised@reddit

docker, vlm, use their product pipeline? I dont know i didnt have any problem before.

[-]

SarcasticBaka@reddit (OP)

New entry in the race to OCR perfection, seems to be a slight upgrade on v1.5 so not sure how it compares to newer models such as dots.mocr or chandra-2.

[-]