What is my best option for using a small model to "decode" scanned PDFs with graphs, tables, etc., so I can feed the output to a large non-multimodal model?

Posted by Windowsideplant@reddit | LocalLLaMA | 2 comments

The large model would run in the cloud and is not multimodal; the small one would run on a laptop.