Getting the info you need from bank account statement PDFs: what's the best way?

Posted by dirtyring@reddit | LocalLLaMA

I'm building an app that retrieves certain information from bank account statements. This information might be something like "largest balance in the month of March" or "name of account holder".

I'm using LMMs (large multimodal models with vision) because I don't know in advance the format of the bank statements I'll receive. They might come from an American bank or a Vietnamese bank, for example.

In your experience, for building this type of application, what's the best solution? I have some thoughts:

1) OCR the statement (with a library like IBM's Docling), then feed the resulting markdown to the LLM and query it.
2) Use Llama 3.2 Vision (11B or 90B) to request the information directly, with no separate extract-then-query step.
3) Any other suggestions?
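
For option 1, one possible split is to let the OCR step (or an LLM normalizing its output) produce only a table, and then answer numeric questions like "largest balance in the month of March" in plain code rather than asking the model to do the comparison. Below is a minimal sketch of that second half. The sample markdown, the column names (`Date`, `Balance`), the ISO date format, and the helper names are all assumptions for illustration; real statements will differ by bank and would need to be mapped to a schema like this first.

```python
from datetime import date
from decimal import Decimal
from typing import Optional

# Hypothetical markdown, standing in for what an OCR/conversion step might emit.
# Column names and the ISO date format are assumptions, not a real bank's layout.
SAMPLE_MARKDOWN = """\
| Date       | Description | Amount    | Balance  |
|------------|-------------|-----------|----------|
| 2024-02-28 | Utilities   | -120.00   | 2,500.00 |
| 2024-03-03 | Salary      | 3,000.00  | 5,500.00 |
| 2024-03-15 | Rent        | -1,200.00 | 4,300.00 |
| 2024-03-29 | Groceries   | -150.25   | 4,149.75 |
| 2024-04-01 | Salary      | 3,000.00  | 7,149.75 |
"""


def parse_markdown_table(markdown: str) -> list[dict[str, str]]:
    """Turn the first pipe-delimited markdown table into a list of row dicts."""
    header = None
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip anything that is not a table row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
            continue
        if all(set(cell) <= set("-: ") for cell in cells):
            continue  # skip the |---|---| separator row
        if len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def largest_balance(rows: list[dict[str, str]], year: int, month: int) -> Optional[Decimal]:
    """Largest value in the Balance column among rows dated in the given month.

    Only balances printed on rows within that month are considered; a balance
    carried over from the previous month is not.
    """
    balances = []
    for row in rows:
        row_date = date.fromisoformat(row["Date"])
        if row_date.year == year and row_date.month == month:
            # Decimal avoids float rounding on money; drop thousands separators.
            balances.append(Decimal(row["Balance"].replace(",", "")))
    return max(balances) if balances else None


rows = parse_markdown_table(SAMPLE_MARKDOWN)
print(largest_balance(rows, 2024, 3))  # prints 5500.00
```

With this split, the model is only responsible for reading the page, and the comparison itself is deterministic and easy to test. Questions like "name of account holder" would still go to the LLM, since they are not in the transaction table.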