Ingesting pdf invoice details into an ERP
Posted by elarevlaka@reddit | sysadmin | 17 comments
Hi all
I'm looking for a service that can process an invoice and ingest it into our ERP. The invoices are all emailed PDF files; they come from many different sources and in many different formats. I don't need the details of the items on the invoice, just the date, the total invoice amount and who the invoice is for. It would be nice if the service had an API to use rather than providing, say, a CSV file for us to import. Ideally, invoices that are not 100% verified would be flagged for human verification. I estimate we get approximately 1500 PDF files per month, and these are all entered manually at the moment.
We have a bespoke ERP system.
Does anyone who already does this have opinions on, or experience with, a system that might help us out?
Only_Brother4382@reddit
Yeah, this is very solvable now… you’re basically describing a modern invoice OCR + API pipeline. Tools like InvoiceOCR.ai or the Mindee Invoice OCR API already do this, with JSON output and confidence scores for human review.
SoftConsistent8857@reddit
Those tools are solid, but the pricing gets rough at scale.
I ended up switching to Qoest API for invoice OCR since it handles the same JSON output and confidence scoring without the per document fees eating my budget alive.
Josh_Fabsoft@reddit
Full disclosure: I work at FabSoft, which makes AI File Pro, but I'll give you the straight scoop on your options here.
For PDF invoice processing with ERP integration, you've got a few routes:
OCR + API approach: You'll want something that can handle the OCR extraction reliably across different invoice formats, then push that data via API to your ERP. The key challenge is accuracy - invoices from different vendors have wildly different layouts.
What to look for:
- High OCR accuracy (95%+ for invoices)
- API endpoints for the data you need (date, total, vendor)
- Confidence scoring so you know when human review is needed
- Ability to learn from corrections
AI File Pro does handle invoice OCR and can extract those specific fields you mentioned (date, amount, vendor), but honestly for direct ERP integration you'd probably need to build some middleware to connect our API to your ERP's endpoints.
Other options worth checking:
- Your ERP vendor's own invoice processing add-on
- Dedicated AP automation tools like Bill.com or MineralTree
- Cloud OCR services like Azure Form Recognizer or AWS Textract
The "flagged for human review" piece is crucial - make sure whatever you pick gives you confidence scores and lets you set thresholds. Nothing worse than garbage data flowing into your ERP.
What ERP system are you running? That might narrow down the best integration path.
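The confidence-threshold idea above can be sketched roughly like this. It is only an illustration: the field names, the `Field` type and the threshold values are all assumptions, not any vendor's actual API, and the cut-offs would need tuning against real invoices.

```python
from dataclasses import dataclass

# Hypothetical per-field thresholds; tune these against your own invoices.
THRESHOLDS = {"invoice_date": 0.90, "total": 0.95, "bill_to": 0.85}


@dataclass
class Field:
    value: str
    confidence: float  # 0.0-1.0, as reported by whichever OCR service you use


def fields_needing_review(fields: dict[str, Field]) -> list[str]:
    """Return the names of fields that are missing or below their threshold."""
    flagged = []
    for name, minimum in THRESHOLDS.items():
        field = fields.get(name)
        if field is None or field.confidence < minimum:
            flagged.append(name)
    return flagged


def route(fields: dict[str, Field]) -> str:
    """Let an invoice through to the ERP only if every required field passes."""
    return "human_review" if fields_needing_review(fields) else "post_to_erp"
```

Setting a stricter threshold on the total than on the other fields is a design choice, on the reasoning that a wrong amount is the most expensive mistake to let into the ERP.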
Designer-Run5507@reddit
Qoest API handled the format variance better than Azure for me, and the confidence scores cut manual review down to about 8%. Worth running your worst vendor invoice through it before committing to anything.
Civil-Buffalo-8204@reddit
Ramp Bill Pay could be something that helps you here since it ingests emailed invoices then pulls the details via OCR and flags anything unverified for human review before processing.
pdp10@reddit
The proper, low-tech-debt thing to do, with a longer time horizon than this quarter, is to shift left and fix this part first.
brewthedrew19@reddit
Microsoft prebuilt invoice processing is the best off the shelf product imo.
I currently use Tabula for template building, plus custom scripts. You can also add Paperless-ngx to this and get even better results.
I would look at Microsoft’s solution in your case.
XxsrorrimxX@reddit
Check out Artsyl - invoice action.
igiveupmakinganame@reddit
second this
mario972@reddit
Get a Claude/Codex/whatever sub for a month, plan the JSON schema, vibe code the local API, and use OpenRouter for LLM access. Just pick a decent model that accepts files as an input modality, like google/gemini-flash-latest.
Large input, small output (put only the obligatory data in the output JSON schema), and each invoice will cost way less than 1 cent.
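A minimal sketch of what "only the obligatory data" could look like, assuming the three fields the OP asked for. The schema, the field names and `parse_reply` are hypothetical; the call to the model itself is left out because it depends on the provider you pick.

```python
import json
from datetime import date
from decimal import Decimal, InvalidOperation

# Hypothetical minimal output schema: date, total and recipient, nothing else.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_date": {"type": "string", "description": "ISO 8601, YYYY-MM-DD"},
        "total": {"type": "string", "description": "Decimal amount, no currency symbol"},
        "bill_to": {"type": "string"},
    },
    "required": ["invoice_date", "total", "bill_to"],
    "additionalProperties": False,
}


def parse_reply(raw: str) -> dict:
    """Validate the model's JSON reply; raise ValueError on anything unexpected."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    missing = [key for key in INVOICE_SCHEMA["required"] if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    try:
        total = Decimal(str(data["total"]))
    except InvalidOperation as exc:
        raise ValueError(f"bad total: {data['total']!r}") from exc
    if not total.is_finite():
        raise ValueError(f"bad total: {data['total']!r}")
    return {
        "invoice_date": date.fromisoformat(str(data["invoice_date"])),
        "total": total,
        "bill_to": str(data["bill_to"]).strip(),
    }
```

Any reply that raises `ValueError` here is a natural candidate for the human verification queue rather than the ERP.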
NotRecognized@reddit
Tungsten Automation Kofax Capture, now being replaced by Total Agility.
FKFnz@reddit
Depends where in the world you are. We're using MFiles and Exedee.
elarevlaka@reddit (OP)
We are in Melbourne, Australia
FKFnz@reddit
Exedee are a NZ company with an Australian presence.
graph_worlok@reddit
Maybe ask whoever built the ERP system?
Okeanos@reddit
I built a function app that does this. I vibe coded it with Codex. It extracts the required information with Foundry Content Understanding (CU), then assembles the payload for human verification if the confidence level assigned during extraction is not high enough, and then sends it via API. You could try this.
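Roughly, the assemble-and-send step described above might look like the following. This is a sketch under assumptions, not the actual function app: the ERP URL, the cut-off and the payload shape are made up, and the extraction step is represented only by its output, a mapping of field name to (value, confidence).

```python
import json
import urllib.request

ERP_URL = "https://erp.example.internal/api/invoices"  # hypothetical endpoint
MIN_CONFIDENCE = 0.9  # hypothetical cut-off


def build_payload(extracted: dict, source_file: str) -> dict:
    """Assemble the ERP payload; extracted maps field name -> (value, confidence)."""
    low = sorted(name for name, (_, conf) in extracted.items() if conf < MIN_CONFIDENCE)
    return {
        "source_file": source_file,
        "fields": {name: value for name, (value, _) in extracted.items()},
        "needs_review": bool(low),
        "review_fields": low,
    }


def build_request(payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) the JSON POST to the ERP."""
    return urllib.request.Request(
        ERP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result of `build_request` to `urllib.request.urlopen` would actually send it; keeping the two steps separate makes the payload easy to inspect or queue for a person first.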
ProfessorWorried626@reddit
Probably m-files