Python Libraries to Extract Table from PDF

Posted by Rare_Confusion6373@reddit | Python | 3 comments

Here's a blog with a tutorial using multiple Python libraries to extract tables: https://unstract.com/blog/extract-tables-from-pdf-python/

Video tutorial: https://www.youtube.com/live/YfW5vVwgbyo?t=2799s

axxel12341@reddit

I used Camelot and Tabula.

Rapid1898@reddit

Why not PyMuPDF?

serjester4@reddit

$0.05 a page is highway robbery. Using Gemini Flash with batch processing, you can parse 20k pages per dollar with near-perfect accuracy, even on complex tables.
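
For scale, here is a rough back-of-the-envelope comparison of the two prices quoted above. Both figures are taken from this comment as stated and are not verified here; the variable and function names are made up for illustration, and real pricing will depend on page content and model settings.

```python
# Figures as quoted in the comment; assumptions, not measured prices.
PRICE_PER_PAGE_USD = 0.05      # the "$0.05 a page" figure
LLM_PAGES_PER_USD = 20_000     # the "20k pages per dollar" claim


def cost_usd(pages: int, price_per_page: float) -> float:
    """Total cost in USD for a number of pages at a flat per-page price."""
    return pages * price_per_page


# Convert both figures to the same units.
llm_price_per_page = 1 / LLM_PAGES_PER_USD          # USD per page
quoted_pages_per_usd = 1 / PRICE_PER_PAGE_USD       # pages per USD
price_ratio = PRICE_PER_PAGE_USD / llm_price_per_page

print(f"Quoted price: about {quoted_pages_per_usd:.0f} pages per dollar")
print(f"Claimed LLM price: ${llm_price_per_page:.5f} per page")
print(f"Ratio: about {price_ratio:.0f}x")

# Hypothetical job sizes, just to show how the gap grows with volume.
for pages in (1_000, 100_000):
    quoted = cost_usd(pages, PRICE_PER_PAGE_USD)
    llm = cost_usd(pages, llm_price_per_page)
    print(f"{pages:>7} pages: ${quoted:,.2f} vs ${llm:,.2f}")
```

If both numbers hold, that is roughly a thousandfold difference in per-page cost, which only matters much once the page count gets large.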