Need guidance from AI-native builders
Posted by Disastrous-Bee7598@reddit | LocalLLaMA | 12 comments
Hey all,
I’m building a full automation pipeline for my procurement business and want to sanity-check my architecture before I go too deep.
Stack I’m considering:
• n8n (orchestration)
• Ollama + Gemma (local AI)
• OCR (Tesseract / Google Vision if needed)
• Python scripts where required
Workflows I want to automate:
1. Email Classification
• Gmail + Yahoo (2 companies)
• Auto-classify into PO / Quotation / Tender / Invoice / Misc
2. Govt Tender Scraping (daily, 7 AM)
• eProc + GeM + newspapers (uploaded online)
• Filter by category / deadline / relevance / budget
• Biggest blocker: captchas / anti-bot measures
3. L1 Price Comparer
• Compare live GeM listings vs an internal Excel of our prices
• Output: missing uploads, category gaps, stock mapping
4. Quotation Generator
• Input: scanned PDFs
• Output: structured DOCX (with letterhead)
• Auto-fill product + price from given price lists
5. Tender Parser
• Extract annexures/tables from tender PDFs
• Convert into structured, submission-ready docs
6. Geo CRM
• Offline-first
• Map-based client tracking: leads, visit history, institutions/departments, client details, pending supplies
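Of the workflows above, the L1 price comparer is the most clearly deterministic: it is a set diff plus a price comparison, no AI involved. A minimal sketch, assuming both sources have already been normalized upstream (e.g. Excel via openpyxl, GeM via your scraper) into product-keyed dicts; all names here are hypothetical:

```python
# Deterministic L1 price-comparison sketch: no AI needed.
# Assumes GeM listings and the internal price sheet are normalized
# to {product_name: price} dicts before this step.

def compare_prices(gem_listings, internal_prices):
    """Return products missing from GeM uploads, and products where we are not L1."""
    missing_uploads = sorted(set(internal_prices) - set(gem_listings))
    not_l1 = {
        name: {"gem_lowest": gem_price, "ours": internal_prices[name]}
        for name, gem_price in gem_listings.items()
        if name in internal_prices and internal_prices[name] > gem_price
    }
    return missing_uploads, not_l1

gem = {"stethoscope": 450.0, "bp monitor": 1200.0}
ours = {"stethoscope": 480.0, "bp monitor": 1100.0, "nebulizer": 900.0}
missing, losing = compare_prices(gem, ours)
print(missing)  # products not yet uploaded to GeM
print(losing)   # products where a competitor undercuts us
```

Category gaps and stock mapping follow the same pattern: plain dict/set operations over normalized data, which keeps this workflow fully out of the LLM's hands.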
⸻
Questions:
• Is n8n + a local LLM (Gemma) a good backbone, or will this become messy fast?
• Where should I avoid AI and stick to deterministic pipelines?
• Best reliable OCR + table extraction combo you’ve used?
• How are people handling captcha-heavy sites in production?
• Would you modularize this or centralize everything? Also, can you suggest a tool that lets me run and track all these tasks in a single dashboard (or two dashboards I'd check daily)?
⸻
Looking for people who've actually built similar pipelines, not theoretical suggestions: I picked these tools entirely on the advice of various AIs, as I'm a non-tech person. Any and all suggestions welcome :)
Dapper-Surprise-867@reddit
your setup is gonna get messy fast with n8n and a local llm handling all that, especially the document parts. keep the scraping and ocr pipelines deterministic; ai just adds complexity where you don't need it.
for the ocr and pdf extraction, i had a similar headache with scanned price lists and tender docs. i started using reseek for that exact problem; it pulls text from images and pdfs automatically, and the search finds everything later. it cut out my need for separate tesseract scripts and google vision api calls.
handling captcha-heavy sites in production is its own beast, you'll probably need a dedicated service for that. modularize the system but keep the dashboards to two, one for daily scrapes and one for the crm stuff. that keeps the chaos contained.
for a single place to manage these tasks, reseek works as that unified dashboard for me. it holds all the extracted content from emails, pdfs, and web pages with smart tags, so i can just search for what i need instead of juggling multiple apps. it made the procurement workflows way simpler to track.
LionStrange493@reddit
This is a pretty solid setup tbh.
One thing I'd watch for in flows like this is how unpredictable the behavior gets once inputs aren't clean.
With emails, PDFs, and scraped content, small variations can change how the model behaves in ways that are hard to notice until later.
Have you tested anything like that yet?
Disastrous-Bee7598@reddit (OP)
Not yet, everything is still in the pipeline, but I'll update when I reach the critical stages of my mission.
LionStrange493@reddit
yeah makes sense, might be worth testing early though, some of the weird edge cases only show up once inputs get messy
CapMonster1@reddit
Good direction overall, but it will get messy without strict boundaries. n8n + local LLM works, just don’t let the LLM handle critical logic. Use it as a helper, keep core flows deterministic.
For captchas — don’t overengineer bypassing. Use Playwright + proxy rotation + external solvers like CapMonster.
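The external-solver pattern CapMonster describes usually boils down to two HTTP calls: submit the captcha, then poll for the token. A sketch of the URL construction using the classic 2captcha-style `in.php`/`res.php` endpoints (CapMonster exposes an equivalent createTask/getTaskResult JSON API); the API key, site key, and page URL are placeholders, and in practice these calls would live in an n8n HTTP Request node:

```python
# Sketch of the submit/poll captcha-solver flow (2captcha-style endpoints).
# API_KEY, SITE_KEY, and the page URL below are placeholders.
import urllib.parse

API = "http://2captcha.com"

def submit_url(api_key, site_key, page_url):
    """Build the URL that submits a reCAPTCHA solve request."""
    params = urllib.parse.urlencode({
        "key": api_key,
        "method": "userrecaptcha",
        "googlekey": site_key,
        "pageurl": page_url,
        "json": 1,
    })
    return f"{API}/in.php?{params}"

def poll_url(api_key, task_id):
    """Build the URL that polls for the solved token by task id."""
    params = urllib.parse.urlencode({
        "key": api_key, "action": "get", "id": task_id, "json": 1,
    })
    return f"{API}/res.php?{params}"
```

The flow is: GET the submit URL, read the returned task id, then poll the result URL every few seconds until the token comes back, and inject that token into the form via Playwright.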
Modularize everything. Separate workflows → then surface results in 1–2 dashboards (admin + BI). Don’t over-centralize in n8n.
Disastrous-Bee7598@reddit (OP)
Tysm for ur input
Helps me so much!
Voxandr@reddit
1 - STOP using Ollama. It sucks: slower, outdated, incompatible, and spyware.
2 - Use Dify, it's better than n8n.
3 - Your case is better built with Microsoft Agent Framework or another agent framework instead of GUI tools.
Disastrous-Bee7598@reddit (OP)
Okay, I’ll look more into ur suggestions tysm!
Additional_Tap4479@reddit
I went down a very similar rabbit hole for a sourcing biz and n8n + local models worked, but only after I drew a hard line on where AI is allowed. Anything with fixed labels (email classification, PO vs quote vs tender), price comparisons, and CRM updates I forced into deterministic rules or Python, then let the LLM handle fuzzy bits like summarizing tenders or cleaning messy text.
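The "fixed labels go to deterministic rules, only fuzzy text goes to the LLM" split described above can be sketched as a first-pass keyword classifier; the keywords below are illustrative guesses to tune on real mail, not a tested rule set:

```python
# First-pass deterministic email classifier: fixed labels via keyword
# rules, with only unmatched mail falling through to the LLM (or a human).
# The keyword patterns are illustrative and should be tuned on real mail.
import re

RULES = [
    ("PO", re.compile(r"\b(purchase order|p\.?o\.?\s*(no|number|#))", re.I)),
    ("Tender", re.compile(r"\b(tender|bid|RFP|eProc|GeM)\b", re.I)),
    ("Quotation", re.compile(r"\b(quotation|quote|RFQ)\b", re.I)),
    ("Invoice", re.compile(r"\b(invoice|payment due)\b", re.I)),
]

def classify(subject, body=""):
    """Return the first matching label, or 'Misc' for the LLM fallback."""
    text = f"{subject}\n{body}"
    for label, pattern in RULES:
        if pattern.search(text):
            return label
    return "Misc"  # route these to the LLM or manual review
```

Rule order matters (an email mentioning both "tender" and "quotation" gets the earlier label), which is exactly the kind of behavior that stays predictable in rules and gets murky inside a model.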
For PDFs, I ended up using Google Vision + pdfplumber for text/tables; Camelot was ok for clean tables, but Vision rescued a lot of ugly scans. For captchas, I stopped fighting them and switched to getting the data via official exports, email alerts, or mild manual steps queued in a To‑Do app.
I tried Make and Airbyte for orchestration before landing on n8n plus a simple Metabase dashboard; Pulse for Reddit only came into the mix later to catch procurement threads we were missing, kind of like another inbox to watch for leads alongside CRM and email.
Disastrous-Bee7598@reddit (OP)
Thank you so much for your valuable input, yay!
opentabs-dev@reddit
the captcha problem on govt tender sites is genuinely the hard part — playwright with a persisted logged-in browser profile (not fresh headless) gets you much further than standard automation. for sites that still block you, 2captcha or capmonster as an n8n HTTP request node is the standard production workaround. some portals are just unreliable to automate, worth having a manual fallback.
for the gmail/yahoo classification, if you want AI to read emails without wrestling with OAuth/API token setup: I built an open-source MCP server called OpenTabs that routes AI tool calls through your existing logged-in browser session instead. you stay logged into gmail in chrome, an AI agent can read and classify emails directly, no credentials to manage. the catch is it works with interactive agents (claude code, cursor) not n8n background flows, so depends on your workflow. https://github.com/opentabs-dev/opentabs
for PDF tables: pdfplumber handles structured tables way better than tesseract. use google vision only if you're dealing with scanned docs of genuinely low quality.
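For the digitally generated tender PDFs that route covers, the pdfplumber side is short: `extract_tables()` returns each table as rows of cell strings, with `None` for empty or merged cells. A sketch, with the import kept lazy so the rest of the pipeline loads without the library installed (scanned PDFs still need OCR before this step):

```python
# pdfplumber table-extraction sketch for digitally generated tender PDFs.
# extract_tables() yields rows of cell strings; None cells are common
# in merged/ragged layouts, so normalize them before the DOCX fill step.

def clean_rows(table):
    """Normalize a raw extract_tables() table: None -> '', strip whitespace."""
    return [[(cell or "").strip() for cell in row] for row in table]

def extract_tender_tables(pdf_path):
    import pdfplumber  # lazy import: only needed when this step actually runs
    with pdfplumber.open(pdf_path) as pdf:
        return [clean_rows(t) for page in pdf.pages for t in page.extract_tables()]
```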
on the architecture question: modularize early. separate n8n workflows for email, scraping, and doc processing. when something breaks, you'll be glad they're isolated.
Disastrous-Bee7598@reddit (OP)
Okay, thanks for your input. I'll pass your guidance over to AI to explain further how to complete the tasks.
Will await more suggestions and then choose the best one out. Thankyouuuu so much