Looking for a local AI tool that can extract any info from high-quality sources (papers + reputable publications) with real citations
Posted by Inflation_Artistic@reddit | LocalLLaMA | View on Reddit | 12 comments
I’m trying to set up a fully local AI workflow (English/Chinese) that can dig through both scientific papers and reputable publications: Bloomberg, The Economist, industry analyses, tech reports, etc.
The main goal:
I want it to automatically extract any specific information I request: not just statistics, but any data, such as:
- numbers
- experimental details
- comparisons
- anything else I ask for
And the most important requirement:
The tool must always give real citations (article, link, page, paragraph) so I can verify every piece of data. No hallucinated facts.
Ideally, the tool should:
- run 100% locally
- search deeply and for long periods
- support Chinese + English
- extract structured or unstructured data depending on the query
- keep exact source references for everything
- work on an RTX 3060 12GB
Basically, I’m looking for a local “AI-powered research engine” that can dig through a large collection of credible sources and give me trustworthy, citation-backed answers to complex queries.
Has anyone built something like this?
What tools, models, or workflows would you recommend for a 12GB GPU?
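A minimal sketch of the citation-tracking requirement, assuming source metadata is stored alongside every chunk at ingest time (the names and record shape here are hypothetical, not any particular tool's API; a real pipeline would retrieve with embeddings rather than keyword overlap):

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source: str      # article or paper title
    url: str
    page: int
    paragraph: int

def search(chunks, query):
    """Naive keyword-overlap retrieval; stands in for an embedding search."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(c.text.lower().split())), c) for c in chunks]
    scored = [(s, c) for s, c in scored if s > 0]
    scored.sort(key=lambda sc: sc[0], reverse=True)
    # Every hit carries its citation, so each extracted fact stays verifiable.
    return [
        {"text": c.text,
         "citation": f"{c.source} ({c.url}), p.{c.page} ¶{c.paragraph}"}
        for _, c in scored
    ]

corpus = [
    Chunk("GDP grew 5.2 percent in Q3", "Example Report", "https://example.com/r", 4, 2),
    Chunk("The model was trained for 10 epochs", "Example Paper", "https://example.com/p", 7, 1),
]
hits = search(corpus, "GDP growth percent")
```

The key design point is that the citation is attached at ingest, not generated by the model afterwards, which is what keeps it from being hallucinated.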
ekaj@reddit
Yes and no. I have built something like what you want, but it’s not easily usable by non-technical people yet. It also sounds like you want a deep-research solution on top of that. The biggest limiters are your VRAM and relying only on local models for answer generation.
Also, you will have to build a custom ETL for whatever data you’re ingesting, since the solution you describe needs structured/unstructured ingest across a variety of media formats (no matter which tool you go with). You could rip out the media-ingestion module and the RAG pipeline from my project and use those as starter pieces to save some time building.
https://github.com/rmusser01/tldw_server
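The custom-ETL point above amounts to a dispatch layer that normalizes every format into one record shape. A sketch under that assumption (the parser functions are illustrative stubs, not tldw_server's actual modules):

```python
from pathlib import Path

def parse_pdf(path):
    # Stub: a real ETL would call a PDF library here and return one record per page.
    return [{"text": f"page text from {path}", "page": 1}]

def parse_html(path):
    # Stub for scraped articles; page numbers don't apply to web sources.
    return [{"text": f"article text from {path}", "page": None}]

PARSERS = {".pdf": parse_pdf, ".html": parse_html}

def ingest(path):
    """Route a file to its parser and attach source metadata to every record."""
    parser = PARSERS.get(Path(path).suffix.lower())
    if parser is None:
        raise ValueError(f"no parser for {path}")
    return [{"source": str(path), **rec} for rec in parser(path)]

records = ingest("report.pdf")
```

Adding a new media format is then just adding one parser function to the table, which is why a pluggable ingestion module is worth ripping out and reusing.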
mahmood454@reddit
Can I have a local AI tool just for extracting text out of images?
Just the text, no deep thinking or anything.
I have an i7-8550U with 8GB RAM and no graphics card, so is it possible?
ekaj@reddit
Yeah, you want OCR: https://blog.ngxson.com/using-ocr-models-with-llama-cpp
No-Consequence-1779@reddit
Yes. The deep research (downloading a buncha stuff) is the easiest part.
Melodic_Coffee_833@reddit
I have been building RAG for two years and would like to share:
The most intense part is ingesting massive document sets and building their indexes in three layers (dense, sparse, graph); running locally buys you nothing here.
Search itself is near-instant: rank by similarity for the top X, rerank the top Y, and even if you deep-dive over multiple iterations you're talking 30s max.
The real business case for local is when you have government-secret sources you don't want to vectorize, even in a multi-tenant environment.
The setup you're describing would cost 50x what Elastic Cloud would do for you.
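The rank-then-rerank flow described above, as a pure-Python toy (a real stack would use BM25 or embeddings for stage 1 and a cross-encoder for stage 2; both scoring functions here are deliberately simplified stand-ins):

```python
from collections import Counter

def sparse_score(query, doc):
    """Toy sparse score: query-term frequency in the doc (stand-in for BM25)."""
    tf = Counter(doc.lower().split())
    return sum(tf[t] for t in query.lower().split())

def rerank_score(query, doc):
    """Toy reranker: fraction of query terms covered (stand-in for a cross-encoder)."""
    q = set(query.lower().split())
    return len(q & set(doc.lower().split())) / len(q)

def search(query, docs, top_x=10, top_y=3):
    # Stage 1: cheap sparse ranking over the whole corpus, keep top X.
    stage1 = sorted(docs, key=lambda d: sparse_score(query, d), reverse=True)[:top_x]
    # Stage 2: expensive reranking over the survivors only, keep top Y.
    return sorted(stage1, key=lambda d: rerank_score(query, d), reverse=True)[:top_y]

docs = [
    "inflation rose sharply in october",
    "the cat sat on the mat",
    "october inflation data surprised markets",
]
top = search("october inflation", docs, top_x=2, top_y=1)
```

The cost structure is the point: the expensive scorer only ever sees X candidates, which is why the whole loop stays in the tens of seconds even over a large corpus.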
thatguyinline@reddit
Lightrag.
exaknight21@reddit
I posted a few days ago about this very use case. You want to use qwen3-2b-VL for this, strictly because accuracy was my original key concern too.
Coincidentally, I used a 3060 12GB too.
The github: https://github.com/ikantkode/qwen3-2b
beppled@reddit
You can try the Jan series of models if you're thinking of things like MCP tools and browser use; they'll fit on your GPU perfectly ...
But coming to the main part of your question ... honestly, from what I've experienced, you'd be better off using Claude's Research feature or even Perplexity (if you could snag a free year somewhere). I just wanna save you some frustration 🥹
Local models are great, but they are task- and domain-specific ... Jan may be great at using tools, but I've seen it hallucinate left and right. Gemma 3 12B is great, but bad at tools.
Inflation_Artistic@reddit (OP)
It's not that I don't want to spend money on a subscription; I just don't think it's what I need. In my case, I need to read through as much data as possible and get the most out of it. I'll probably have to process hundreds of files, and ordinary deep-research tools can only handle a few dozen at most.
Permtato@reddit
I'm not affiliated in any way, but I used this a fair bit last year and found it pretty good, with both local models and external ones.
kotaemon
There are probably similar repos that are more recently updated / have fewer open issues, but it should do what you're looking for if you hook it up with Ollama for local model support.
For Chinese + English on 12GB of VRAM, you could comfortably run one of the DeepSeek-R1 distills, like DeepSeek-R1-Distill-Qwen-1.5B.