Current state of local research tools as of May 2026
Posted by Shoddy-Tutor9563@reddit | LocalLLaMA
I figured some folks in this community would be interested to see what the current options are in the local deep research field, so I spent some time collecting everything I could find. Enjoy.
TLDR: the healthiest and most local-friendly projects are "GPT Researcher" by assafelovic and "Local Deep Research" by LearningCircuit.
"Local Deep Research" by LearningCircuit
Observations:
- python
- alive - last commit made yesterday
- medium number of contributors - 46
- 75 open issues (half from the main contributor, half from users, with no comments for months) / 254 closed (many self-reported)
- 161 open PRs (many from the main contributor, hanging for weeks - what's the point??) / 3309 closed PRs (by eye, 95% from the main contributor or Dependabot)
- uses SearXNG
Reddit - https://www.reddit.com/r/LocalLLaMA/s/F4o4jCL4IA
Subreddit - https://www.reddit.com/r/LocalDeepResearch/
Github - https://github.com/LearningCircuit/local-deep-research
Benchmark - https://huggingface.co/datasets/local-deep-research/ldr-benchmarks
"STORM" by Stanford
Observations:
- python
- abandoned - last commit 8 months ago
- small number of contributors - 23
- 58 open issues (many bug reports with no replies) / 164 closed (mostly without resolution, as "not planned")
- 60 open PRs (mostly with no replies) / 111 closed (for the last 2 years, just cancelled)
- uses various retrieval services - YouRM, BingSearch, VectorRM, SerperRM, BraveRM, SearXNG, DuckDuckGoSearchRM, TavilySearchRM, GoogleSearch, and AzureAISearch
Github - https://github.com/stanford-oval/storm
Website - https://storm-project.stanford.edu/
"GPT Researcher" by assafelovic
Observations:
- python + typescript
- semi-alive - last commit 3 weeks ago
- poorly maintained - lots of stale branches
- large number of contributors - 211
- 173 open issues (almost no reaction to 2026 issues) / 511 closed (mostly with fixes)
- 44 open PRs (some are 6 months old without reviews or comments) / 785 closed (60-70% merged)
- obsessed with MCP - internet search & web scraping are done via a separate MCP server https://github.com/assafelovic/gptr-mcp which uses a 3rd-party API
Github - https://github.com/assafelovic/gpt-researcher
Documentation - https://docs.gptr.dev/
Website - https://gptr.dev/
"Local Deep Researcher" by LangChain
Observations:
- python
- semi-alive - last commit 2 weeks ago
- small number of contributors - 14
- 36 open issues (many with no reply) / 39 closed (with solutions)
- 6 open PRs (some have been hanging for more than a year) / 48 closed (mostly from Dependabot, no recent contributions from users)
- DuckDuckGo, SearXNG + commercial providers
Github - https://github.com/langchain-ai/local-deep-researcher
"Open Deep Research" by LangChain
What are these LangChain guys smoking? Two similarly named projects, one most probably a successor of the other, but not a word about it in the README.
Observations:
- python + Jupyter notebook (???)
- abandoned - last dev work by human ended in Aug 2025
- small number of contributors - 26
- 34 open issues (no replies since Nov 2025) / 95 closed
- 24 open PRs (no comments, no reviews) / 114 closed (community contributions are mostly discarded)
- no info on what it uses as internet search engine
GitHub - https://github.com/langchain-ai/open_deep_research
"Open Deep Research" by Together
Observations:
- python
- abandoned - last commit a year ago, 3 commits in total
- one contributor
- no open or closed issues
- no PRs
- relies on TAVILY for web search
Github - https://github.com/togethercomputer/open_deep_research
Blogpost - https://www.together.ai/blog/open-deep-research
"DeerFlow" (Deep Exploration and Efficient Research Flow) by ByteDance
Supports any OpenAI-compatible provider
Observations:
- python
- alive - last commit 19 minutes ago
- large number of contributors - 253
- 444 open issues (mostly from Chinese folks, many have replies) / 735 closed (half with code changes)
- 257 open PRs, lots pending review and merge / 1230 closed (by eye, 70% merged)
- uses "Info Quest" for internet search (proprietary, paid)
Github - https://github.com/bytedance/deer-flow
Website - https://deerflow.tech/
"Deep Research" by Alibaba
Observations:
- python
- abandoned - last commits months ago
- small number of contributors - 27
- focused on using a single model - their own "Tongyi-DeepResearch-30B-A3B"
- vendor locked-in - glued its ass to Serper.dev for search and Jina.ai for scraping
Github - https://github.com/Alibaba-NLP/DeepResearch
"MiroThinker" by MiroMindAI
Observations:
- semi-alive - last commit 3 weeks ago
- small number of contributors - 19
- focused on using their own models - "MiroThinker-1.7-mini" (30B) or "MiroThinker-1.7" (235B)
- vendor locked-in - bring your own SERPER_API_KEY, JINA_API_KEY
- tried to run a test research task from their demo page - it fell on its face
Github - https://github.com/MiroMindAI/MiroThinker
Website - https://www.miromind.ai/
"Deep-searcher" by Zilliztech
Observations:
- abandoned - last commit 6 months ago
- small number of contributors - 31
- 40 open issues / 50 closed
- 6 open PRs / 167 closed (mostly merged)
Github - https://github.com/zilliztech/deep-searcher
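The alive / semi-alive / abandoned labels in the lists above were assigned by hand. A rough sketch of how the same labelling could be reproduced from the age of the last commit is shown here. The cut-offs (one week and 90 days) are assumptions picked to fit the examples in this post, not thresholds the author stated.

```python
from datetime import timedelta

# Assumed cut-offs; the post only gives examples, not exact thresholds.
ALIVE_WITHIN = timedelta(days=7)
SEMI_ALIVE_WITHIN = timedelta(days=90)


def classify_activity(since_last_commit: timedelta) -> str:
    """Map the time since the last commit to one of the three labels."""
    if since_last_commit < ALIVE_WITHIN:
        return "alive"
    if since_last_commit < SEMI_ALIVE_WITHIN:
        return "semi-alive"
    return "abandoned"


# Ages taken from the lists above (months approximated as 30 days).
examples = {
    "19 minutes": timedelta(minutes=19),
    "3 weeks": timedelta(weeks=3),
    "8 months": timedelta(days=8 * 30),
}
for label, age in examples.items():
    print(f"last commit {label} ago -> {classify_activity(age)}")
```

Commit age alone misses the other signals used above (unanswered issues, stale PRs), so treat this as a first-pass filter only.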
PS
No LLM-assisted research tools were used to gather the list above. Just me and my own hands. Only a few of the above projects had a demo website - MiroThinker, STORM and DeerFlow - but:
- MiroThinker produced a quite comprehensive report after an hour, but it hallucinated one half of the GitHub metrics and didn't give a fuck to collect the other half. Untrustworthy and unusable.
- STORM is basically unusable for deep research tasks, as you cannot provide an extended instruction on what to research and what kind of results you need, just a shitty short string for the title of your research paper
- The DeerFlow site is just broken: I cannot get past the authentication, plus various 404s. Shame on you, ByteDance web developers!
If you have time and your local deep research agent is sitting nearby, try giving it the prompt below. I'm sincerely curious what your results will be, especially how many hallucinations show up in the GitHub figures.
Find and compare the best local deep research projects. Compose a table with results. The table must contain:
- vendor / company name
- project name
- github URL
- product website or blog URL where it was announced
- when the last commit to github was made
- number of github issues and PRs
- number of contributors to github project
- if project docs are suggesting to use a bespoke LLM model
- if project is coming with its own web search and web page scraping tool
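For anyone who wants to count hallucinations in an agent's GitHub figures rather than eyeball them, here is a minimal sketch. It assumes the public GitHub REST endpoint `GET /repos/{owner}/{repo}` and its `pushed_at` and `open_issues_count` fields; the function names and the 5% tolerance are hypothetical choices for illustration. Note that `open_issues_count` includes open pull requests, and `pushed_at` is the last push to any branch, which is only a proxy for the last commit.

```python
import json
import urllib.request
from datetime import datetime

API = "https://api.github.com/repos/{slug}"


def parse_repo_summary(payload: dict) -> dict:
    """Extract last-push time and open issue count from a repo payload."""
    return {
        "last_push": datetime.fromisoformat(payload["pushed_at"].replace("Z", "+00:00")),
        # GitHub counts open pull requests in this number as well.
        "open_issues_and_prs": payload["open_issues_count"],
    }


def fetch_repo_summary(slug: str) -> dict:
    """Network call; slug is 'owner/repo'. Unauthenticated calls are rate limited."""
    request = urllib.request.Request(
        API.format(slug=slug), headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_repo_summary(json.load(response))


def count_mismatches(reported: dict, actual: dict, tolerance: float = 0.05) -> list:
    """Return the keys whose reported number is missing or off by more than
    the relative tolerance (hypothetical default: 5%)."""
    wrong = []
    for key, truth in actual.items():
        claim = reported.get(key)
        if claim is None or abs(claim - truth) > tolerance * max(abs(truth), 1):
            wrong.append(key)
    return wrong
```

Usage would be along the lines of `count_mismatches(agent_numbers, hand_checked_numbers)`, where both arguments are plain dicts of metric name to number; the returned list is the set of figures the agent got wrong or skipped.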
AI_Only@reddit
I've had nothing but issues with DeerFlow and DeerFlow V2. What's a good alternative?
Shoddy-Tutor9563@reddit (OP)
Come back in a couple of days. I'll be able to tell you how good or bad my two top candidates are.
dtdisapointingresult@reddit
!RemindMe 1 week
ManuGamer96_@reddit
!RemindMe 1week
dtdisapointingresult@reddit
Try DeerFlow 2 please, out of all of those it's the most likely to have a future.
Also, you could consider expanding your search to general Claw assistants, just add a Deep Research skill. (https://skills.sh/?q=deep-research) Hell, just install the skill in any agentic app and just invoke /deep-research.
Many of the Claws probably ship one built-in. Are DR-specific tools necessary when it's just tool-calling + prompts which can be encoded in a skill?
I did similar research for Claws yesterday: https://reddit.com/r/LocalLLaMA/comments/1t3lwji/comparison_of_the_development_status_of_various/
Shoddy-Tutor9563@reddit (OP)
Sorry, I'm very opinionated regarding some harnesses - specifically, I'm allergic to Claw for many reasons. I prefer tools I invoke, not ones that invoke themselves.
As for DeerFlow 2 - will test it for sure, thank you for the cue.
McSendo@reddit
So I think the deerflow.tech page is a mock. It's not an actual functional page.
I've been using DeerFlow for a month now. Make sure you use the stable 2.0 version. The main branch currently has some bugs with subagents (regarding multiple system messages).
I've been using it for 3 weeks and I like the setup and architecture so far that I might just build on it myself.
DeltaSqueezer@reddit
How would you rate GPT Researcher vs LDR? Does either support a big model for planning and synthesis but a smaller, faster model for retrieval and exploration?
Shoddy-Tutor9563@reddit (OP)
I cannot rate them yet. I was composing a list of projects to try. So these two are my top candidates :) will be able to address your question in a couple of days when I run them both side by side
joshp23@reddit
Interested in this as well.
Shoddy-Tutor9563@reddit (OP)
All right. I have installed https://github.com/LearningCircuit/local-deep-research and played with it for a while. I have mixed feelings.
Pros:
- it works - I managed to install it quite easily, hooked it up to my local SearXNG instance and Ollama Cloud, and even ran a few sample research tasks
- it has a lot of settings
Cons:
- the out-of-the-box settings are only fit for simple research tasks. To make the research much deeper, you need to fine-tune them. There are two quick-selection profiles for how deep you want to dig, but even in full-scale mode it still does a very shallow job
- it's clearly not mature enough and QA leaves a lot to be desired - I found a bug within my first 5 minutes of usage
- it's not stable yet - after a couple of research runs it ate all the available memory and the OS froze
mj3815@reddit
Does Perplexica qualify? https://github.com/kiranz/perplexica
Shoddy-Tutor9563@reddit (OP)
Not really - it looks like it's someone's private (bus factor=1) fork of https://github.com/ItzCrazyKns/Vane
ridablellama@reddit
Now ask yourself why so many of them are dead.
Getting web search results at scale for free is not a problem any of these projects will solve for you.
y4m4@reddit
There's a handful of good search APIs you can use for cheap, and some offer a decent amount of free queries (Serper gives 2500 queries, Tavily gives 1000 API credits per month, Brave gives some but has recently changed the terms). You can't scale it beyond personal use for free though.
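To put those free tiers in perspective, here is a back-of-the-envelope sketch of how many research runs they cover. The quota figures are the ones quoted above; the 50 queries per run is purely an assumption (real agents vary a lot), and one Tavily credit is treated as one query, which may not hold for every request type.

```python
# Free allowances as quoted in this comment.
FREE_QUOTA = {"serper": 2500, "tavily": 1000}

# Assumption: a single deep research run issues about 50 search queries.
QUERIES_PER_RUN = 50


def runs_within_quota(free_queries: int, queries_per_run: int) -> int:
    """Number of complete research runs that fit into a free allowance."""
    if queries_per_run <= 0:
        raise ValueError("queries_per_run must be positive")
    return free_queries // queries_per_run


for provider, quota in FREE_QUOTA.items():
    runs = runs_within_quota(quota, QUERIES_PER_RUN)
    print(f"{provider}: about {runs} runs on the free allowance")
```

Under these assumptions that is 50 runs on Serper's allowance and 20 runs per month on Tavily, which fits personal use but not much beyond it.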
Shoddy-Tutor9563@reddit (OP)
SearXNG?
y4m4@reddit
In my experience, it gets instantly throttled if you're trying to scrape any useful search engine. SearXNG is great for locally aggregating the APIs I listed though; it just takes some monkeying around to get it working.
Shoddy-Tutor9563@reddit (OP)
My personal workloads probably never hit that limit so SearXNG did amazingly well for them. If we're thinking about professional / enterprise use cases, then it's perfectly fine to pay for your search requests.
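For reference, a minimal sketch of querying a local SearXNG instance from Python. It assumes the instance listens on `http://localhost:8080` (hypothetical address) and that the JSON output format has been enabled in its `settings.yml`, since it is not always on by default. The helper names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(base_url: str, query: str) -> str:
    """Build a SearXNG search URL that asks for JSON output."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def extract_hits(payload: dict, limit: int = 5) -> list:
    """Keep only title, URL and snippet from the first few results."""
    hits = []
    for item in payload.get("results", [])[:limit]:
        hits.append(
            {
                "title": item.get("title", ""),
                "url": item.get("url", ""),
                "snippet": item.get("content", ""),
            }
        )
    return hits


def search(base_url: str, query: str, limit: int = 5) -> list:
    """Network call against the SearXNG instance at base_url."""
    with urllib.request.urlopen(build_search_url(base_url, query), timeout=30) as response:
        return extract_hits(json.load(response), limit)
```

A call like `search("http://localhost:8080", "local deep research")` would return a short list of dicts that can be passed to an LLM as context. If the upstream engines throttle the instance, the results list simply comes back short or empty.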
McSendo@reddit
Brave (LLM context API), Tavily, and Exa are all good for the prototyping phase, to get the harness working. Then you can spend time on SearXNG and crawl4ai for local deployment.
Shoddy-Tutor9563@reddit (OP)
Sorry I'm not getting your point.
postitnote@reddit
Maybe you should use each research agent to research the latest research agents?
Shoddy-Tutor9563@reddit (OP)
It's at the bottom of my post: I tried, but they fell on their faces.
ketosoy@reddit
Does Nvidia’s aiq meet your criteria?
https://github.com/NVIDIA-AI-Blueprints/aiq
Shoddy-Tutor9563@reddit (OP)
Formally it does. But I don't like what I see:
- leans towards bespoke models
- uses Tavily and Serper
- overengineered for my taste, as it suggests using 6 (!!!!) different LLMs
- 2 issues (small user base - is anyone using it at all?)
FeiX7@reddit
Which one do you find best?
Shoddy-Tutor9563@reddit (OP)
I haven't tried my top two candidates yet, so I can't say.
MustBeSomethingThere@reddit
Answer to OP's challenge. I used my own agent harness with Gemma 4 26B. I had to add clarifications for "best" (number of contributors) and for "recent" (last 6 months). Dates and numbers are pretty much all hallucinated.
Shoddy-Tutor9563@reddit (OP)
Thank you so much for sharing. My gut feeling tells me these agents are still somewhat useful to "scratch the surface" and prepare a starting point for subsequent proper human research or clarification.