Current state of local research tools as of May 2026

Posted by Shoddy-Tutor9563@reddit | LocalLLaMA | 29 comments

I figured some folks in this community would be interested to see what the current options are in the local deep research field, so I spent some time collecting everything I could find. Enjoy.

TLDR: the healthiest and most local-friendly projects are "GPT Researcher" by assafelovic and "Local Deep Research" by LearningCircuit.

"Local Deep Research" by LearningCircuit

Observations:

python
alive - last commit made yesterday
medium number of contributors - 46
75 open issues (half from the main contributor, half from users but with no comments for months) / 254 closed (many self-reported)
161 open PRs (many from the main contributor, hanging for weeks - what's the point??) / 3309 closed PRs (at a glance, 95% from the main contributor or dependabot)
uses SearXNG

Reddit - https://www.reddit.com/r/LocalLLaMA/s/F4o4jCL4IA
Subreddit - https://www.reddit.com/r/LocalDeepResearch/
Github - https://github.com/LearningCircuit/local-deep-research
Benchmark - https://huggingface.co/datasets/local-deep-research/ldr-benchmarks

"STORM" by Stanford

Observations:

python
abandoned - last commit 8 months ago
small number of contributors - 23
58 open issues (many bug reports with no replies) / 164 closed (mostly without resolution, as "not planned")
60 open PRs (mostly with no replies) / 111 closed (over the last 2 years, simply closed unmerged)
uses various retrieval services - YouRM, BingSearch, VectorRM, SerperRM, BraveRM, SearXNG, DuckDuckGoSearchRM, TavilySearchRM, GoogleSearch, and AzureAISearch

Github - https://github.com/stanford-oval/storm
Website - https://storm-project.stanford.edu/

"GPT Researcher" by assafelovic

Observations:

python + typescript
semi-alive - last commit 3 weeks ago
poorly maintained - lots of stale branches
large number of contributors - 211
173 open issues (almost no reaction to 2026 issues) / 511 closed (mostly with fixes)
44 open PRs (some are 6 months old with no reviews or comments) / 785 closed (60-70% merged)
obsessed with MCP - internet search and web scraping are done via a separate MCP server, https://github.com/assafelovic/gptr-mcp, which uses a 3rd-party API

Github - https://github.com/assafelovic/gpt-researcher
Documentation - https://docs.gptr.dev/
Website - https://gptr.dev/

"Local Deep Researcher" by LangChain

Observations:

python
semi-alive - last commit 2 weeks ago
small number of contributors - 14
36 open issues (many with no reply) / 39 closed (with solutions)
6 open PRs (some hanging for more than a year) / 48 closed (mostly from dependabot, no recent contributions from users)
uses DuckDuckGo, SearXNG + commercial providers

Github - https://github.com/langchain-ai/local-deep-researcher

"Open Deep Research" by LangChain

What are these LangChain guys smoking? Two similarly named projects, one most probably a successor of the other, yet not a word about it in the README.

Observations:

python + Jupyter notebook (???)
abandoned - last development work by a human ended in Aug 2025
small number of contributors - 26
34 open issues (no replies since Nov 2025) / 95 closed
24 open PRs (no comments, no reviews) / 114 closed (community contributions are mostly discarded)
no info on which internet search engine it uses

GitHub - https://github.com/langchain-ai/open_deep_research

"Open Deep Research" by Together

Observations:

python
abandoned - last commit a year ago, 3 commits in total
one contributor
no open or closed issues
no PRs
relies on Tavily for web search

Github - https://github.com/togethercomputer/open_deep_research
Blogpost - https://www.together.ai/blog/open-deep-research

"DeerFlow" (Deep Exploration and Efficient Research Flow) by ByteDance

Supports any OpenAI-compatible provider

Observations:

python
alive - last commit 19 minutes ago
large number of contributors - 253
444 open issues (mostly from Chinese users, many with replies) / 735 closed (half with code changes)
257 open PRs (lots pending review and merge) / 1230 closed (at a glance, 70% merged)
uses "Info Quest" for internet search (proprietary, paid)

Github - https://github.com/bytedance/deer-flow
Website - https://deerflow.tech/

"Deep Research" by Alibaba

Observations:

python
abandoned - last commits were months ago
small number of contributors - 27
focused on using a single model - their own "Tongyi-DeepResearch-30B-A3B"
vendor lock-in - glued its ass to Serper.dev for search and Jina.ai for scraping

Github - https://github.com/Alibaba-NLP/DeepResearch

"MiroThinker" by MiroMindAI

Observations:

semi-alive - last commit 3 weeks ago
small number of contributors - 19
focused on using their own models - "MiroThinker-1.7-mini" (30B) or "MiroThinker-1.7" (235B)
vendor lock-in - bring your own SERPER_API_KEY, JINA_API_KEY
tried to run a test research from their demo page - it fell on its face

Github - https://github.com/MiroMindAI/MiroThinker
Website - https://www.miromind.ai/

"Deep-searcher" by Zilliztech

Observations:

abandoned - last commit 6 months ago
small number of contributors - 31
40 open issues / 50 closed
6 open PRs / 167 closed (mostly merged)

Github - https://github.com/zilliztech/deep-searcher

PS

No LLM-assisted research tools were used to gather the data above. Just me and my own hands. Only a few of the above projects had a demo website - MiroThinker, STORM and DeerFlow - but:

MiroThinker produced quite a comprehensive report after an hour, but it hallucinated half of the GitHub metrics and didn't give a fuck about collecting the other half. Untrustworthy and unusable.
STORM is basically unusable for deep research tasks, as you cannot provide extended instructions on what to research and what kind of results you need, just a shitty short string with the title of your research paper.
The DeerFlow site is just broken: I couldn't get past authentication, plus various 404s. Shame on you, ByteDance web developers!

If you have time and your local deep research agent is sitting nearby, try giving it the prompt below. I'm sincerely curious what your results will be, especially how many hallucinations show up in the GitHub figures.

Find and compare the best local deep research projects. Compose a table with results. The table must contain:
- vendor / company name
- project name
- github URL
- product website or blog URL where it was announced
- when the last commit to github was made
- number of github issues and PRs
- number of contributors to github project
- if project docs are suggesting to use a bespoke LLM model
- if project is coming with its own web search and web page scraping tool
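
To spot-check the GitHub figures an agent reports, you can compare them against the GitHub REST API. A minimal sketch, assuming the public `GET /repos/{owner}/{repo}` endpoint and its `pushed_at` and `open_issues_count` fields (the latter counts open PRs as well as issues). The helper names and the 10% tolerance are made up for illustration:

```python
import json
import urllib.request

API = "https://api.github.com/repos/{owner}/{repo}"


def fetch_repo(owner: str, repo: str) -> dict:
    """Fetch repository metadata (unauthenticated, so heavily rate-limited)."""
    req = urllib.request.Request(
        API.format(owner=owner, repo=repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def summarize(payload: dict) -> dict:
    """Keep only the fields the prompt asks about."""
    return {
        "last_push": payload["pushed_at"],
        # GitHub counts open PRs as issues in this field.
        "open_issues_and_prs": payload["open_issues_count"],
    }


def count_mismatches(reported: dict, actual: dict, tolerance: float = 0.10) -> int:
    """Count numeric figures that are missing or off by more than `tolerance`."""
    bad = 0
    for key, real in actual.items():
        if not isinstance(real, (int, float)):
            continue  # dates are not compared in this sketch
        claimed = reported.get(key)
        if claimed is None or abs(claimed - real) > tolerance * max(real, 1):
            bad += 1
    return bad
```

Usage would look like `count_mismatches(agent_row, summarize(fetch_repo("bytedance", "deer-flow")))`, where `agent_row` is whatever the agent put in its table. Contributor counts need the separate, paginated contributors endpoint and are left out here.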

AI_Only@reddit

I've had nothing but issues with DeerFlow and DeerFlow V2. What's a good alternative?

Shoddy-Tutor9563@reddit (OP)

Come back in a couple of days. I'll be able to tell you how good or bad my two top candidates are.

dtdisapointingresult@reddit

!RemindMe 1 week

RemindMeBot@reddit

I will be messaging you in 7 days on 2026-05-12 21:56:15 UTC to remind you of this link

ManuGamer96_@reddit

!RemindMe 1week

dtdisapointingresult@reddit

Try DeerFlow 2 please, out of all of those it's the most likely to have a future.

Also, you could consider expanding your search to general Claw assistants and just adding a Deep Research skill (https://skills.sh/?q=deep-research). Hell, just install the skill in any agentic app and invoke /deep-research.

Many of the Claws probably ship one built-in. Are DR-specific tools necessary when it's just tool-calling + prompts which can be encoded in a skill?

I did research similar to yours for Claws yesterday: https://reddit.com/r/LocalLLaMA/comments/1t3lwji/comparison_of_the_development_status_of_various/

Shoddy-Tutor9563@reddit (OP)

Sorry, I'm very opinionated regarding some harnesses - specifically, I'm allergic to Claw for many reasons. I prefer tools I invoke, not ones that invoke themselves.

As for DeerFlow 2 - will test it for sure, thank you for the cue.

McSendo@reddit

So I think the deerflow.tech page is a mock. It's not an actual functional page.

I've been using deerflow for a month now. Make sure you use the stable 2.0 version. They currently have some bugs with subagents (regarding multiple sys messages) on the main branch.

I've been using it for 3 weeks and I like the setup and architecture so far, enough that I might just build on it myself.

DeltaSqueezer@reddit

How would you rate GPT Researcher vs LDR? Does either support a big model for planning and synthesis but a smaller, faster model for retrieval and exploration?
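
To make the question concrete, the split I have in mind looks roughly like this. A hypothetical sketch only: the stage names and model names are placeholders, not settings from either project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRouting:
    """Hypothetical two-tier setup: one big model, one small fast model."""

    big: str = "big-model-placeholder"
    small: str = "small-model-placeholder"

    def pick(self, stage: str) -> str:
        # Reasoning-heavy stages go to the big model, everything else to the small one.
        return self.big if stage in {"planning", "synthesis"} else self.small
```

So `ModelRouting().pick("retrieval")` would hand the high-volume, low-stakes calls to the small model, and only planning and the final write-up would pay for the big one.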

Shoddy-Tutor9563@reddit (OP)

I can't rate them yet. I was only composing a list of projects to try, and these two are my top candidates :) I'll be able to answer your question in a couple of days, once I've run them side by side.

joshp23@reddit

Interested in this as well.

Shoddy-Tutor9563@reddit (OP)

All right. I have installed https://github.com/LearningCircuit/local-deep-research and played with it for a while. I have mixed feelings.

Pros:

- it works - I managed to install it quite easily, hooked it up to my local SearXNG instance and Ollama Cloud, and even ran a few sample research tasks
- it has a lot of settings

Cons:

- the out-of-the-box settings are only fit for simple research. To make the research much deeper, you need to fine-tune them. There are two quick-select profiles for how deep you want to dig, but even in full-scale mode it still does a very shallow job
- it's clearly not mature enough and QA leaves a lot to be desired - I found a bug within my first 5 minutes of usage
- it's not stable yet - after a couple of research runs it ate all the available memory and the OS froze

mj3815@reddit

Perplexica qualify? https://github.com/kiranz/perplexica

Shoddy-Tutor9563@reddit (OP)

Not really - it looks like it's someone's private (bus factor=1) fork of https://github.com/ItzCrazyKns/Vane

ridablellama@reddit

Now ask yourself: why are so many of them dead?

Getting web search results at scale for free is not a problem any of these projects will solve for you.

y4m4@reddit

There's a handful of good search APIs you can use for cheap, and some offer a decent number of free queries (Serper gives 2500 queries, Tavily gives 1000 API credits per month, Brave gives some but has recently changed its terms). You can't scale it beyond personal use for free, though.

Shoddy-Tutor9563@reddit (OP)

SearXNG?

y4m4@reddit

In my experience, it gets instantly throttled if you're trying to scrape any useful search engine. SearXNG is great for locally aggregating the APIs I listed, though it takes some monkeying around to get it working.
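
The usual first mitigation for throttling is retrying with exponential backoff, though it won't help against an outright block. A minimal sketch; `with_backoff`, `Throttled`, and the delay values are hypothetical and not part of SearXNG:

```python
import random
import time


class Throttled(Exception):
    """Raised by the caller's search function on an HTTP 429 or similar."""


def with_backoff(fn, retries: int = 4, base_delay: float = 1.0, sleep=time.sleep):
    """Call fn(); on Throttled, wait base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(retries):
        try:
            return fn()
        except Throttled:
            if attempt == retries - 1:
                raise  # out of retries, let the caller see the failure
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```

The `sleep` parameter is only there so the waiting can be swapped out when testing.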

Shoddy-Tutor9563@reddit (OP)

My personal workloads probably never hit that limit so SearXNG did amazingly well for them. If we're thinking about professional / enterprise use cases, then it's perfectly fine to pay for your search requests.
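
For reference, querying a self-hosted SearXNG instance from a script takes only a few lines. A minimal sketch, assuming an instance at `http://localhost:8080` (a made-up address) with the `json` output format enabled in its `settings.yml`; the helper names are illustrative:

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: your local SearXNG instance


def build_search_url(query: str, base_url: str = BASE_URL) -> str:
    """Build a SearXNG search URL that asks for JSON instead of HTML."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{base_url}/search?{params}"


def extract_hits(payload: dict, limit: int = 5) -> list[tuple[str, str]]:
    """Return (title, url) pairs from a SearXNG JSON response."""
    return [(r.get("title", ""), r["url"]) for r in payload.get("results", [])[:limit]]


def search(query: str) -> list[tuple[str, str]]:
    """Run one query against the local instance and return the top hits."""
    with urllib.request.urlopen(build_search_url(query), timeout=30) as resp:
        return extract_hits(json.load(resp))
```

If the instance answers with 403, the JSON format is most likely not enabled in its settings.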

McSendo@reddit

Brave (LLM context API), Tavily, and Exa are all good for the prototyping phase, to get the harness working. Then you can spend time on SearXNG and crawl4ai for local deployment.

Shoddy-Tutor9563@reddit (OP)

Sorry I'm not getting your point.

postitnote@reddit

Maybe you should use each research agent to research the latest research agents?

Shoddy-Tutor9563@reddit (OP)

It's at the bottom of my post: I tried, but they fell on their faces.

ketosoy@reddit

Does Nvidia’s aiq meet your criteria?

https://github.com/NVIDIA-AI-Blueprints/aiq

Shoddy-Tutor9563@reddit (OP)

Formally it does. But I don't like what I see:

- leans towards bespoke models
- uses Tavily and Serper
- overengineered for my taste, as it suggests using 6 (!!!!) different LLMs
- 2 issues (small user base - is anyone using it at all?)

FeiX7@reddit

Which one do you find best?

Shoddy-Tutor9563@reddit (OP)

I haven't tried my top two candidates yet, so I can't say.

MustBeSomethingThere@reddit

Answer to OP's challenge. I used my own agent harness with Gemma 4 26B. I had to add clarifications for "best" (number of contributors) and for "recent" (last 6 months). Dates and numbers are pretty much all hallucinated.

Shoddy-Tutor9563@reddit (OP)

Thank you so much for sharing. My gut feeling tells me these agents are still somewhat useful to "scratch the surface" and prepare a starting point for subsequent proper human research or clarification.