What's the best ready-to-use local run RAG solution?
Posted by hyxon4@reddit | LocalLLaMA | 55 comments
I'm looking for recommendations on the best ready-to-use local RAG solutions out there. I’d like something I can run locally without needing to deal with cloud services or setting up my own RAG. Preferably something like NotebookLM, but without the podcast feature.
teoamico@reddit
I use ThinkableSpace, it’s a desktop app where you can just drag and drop files into a folder and it connects via MCP to Claude code. You can also hook it up to ChatGPT using a cloudflare tunnel, which is pretty convenient.
https://thinkablespace.app
Material_Shopping496@reddit
Nexa's Hyperlink product: https://hyperlink.nexa.ai/
FlatConversation7944@reddit
Checkout PipesHub: https://github.com/pipeshub-ai/pipeshub-ai
Demo Video: https://www.youtube.com/watch?v=xA9m3pwOgz8
Disclaimer: I am co-founder of PipesHub
kesor@reddit
This is an excellent entry point that supports multiple types of databases and gives you an API to do storage and retrieval https://github.com/openai/chatgpt-retrieval-plugin
RHM0910@reddit
That's not local if it requires an API call to ChatGPT.
Majestical-psyche@reddit
I’m shocked no one mentioned open-web-UI. It’s feature rich, clean, and is rapidly evolving with a big team behind it and lots of support.
baldamenu@reddit
What setup do you use for documents on Open WebUI? I constantly struggle with this, and no matter what embedding model I use my RAG doesn't seem to work. So far snowflake embed is the only one that doesn't fail for me, but it's still not great imo.
MarsCityVR@reddit
Just wish it didn't have login
CuteSpecific9116@reddit
You can disable login by setting WEBUI_AUTH=False
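For the typical Docker deployment, that env var goes on the `docker run` line. A rough sketch (the port, volume, and image tag below are the common defaults from the Open WebUI docs, so adjust to your setup):

```shell
# Disable the login screen entirely (single-user mode).
# Only do this on a trusted machine: anyone who can reach
# the port gets full access to the UI.
docker run -d -p 3000:8080 \
  -e WEBUI_AUTH=False \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```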
MarsCityVR@reddit
Oh wow, thanks! Know if you can default the RAG tag?
EmberGlitch@reddit
If by default tag you mean have it default to checking a specific knowledge base, yes that's possible.
You can create a "custom model". That new model will use a model of your choosing as a base model, and you can specify a different system prompt, and also make it use RAG with one or multiple specific knowledge bases.
CuteSpecific9116@reddit
Not sure. TBH I just started setting up openwebui and discovered this page:
https://docs.openwebui.com/getting-started/env-configuration
I hope you can find what you are looking for in there.
Majestical-psyche@reddit
Yea, they should have an option to disable it by now. But logging in isn't that big of a deal.
brotie@reddit
You can disable with an env var on startup afaik
privacyparachute@reddit
https://www.papeg.ai has that feature, and it's about as "ready to use" as it gets: all you need to do is visit the website, and drag a few files into the window.
It's a web app that's designed to run 100% locally, and that includes the LLMs. Test it yourself: use it to download an AI model, and then turn off your WiFi. You can even reload the page, and it's still there. Magic!
the_little_alex@reddit
wow, amazing solution, thanks!
privacyparachute@reddit
Thanks :-)
the_little_alex@reddit
Is anybody familiar with some ready-to-go graph RAG solutions? I heard they perform better.
first2wood@reddit
* Cinnamon/kotaemon: An open-source RAG-based tool for chatting with your documents. (github.com)
* snexus/llm-search: Querying local documents, powered by LLM (github.com)
* stanford-oval/WikiChat: WikiChat is an improved RAG. It stops the hallucination of large language models by retrieving data from a corpus. (github.com)
* infiniflow/infinity: The AI-native database built for LLM applications, providing incredibly fast hybrid search of dense vector, sparse vector, tensor (multi-vector), and full-text (github.com)
* v2rockets/Loyal-Elephie: Your Trusty Memory-enabled AI Companion - Simple RAG chatbot optimized for local LLMs | 12 Languages Supported | OpenAI API Compatible (github.com)
* neuml/txtai: 💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows (github.com)
* AI-Commandos/RAGMeUp: Generic RAG framework to apply the power of LLMs on any given dataset (github.com)
* langflow-ai/langflow: Langflow is a low-code app builder for RAG and multi-agent AI applications. It's Python-based and agnostic to any model, API, or database. (github.com)
An Obsidian plugin is enough for me, but I collected some on GitHub; maybe you can have a look. My filter was: no Docker needed, ready to use, hybrid search, or just generally feels good. If you use Docker, you have more choices.
the_little_alex@reddit
what do you think about Nvidia ChatRTX RAG?
first2wood@reddit
I think it looks good, but I've never actually tried it. I have tried NotebookLM, but I still prefer my current setup: a simple but okay RAG with Obsidian.md. I have a lot of notes, so uploading them all isn't easy, and I don't want to upload them anyway.
Melodic_Ice_3265@reddit
Any chance you can point me to a good overview of using Obsidian as a defacto local RAG? I’m a newbie in this arena.
first2wood@reddit
It's just a simple RAG for your vault. I manage my notes and files with Obsidian, and I only need semantic search over the related notes. The vector store is local, but the embedding goes through the Hugging Face API; the LLM can be an API or local. If you want to compare, it shows the original answer and the note-contexted response in the chat history file. If you're already using Obsidian, you should just try it yourself with an API key; it takes minutes.
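For anyone wondering what this kind of vault RAG does under the hood, here's a rough sketch of the core loop: embed every note, embed the query, rank by cosine similarity, and feed the top notes to the LLM as context. The `embed` function below is a toy stand-in (hashed bag-of-words) for a real embedding model or the Hugging Face inference API, just so the example runs anywhere:

```python
import math
import re
from pathlib import Path

def embed(text, dim=256):
    """Toy embedding: hashed bag-of-words, L2-normalized.
    Swap in a real model (sentence-transformers, HF API) in practice."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        vec[hash(token) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    """Dot product; vectors are already normalized."""
    return sum(x * y for x, y in zip(a, b))

def build_index(vault_dir):
    """Embed every markdown note in the vault."""
    index = []
    for path in Path(vault_dir).rglob("*.md"):
        text = path.read_text(encoding="utf-8")
        index.append((str(path), embed(text)))
    return index

def top_notes(index, query, k=3):
    """Return the paths of the k notes most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [path for path, _ in ranked[:k]]
```

The retrieved notes then get prepended to the prompt sent to the LLM, which is essentially what a plugin like Smart Connections automates for you.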
arcandor@reddit
Which obsidian plugins do you use?
first2wood@reddit
smart connections
docsoc1@reddit
For developers, I highly highly recommend R2R - https://github.com/SciPhi-AI/R2R
AnxietyNumerous26@reddit
I don't know about "best" but this solution has been mentioned before
https://github.com/Mintplex-Labs/anything-llm
Judtoff@reddit
This is what I personally use. The UI could use some work (especially around document selection), but it works right out of the box. It even has Ollama baked in (I just use the llama.cpp API).
rambat1994@reddit
Working on the document picker 🙃
Judtoff@reddit
Greatly appreciated! I hope I didn't come off as too unappreciative; AnythingLLM is a fantastic product, and it's understandable for there to be minor issues with bleeding-edge products.
rambat1994@reddit
Oh, I did not read it like that at all! The document picker is bad and we do need to work on it. It's a valid critique and I appreciate you voicing it! It lets me know what to focus on between milestones :)
Melodic_Ice_3265@reddit
AnythingLLM has been amazing. I'm a tech-savvy attorney (but no expert at all). I really want to use a local LLM for privacy purposes, but the document picker is a deal killer for me at this point, mostly because I am too short on time to play around with it and have thousands of Word docs and PDFs already organized in Windows directories. As soon as the doc picker is better, I will become a devotee and sing its praises from the rooftops. It's a great product.
rorowhat@reddit
You can't pick your provider, like CPU vs GPU etc. I wish they would add that option.
YouWillConcur@reddit
reorproject/reor: Private & local AI personal knowledge management app.
Packsod@reddit
SillyTavern. I can't call it the best, but it is definitely the most plug-and-play. It has a built-in extension called Data Bank, which is actually a locally hosted RAG.
https://docs.sillytavern.app/usage/core-concepts/data-bank/
kif88@reddit
It's pretty useful. Can even take URL and scrape a website automatically.
Majestical-psyche@reddit
Does SillyTavern have a blank page for stories yet, like Kobold? I don't use ST for that reason; pressing the edit button every time I want to edit something is a killer for me.
Packsod@reddit
I have no idea. What do you mean, auto-completion? I use kobold.cpp as the backend, but I'm not familiar with its interface.
Sure_Ad9815@reddit
You might want to check out EyeLevel.ai's tools; they offer flexible options for building AI applications with a focus on privacy and local deployment.
RequirementQuick6057@reddit
Privategpt
Fit-Key-8352@reddit
I just use Open webui capability.
Eugr@reddit
Open-WebUI, msty.app The latter can integrate with Obsidian vaults.
ekaj@reddit
I'll throw mine out there: https://github.com/rmusser01/tldw It's like an open-source NotebookLM without the podcast generation feature. WIP, but very much usable.
desexmachina@reddit
Video? How is it analyzing or chatting about video?
ekaj@reddit
It transcribes the audio track of the video; it's not performing visual analysis.
desexmachina@reddit
Makes sense. What about integrating a vision LLM to grab frames and analyze that?
ekaj@reddit
Yep, that's planned: https://github.com/rmusser01/tldw/issues/233 I plan to implement it as part of likely the next batch of features/fixes after my current one.
The issue is doing it in a way that people can use effectively even on low-VRAM systems, i.e. waiting for CPU offload for video models. I haven't looked much into them, so I don't know if that's already available.
desexmachina@reddit
Maybe just fork a GPU one. The issue is that a single video generates a ton of frames.
GradatimRecovery@reddit
There is no open stack that approaches NotebookLM in functionality. But just being able to chat with your PDFs might solve your problem.
Good-Coconut3907@reddit
For a no code, local solution to get you going:
* Langflow (no-code langchain) https://docs.langflow.org/components-rag
* Ragflow https://ragflow.io/docs/dev/
Can't go wrong with either.
JPD0c@reddit
I haven't tested it, but I would look into RAGBuilder. They claim it's an easy way to test different RAG settings in one go. They say you can use Ollama, so it would run locally, although I'm not sure if you can run it 100% local. I'm really interested to hear if someone has used it 100% locally with Ollama, as I will need to do the same.
pip25hu@reddit
I know of frameworks; LlamaIndex, for example, has a lot of building blocks that you can readily use. But a framework is still not an application, so you can't avoid some coding yourself.