What's the best ready-to-use local run RAG solution?
Posted by hyxon4@reddit | LocalLLaMA | 55 comments
I'm looking for recommendations on the best ready-to-use local RAG solutions out there. I’d like something I can run locally without needing to deal with cloud services or setting up my own RAG. Preferably something like NotebookLM, but without the podcast feature.
teoamico@reddit
I use ThinkableSpace, it’s a desktop app where you can just drag and drop files into a folder and it connects via MCP to Claude code. You can also hook it up to ChatGPT using a cloudflare tunnel, which is pretty convenient.
https://thinkablespace.app
Material_Shopping496@reddit
Nexa's Hyperlink product: https://hyperlink.nexa.ai/
FlatConversation7944@reddit
Checkout PipesHub: https://github.com/pipeshub-ai/pipeshub-ai
Demo Video: https://www.youtube.com/watch?v=xA9m3pwOgz8
Disclaimer: I am co-founder of PipesHub
kesor@reddit
This is an excellent entry point that supports multiple types of databases and gives you an API to do storage and retrieval https://github.com/openai/chatgpt-retrieval-plugin
RHM0910@reddit
That's not local if it requires an API call to ChatGPT.
Majestical-psyche@reddit
I’m shocked no one mentioned open-web-UI. It’s feature rich, clean, and is rapidly evolving with a big team behind it and lots of support.
baldamenu@reddit
What setup do you use for documents on Open WebUI? I constantly struggle with this, and no matter what embedding model I use my RAG doesn't seem to work. So far snowflake embed is the only one that doesn't fail for me, but it's still not great imo.
MarsCityVR@reddit
Just wish it didn't have login
CuteSpecific9116@reddit
You can disable login by setting WEBUI_AUTH=False
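For the typical Docker deployment, that env var goes on the `docker run` line. A rough sketch (the port, volume, and image tag below are the common defaults from the Open WebUI docs, so adjust to your setup):

```shell
# Disable the login screen entirely (single-user mode).
# Only do this on a trusted machine: anyone who can reach
# the port gets full access to the UI.
docker run -d -p 3000:8080 \
  -e WEBUI_AUTH=False \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```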
MarsCityVR@reddit
Oh wow, thanks! Know if you can default the RAG tag?
EmberGlitch@reddit
If by default tag you mean have it default to checking a specific knowledge base, yes that's possible.
You can create a "custom model". That new model will use a model of your choosing as a base model, and you can specify a different system prompt, and also make it use RAG with one or multiple specific knowledge bases.
CuteSpecific9116@reddit
Not sure. TBH I just started setting up openwebui and discovered this page:
https://docs.openwebui.com/getting-started/env-configuration
I hope you can find what you are looking for in there.
Majestical-psyche@reddit
Yea, they should have an option to disable it by now. But logging in isn't that big of a deal.
brotie@reddit
You can disable with an env var on startup afaik
privacyparachute@reddit
https://www.papeg.ai has that feature, and it's about as "ready to use" as it gets: all you need to do is visit the website, and drag a few files into the window.
It's a web app that's designed to run 100% locally, and that includes the LLMs. Test it yourself: use it to download an AI model, and then turn off your WiFi. You can even reload the page, and it's still there. Magic!
the_little_alex@reddit
wow, amazing solution, thanks!
privacyparachute@reddit
Thanks :-)
the_little_alex@reddit
Is anybody familiar with some ready-to-go graph RAG solutions? I heard they perform better.
first2wood@reddit
* Cinnamon/kotaemon: An open-source RAG-based tool for chatting with your documents. (github.com)
* snexus/llm-search: Querying local documents, powered by LLM (github.com)
* stanford-oval/WikiChat: WikiChat is an improved RAG. It stops the hallucination of large language models by retrieving data from a corpus. (github.com)
* infiniflow/infinity: The AI-native database built for LLM applications, providing incredibly fast hybrid search of dense vector, sparse vector, tensor (multi-vector), and full-text (github.com)
* v2rockets/Loyal-Elephie: Your Trusty Memory-enabled AI Companion - Simple RAG chatbot optimized for local LLMs | 12 Languages Supported | OpenAI API Compatible (github.com)
* neuml/txtai: 💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows (github.com)
* AI-Commandos/RAGMeUp: Generic RAG framework to apply the power of LLMs on any given dataset (github.com)
* langflow-ai/langflow: Langflow is a low-code app builder for RAG and multi-agent AI applications. It's Python-based and agnostic to any model, API, or database. (github.com)
An Obsidian plugin is enough for me, but I collected some on GitHub; maybe you can have a look. My filter was: no Docker needed, ready to use, hybrid search, or just generally feels good. If you use Docker, you have more choices.
the_little_alex@reddit
what do you think about Nvidia ChatRTX RAG?
first2wood@reddit
I think it looks good, but I've never actually tried it. I have tried NotebookLM, but I still prefer my current setup: a simple but okay RAG with Obsidian.md. I have a lot of notes, so uploading them all isn't easy, and I don't want to upload them anyway.
Melodic_Ice_3265@reddit
Any chance you can point me to a good overview of using Obsidian as a defacto local RAG? I’m a newbie in this arena.
first2wood@reddit
It's just a simple RAG for your vault. I manage my notes and files with Obsidian, and I only need semantic search over the related notes. The vector store is local, but the embedding goes through the Hugging Face API; the LLM can be an API or local. If you want to compare, it shows the original answer and the note-contexted response in the chat history file. If you're already using Obsidian, you should just try it yourself with an API key; it takes minutes.
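For anyone wondering what this kind of vault RAG does under the hood, here's a rough sketch of the core loop: embed every note, embed the query, rank by cosine similarity, and feed the top notes to the LLM as context. The `embed` function below is a toy stand-in (hashed bag-of-words) for a real embedding model or the Hugging Face inference API, just so the example runs anywhere:

```python
import math
import re
from pathlib import Path

def embed(text, dim=256):
    """Toy embedding: hashed bag-of-words, L2-normalized.
    Swap in a real model (sentence-transformers, HF API) in practice."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        vec[hash(token) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    """Dot product; vectors are already normalized."""
    return sum(x * y for x, y in zip(a, b))

def build_index(vault_dir):
    """Embed every markdown note in the vault."""
    index = []
    for path in Path(vault_dir).rglob("*.md"):
        text = path.read_text(encoding="utf-8")
        index.append((str(path), embed(text)))
    return index

def top_notes(index, query, k=3):
    """Return the paths of the k notes most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [path for path, _ in ranked[:k]]
```

The retrieved notes then get prepended to the prompt sent to the LLM, which is essentially what a plugin like Smart Connections automates for you.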
arcandor@reddit
Which obsidian plugins do you use?
first2wood@reddit
smart connections
docsoc1@reddit
For developers, I highly highly recommend R2R - https://github.com/SciPhi-AI/R2R
AnxietyNumerous26@reddit
I don't know about "best" but this solution has been mentioned before
https://github.com/Mintplex-Labs/anything-llm
Judtoff@reddit
This is what I personally use. The UI could use some work (especially around document selection), but it works right out of the box. It even has Ollama baked in (I just use the llama.cpp API).
rambat1994@reddit
Working on the document picker 🙃
Judtoff@reddit
Greatly appreciated! I hope I didn't come off as too unappreciative; AnythingLLM is a fantastic product, and it's understandable for there to be minor issues with bleeding-edge products.
rambat1994@reddit
Oh, I did not read it like that at all! The document picker is bad and we do need to work on it. It's a valid critique and I appreciate you voicing it! It lets me know what to focus on between milestones :)
Melodic_Ice_3265@reddit
AnythingLLM has been amazing. I'm a tech-savvy attorney (but no expert at all). I really want to use a local LLM for privacy purposes, but the document picker is a deal killer for me at this point, mostly because I am too short on time to play around with it and have thousands of Word docs and PDFs already organized in Windows directories. As soon as the doc picker is better, I will become a devotee and sing its praises from the rooftops. It's a great product.
rorowhat@reddit
You can't pick your provider, like CPU vs GPU etc. I wish they would add that option.
YouWillConcur@reddit
reorproject/reor: Private & local AI personal knowledge management app.
Packsod@reddit
SillyTavern. I can't call it the best, but it is definitely the most plug-and-play. It has a built-in extension called Data Bank, which is actually a locally hosted RAG.
https://docs.sillytavern.app/usage/core-concepts/data-bank/
kif88@reddit
It's pretty useful. Can even take URL and scrape a website automatically.
Majestical-psyche@reddit
Does SillyTavern have a blank page for stories yet, like Kobold? I don't use ST for that reason; pressing the edit button every time I want to edit something is a killer for me.
Packsod@reddit
I have no idea. What do you mean, auto-completion? I use kobold.cpp as the backend, but I'm not familiar with its interface.
Sure_Ad9815@reddit
You might want to check out EyeLevel.ai's tools; they offer flexible options for building AI applications with a focus on privacy and local deployment.
RequirementQuick6057@reddit
Privategpt
Fit-Key-8352@reddit
I just use Open webui capability.
Eugr@reddit
Open-WebUI, msty.app The latter can integrate with Obsidian vaults.
ekaj@reddit
I'll throw mine out there: https://github.com/rmusser01/tldw It's like an open-source NotebookLM without the podcast generation feature. WIP, but very much usable.
desexmachina@reddit
Video? How is it analyzing or chatting about video?
ekaj@reddit
It transcribes the audio track of the video; it's not performing visual analysis.
desexmachina@reddit
Makes sense. What about integrating a vision LLM to grab frames and analyze that?
ekaj@reddit
Yep, that's planned: https://github.com/rmusser01/tldw/issues/233 I plan to implement it as part of likely the next batch of features/fixes after my current one.
The issue is doing it in a way that people can use effectively even on low-VRAM systems, i.e. waiting for CPU offload for video models. I haven't looked much into them, so I don't know if that's already available.
desexmachina@reddit
Maybe just fork a GPU one. The issue is that a single video generates a ton of frames.
GradatimRecovery@reddit
There is no open stack that approaches NotebookLM in functionality. But just being able to chat with your PDFs might solve your problem.
Good-Coconut3907@reddit
For a no code, local solution to get you going:
* Langflow (no-code langchain) https://docs.langflow.org/components-rag
* Ragflow https://ragflow.io/docs/dev/
Can't go wrong with either.
JPD0c@reddit
I haven't tested it, but I would look into RAGBuilder. They claim it's an easy way to test different RAG settings in one go. They say you can use Ollama, so it would run locally, although I'm not sure if you can run it 100% local. I'm really interested to hear if someone has used it 100% locally with Ollama, as I will need to do the same.
pip25hu@reddit
I know of frameworks; LlamaIndex, for example, has a lot of building blocks that you can readily use. But a framework is still not an application, so you can't avoid some coding yourself.