Windows program for RAG using local pdf files, magazines, and technical documents using gguf models
Posted by cmdrmcgarrett@reddit | LocalLLaMA | 12 comments
I am older and all this stuff is so confusing to me but I am trying to learn. Please bear with me.
Right now I am using Backyard AI and LM Studio as my chat programs.
I would like to use RAG to load all my computer magazines, manuals, and technical books into a model and make them searchable, so it can answer questions and offer ideas. Same with other possible models. I would be using Llama 3.2-3B uncensored or Dolphin 2.9.2 with Llama 3.1.
I know how to attach documents as context, but I want a permanent solution, so I can have one model for computer questions and help, another for medical, and so forth.
Are there any free LLM programs that can do this? If the ones that I am using already can do this can someone guide me?
Thank you in advance
Additional_Ad_7718@reddit
LM Studio is easy to use and it has RAG.
mtomas7@reddit
In my tests, LM Studio's Chat with Documents feature had the best results, but it doesn't have a way to add a large collection of documents.
Additional_Ad_7718@reddit
Yeah, that's the main reason I somewhat hesitated to recommend it here.
DryContact6504@reddit
I've been happy with AnythingLLM for RAG. It works with LM Studio both for inference and for embeddings if you have an embedding model loaded.
I wish there was more detailed control, but it is what it is and it's not bad.
Markdown from GitHub is pretty nice to load in.
Is that in the ballpark of what you were looking for? Presumably there will be many, many RAGs created over time.
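For anyone curious how that wiring works: AnythingLLM talks to LM Studio through LM Studio's OpenAI-compatible local server (by default at http://localhost:1234/v1). Here is a minimal sketch of the embeddings request shape that flow relies on; the model name is just a placeholder for whatever embedding model you have loaded:

```python
import json

def build_embeddings_request(texts, model):
    # Payload shape for an OpenAI-compatible /v1/embeddings endpoint,
    # which LM Studio's local server exposes. You would POST this JSON
    # to http://localhost:1234/v1/embeddings.
    return json.dumps({"model": model, "input": texts})

payload = build_embeddings_request(
    ["chapter one of some manual"], "nomic-embed-text"
)
print(payload)
```

AnythingLLM builds and sends these requests for you once you point it at the LM Studio endpoint; the sketch is only to show there is no magic underneath.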
ArmoredBattalion@reddit
Since it's RAG, could I use a smaller model like Llama 3.2 1B?
DryContact6504@reddit
Being able to reference things directly does seem to help. I still like 3B personally but 1B is worth a shot.
happy_dreamer10@reddit
You can simply use one model, and have as many separate vector stores as you need.
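The "one model, many vector stores" idea is easy to picture in code. Below is a toy sketch: a deliberately fake bag-of-letters "embedding" stands in for a real embedding model, just to show that each topic (computers, medical, and so on) gets its own store while the chat model stays the same:

```python
import math

def embed(text):
    # Toy embedding: letter-frequency vector. In practice you would
    # call a real embedding model (e.g. nomic-embed-text via an API).
    vec = [0.0] * 26
    for ch in text.lower():
        if 'a' <= ch <= 'z':
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    """One store per topic; the chat model is shared across all of them."""
    def __init__(self):
        self.items = []  # (text, vector) pairs

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=1):
        qv = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(qv, it[1]),
                        reverse=True)
        return [text for text, _ in ranked[:k]]

# Separate stores, same embedding function, same chat model.
stores = {"computers": VectorStore(), "medical": VectorStore()}
stores["computers"].add("To reseat RAM, power off and press the DIMM latches.")
stores["medical"].add("Ibuprofen is an NSAID used for pain and inflammation.")

# Route the question to the matching store, then feed the hits to the model.
print(stores["computers"].search("how do I reseat RAM?")[0])
```

Tools like AnythingLLM and Open WebUI do essentially this behind their "workspace" or "knowledge base" features, with real embeddings and persistent storage.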
privacyparachute@reddit
Try www.papeg.ai
Just drag your files into the browser window.
eggs-benedryl@reddit
I use MSTY and it's the best of the frontends I've tried; its RAG implementation is pretty damn simple.
You can add documents as a whole folder or individually, and it also allows loading YouTube links that have transcriptions. I like it so much I bought their lifetime premium subscription.
Eugr@reddit
You can try Open WebUI and load all your PDFs into a knowledge base. You can then refer to it in the prompt. You will need to configure it a bit to get it working, such as choosing an embedding model (I use nomic-embed-text) and an embeddings engine (I use Ollama as my API backend). You may also want to play with the context window and some other parameters to get better replies, as the default 2048-token context is usually not enough. 8K tokens works well, though.
I did have issues with some PDFs that failed to upload - not sure if there is a size limitation or some parser issues.
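A rough back-of-the-envelope for why the 2048-token default falls over: the retrieved chunks, the question, and the model's reply all have to fit in the context window. This sketch uses a crude ~4-characters-per-token heuristic (an assumption; real tokenizers vary):

```python
def rough_tokens(text):
    # Crude heuristic: roughly 4 characters per token for English prose.
    return max(1, len(text) // 4)

def fits_in_context(chunks, question, ctx_limit, reply_budget=512):
    # Everything shares one window: question + retrieved chunks + reply.
    used = rough_tokens(question) + sum(rough_tokens(c) for c in chunks)
    return used + reply_budget <= ctx_limit

chunks = ["x" * 2000] * 4   # four 2000-char chunks, ~500 tokens each
q = "Summarize the attached PDF."
print(fits_in_context(chunks, q, 2048))  # → False: default window is too small
print(fits_in_context(chunks, q, 8192))  # → True: 8K leaves room for a reply
```

This is why a summary request can come back with "which PDF?": at 2048 tokens the retrieved text may be truncated before the model ever sees it.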
Super_Spot3712@reddit
Open WebUI doesn't always work well for me. Sometimes when I attach a PDF and ask for a summary, the response is "which PDF?" Sometimes it works, sometimes not.
Eugr@reddit
You need to increase the context length. It does the same for me with default context, but works better if I use >8K. And I’m not sure if it’s aware of the file type - I just ask about “paper” or “article” or “document”.