AMA – I’ve built 7 commercial RAG projects. Got tired of copy-pasting boilerplate, so we open-sourced our internal stack.
Posted by Loud_Picture_1877@reddit | LocalLLaMA | 100 comments
Hey folks,
I’m a senior tech lead with 8+ years of experience, and for the last ~3 I’ve been knee-deep in building LLM-powered systems — RAG pipelines, agentic apps, text2SQL engines. We’ve shipped real products in manufacturing, sports analytics, NGOs, legal… you name it.
After doing this again and again, I got tired of the same story: building ingestion from scratch, duct-taping vector DBs, dealing with prompt spaghetti, and debugging hallucinations without proper logs.
So we built ragbits — a toolbox of reliable, type-safe, modular building blocks for GenAI apps. What started as an internal accelerator is now fully open-sourced (v1.0.0) and ready to use.
Why we built it:
- We wanted repeatability. RAG isn’t magic — but building it cleanly every time takes effort.
- We needed to move fast for PoCs, without sacrificing structure.
- We hated black boxes — ragbits integrates easily with your observability stack (OpenTelemetry, CLI debugging, prompt testing).
- And most importantly, we wanted to scale apps without turning the codebase into a dumpster fire.
I’m happy to answer questions about RAG, our approach, gotchas from real deployments, or the internals of ragbits. No fluff — just real lessons from shipping LLM systems in production.
We’re looking for feedback, contributors, and people who want to build better GenAI apps. If that sounds like you, take ragbits for a spin.
Let’s talk 👇
Extra-Whereas-9408@reddit
You wanna move fast for PoCs but hate black boxes?
whisgc@reddit
Revolutionary
waiting_for_zban@reddit
What are your takes on LLM performance for RAG? Which ones shine more than others? Do you see a significant drop in performance for quantized models? I saw you're using gpt4o - are open-source models catching up?
Loud_Picture_1877@reddit (OP)
I usually recommend OpenAI / Claude / Gemini to people, just because there's no DevOps overhead. I think all 3 major providers do a good job, but I've worked mostly with OpenAI.
We had one project that required a self-hosted LLM: we used Mistral NeMo (12B parameters) and deployed it with vLLM. The model was kinda dumb, but overall the project was a success. We just had to spend more time tweaking the prompts.
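The nice part of that setup is that vLLM exposes an OpenAI-compatible endpoint, so client code barely changes between hosted and self-hosted models. A minimal sketch (the URL and model name are assumptions - adjust to your deployment):

```python
# Sketch: talking to a self-hosted model served by vLLM through its
# OpenAI-compatible endpoint. Start the server with something like:
#   vllm serve mistralai/Mistral-Nemo-Instruct-2407
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="mistralai/Mistral-Nemo-Instruct-2407",
    messages=[{"role": "user", "content": "Summarize section 3 of the manual."}],
    temperature=0.0,  # smaller models tend to benefit from deterministic decoding
)
print(response.choices[0].message.content)
```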
Cybertrucker01@reddit
Thanks for sharing your work. What additional hardware demands are there beyond running the local LLM?
waiting_for_zban@reddit
Thanks for the insights. Funny you mention it, we're using it for classification, and it's doing an okay job. We benchmarked it against the top models, and it came 4th behind gpt4o, gemini2.5-flash, and qwen3-A22B.
ReactionMiserable118@reddit
How did you evaluate whether the retrieved documents in your RAG system were actually useful for generating correct or relevant answers?
Loud_Picture_1877@reddit (OP)
Hi! We have evaluation included in the ragbits-evaluate package. For a given dataset we calculate the following metrics:
- Context Precision, Recall, F1 (rank-unaware)
- Average Precision, Reciprocal Rank, NDCG (rank-aware)
There are some examples of how to do it in our repo: https://github.com/deepsense-ai/ragbits/tree/main/examples/evaluation/document-search
Also my colleague is working on evaluation quickstart - I'll make sure to post it here when it's published :))
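To make the rank-unaware vs rank-aware distinction concrete, here's a toy sketch in plain Python (not the ragbits API) computing these metrics for a single query:

```python
# Toy illustration of the metrics above. `retrieved` is the ranked list the
# search returned; `relevant` is the ground-truth set for this question.
import math

retrieved = ["c3", "c1", "c7", "c2"]
relevant = {"c1", "c2"}

hits = [doc in relevant for doc in retrieved]

# Rank-unaware: treat the retrieved chunks as an unordered set.
precision = sum(hits) / len(retrieved)   # 0.5
recall = sum(hits) / len(relevant)       # 1.0
f1 = 2 * precision * recall / (precision + recall)

# Rank-aware: reward putting relevant chunks near the top.
reciprocal_rank = next((1 / (i + 1) for i, h in enumerate(hits) if h), 0.0)
dcg = sum(h / math.log2(i + 2) for i, h in enumerate(hits))
idcg = sum(1 / math.log2(i + 2) for i in range(min(len(relevant), len(retrieved))))
ndcg = dcg / idcg

print(precision, recall, f1, reciprocal_rank, ndcg)
```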
kzkv0p@reddit
Do you manually define the expected results in order to calculate precision and recall?
Loud_Picture_1877@reddit (OP)
When it's possible I like to engage SMEs (subject matter experts) to define a validation dataset. That usually makes for the best quality evaluation.
If that's not possible (or we need more data), then generating a dataset with an LLM can be an option.
kzkv0p@reddit
Thank you
miketran134@reddit
Loud_Picture_1877 I think it would be great if you created a Discord community for ragbits…
Lonhanha@reddit
As a developer for 3 years but only 1 in AI, god I am very green on this subject. I'm literally building a RAG app at my job, and the things discussed here are very valuable and not things I had thought of yet. Thanks for sharing, and thanks to everyone in the comments sharing as well.
Loud_Picture_1877@reddit (OP)
Thanks! Good luck on your journey with AI :))
evilbarron2@reddit
I use Anythingllm as my front end - is it possible to integrate this as an alternative RAG solution?
Loud_Picture_1877@reddit (OP)
Thanks for the suggestion, we will look into that!
noclip1@reddit
Thanks for sharing! We're just starting our own journey internally on a complex multi-agent (with multi-tool) chatbot to answer questions specific to our industry. There's been a lot of information to parse through, libraries to examine, and approaches to take.
I suppose more than anything I'm curious to understand the pitfalls you hit along the way and why you decided to choose a different path when you did. The journey ahead seems so long, daunting, and outdated by the next week so it feels like fighting in a tornado to decide what is the best approach to choose and commit to it. Or more succinctly, if you could condense down 3 years of fighting in this tornado, what would you say are the biggest takeaways?
Loud_Picture_1877@reddit (OP)
Good one!
I think my biggest takeaways are:
* be prepared to pivot: throw away chunks of the system that got outdated, and abstract interfaces so you can easily swap the underlying implementation (seems familiar, huh? see the sketch after this list). With an ever-changing environment and new models around the corner, we've had situations where a new SoTA model appeared just before project handover, and to deliver the best quality we had to change things quickly.
* deliver small chunks of value early and build on them. I've seen a tendency for people to have really unrealistic expectations of what AI can do for them - managing those hopes is really important. It's better to deliver a very limited agent fast, get feedback, and then iterate on it.
* observability is really important - debugging non-deterministic systems can be a nightmare, so better have good tools for that
* do not throw too much at one prompt / agent / etc. Break things down like in normal software engineering - the single responsibility rule works here as well :)
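A minimal sketch of what "abstract the interface" can look like in practice (hypothetical names, not ragbits code) - swapping a provider or a new SoTA model then touches one class instead of the whole pipeline:

```python
# Sketch: the pipeline depends only on a protocol, never on a vendor SDK.
from typing import Protocol

class LLM(Protocol):
    def generate(self, prompt: str) -> str: ...

class OpenAILLM:
    def __init__(self, model: str = "gpt-4o") -> None:
        self.model = model

    def generate(self, prompt: str) -> str:
        from openai import OpenAI  # imported lazily to keep the sketch self-contained
        client = OpenAI()
        resp = client.chat.completions.create(
            model=self.model, messages=[{"role": "user", "content": prompt}]
        )
        return resp.choices[0].message.content or ""

def answer(question: str, llm: LLM) -> str:
    # Swapping in a self-hosted or newer model means writing one new class.
    return llm.generate(f"Answer briefly: {question}")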
-Ulkurz-@reddit
I'm currently building an agentic assistant for text-2-sql using OpenSearch (context for schema, relationships, examples, and domain mappings) and LangGraph. I, however, do see several issues with SQL quality and consistency in generation. Any suggestions? How can I systematically identify what the root issue is (most likely context but it can be huge and diverse) and accordingly decide the fix?
Loud_Picture_1877@reddit (OP)
Hi!
My take on text-2-sql solutions is that you should really think about what data should be available to the LLM and in what way.
One pattern that works well is storing question <-> SQL query pairs
in your vector database and retrieving examples for reference based on similarity to the current question. Another is an abstraction layer of parameterized views / functions, e.g. monthly_report($month), so the LLM fills in arguments instead of writing raw SQL from scratch.
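A minimal sketch of the question <-> SQL few-shot retrieval (token overlap stands in for embedding similarity so it runs without any services; in production you'd store the pairs in your vector DB):

```python
# Sketch: retrieve the most similar past (question, SQL) pairs and use them
# as few-shot examples in the generation prompt. Data is illustrative.
EXAMPLES = [
    ("total revenue per month", "SELECT month, SUM(amount) FROM sales GROUP BY month;"),
    ("top 5 customers by spend",
     "SELECT customer, SUM(amount) AS s FROM sales GROUP BY customer ORDER BY s DESC LIMIT 5;"),
]

def similarity(a: str, b: str) -> float:
    # Jaccard overlap as a stand-in for cosine similarity over embeddings.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def build_prompt(question: str, k: int = 2) -> str:
    best = sorted(EXAMPLES, key=lambda ex: similarity(question, ex[0]), reverse=True)[:k]
    shots = "\n".join(f"Q: {q}\nSQL: {sql}" for q, sql in best)
    return f"{shots}\nQ: {question}\nSQL:"

print(build_prompt("monthly revenue totals"))
```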
auldwiveslifts@reddit
Does ragbits have direct support for text2SQL tasks too or is it mainly RAG focused?
chitown160@reddit
text2SQL is RAG
auldwiveslifts@reddit
They are pretty different in setup and capability. RAG retrieves related vectors as context for answering a question or carrying out some task. Text2sql retrieves relevant tabular data from a database for precise calculations, etc. Both are about retrieval, but they use different frameworks under the hood and accomplish different tasks.
chitown160@reddit
RAG is not limited to vector embeddings, vector databases or similarity searches. The operation is in the definition of the term which does not define datasource.
auldwiveslifts@reddit
I see what you mean. I was more so talking about an agentic question answering workflow that feels less like RAG in my mind. Something more like this: https://python.langchain.com/docs/tutorials/sql_qa/
-Ulkurz-@reddit
Mind sharing a brief on how are you approaching text-2-sql? I'm working on a similar project - using agentic workflow with RAG
auldwiveslifts@reddit
Wish I could say a lot, can’t per company policy. But here’s a public tutorial that’s a great starting point. If you have a lot of tabular data this is a great way to go. LLM sees db tables, then can look into schema of relevant tables. Finally it generates a query which another agent reviews for correctness. Happy to answer specific questions you might have.
https://langchain-ai.github.io/langgraph/tutorials/sql-agent/
Loud_Picture_1877@reddit (OP)
Hi!
Ragbits is designed to be modular; on PyPI it now ships as 8 independent packages. RAG-related features are just one module: ragbits-document-search.
In ragbits-core you can find common things like connections to LLMs, a common interface to various vector stores, and observability. Right now we're working on a ragbits-agents package to better support agentic / tool-use cases.
We don't have any text-2-sql-specific code yet, but ragbits-core (and soon ragbits-agents) components may be useful while building it. In the future we may think about integrating one of the available tools out there.
musicmakingal@reddit
You mention tool-use and work-in-progress on agentic use cases. LangGraph supports both (built-in ReAct as well). Is there any reason I would use ragbits over LangGraph?
Loud_Picture_1877@reddit (OP)
u/musicmakingal hi! In ragbits we have RAG-specific features, monitoring, a user interface, and more that may be interesting for you - if so, I'd recommend using it.
There's no need to choose one over the other - ragbits components can be orchestrated by LangGraph. In the future I see ragbits being easily integrated with other frameworks (I'm especially looking at pydantic-ai)
Initial-Swan6385@reddit
so we can get money from YC xD
night0x63@reddit
Open WebUI has built-in RAG. Slick GUI with easy directory ingest and # to reference collections.
Unfortunately for coding I found it lacking.
How does your solution compare to Open WebUI's RAG? (How would you rate their solution?)
Specifically... I found it didn't get the right document sometimes... document separators did not work... filenames were missing. I ended up just doing a short script of filenames and contents and document separators... worked better.
Loud_Picture_1877@reddit (OP)
Hi! I agree, Open WebUI looks stunning! It's also been referenced more than once in the comments on this post.
We're looking into it - maybe it would be possible to have ragbits document retrieval connected with their UI :)
HilLiedTroopsDied@reddit
Ragbits could be containerized and used as a pipeline, or you could branch OWUI and include ragbits as a default "documents" engine. I believe, given ragbits' complexity, that if they merged your work to master they'd give you a license exclusion for production use.
_underlines_@reddit
Cool post. I wish we could share our code base too. But we can't.
We did a 1M USD RAG project for a government client in Switzerland and did very formal optimization over the last 2 years via a hypothesis-and-evaluation loop.
I wonder if others did the same and have some comparable results and insights.
What are your thoughts? Any insights to share on similar topics?
Loud_Picture_1877@reddit (OP)
Thanks!
I've had a common experience with all the advanced RAG techniques - cases where accuracy barely improved and the added complexity wasn't worth it.
Things that do seem to do the trick (and are still not very complex) for me: hybrid search with sparse embeddings (BM25 or SPLADE), query rephrasing / multi-query rephrasing with an LLM, and reranking. Apart from that I try to keep the chunks reasonably large and not split them in weird places.
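For the sparse half of hybrid search, BM25 fits in a few lines. A minimal sketch (a production setup would use the vector store's built-in BM25/SPLADE support instead):

```python
# Sketch: score documents against a query with plain BM25. Corpus is illustrative.
import math
from collections import Counter

docs = [
    "the pump must be inspected every six months",
    "replace the filter when pressure drops below 2 bar",
    "monthly inspection of valves is mandatory",
]
tokenized = [d.split() for d in docs]
avgdl = sum(len(t) for t in tokenized) / len(tokenized)
df = Counter(w for t in tokenized for w in set(t))  # document frequency per term
N = len(docs)

def bm25(query: str, k1: float = 1.5, b: float = 0.75) -> list[float]:
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        s = 0.0
        for w in query.split():
            if w not in tf:
                continue
            idf = math.log(1 + (N - df[w] + 0.5) / (df[w] + 0.5))
            s += idf * tf[w] * (k1 + 1) / (tf[w] + k1 * (1 - b + b * len(toks) / avgdl))
        scores.append(s)
    return scores

print(bm25("inspection every month"))  # rank docs by these scores, then fuse with dense results
```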
Same here - we started with custom implementations, then shared snippets between teams, and somehow ragbits was created :D Probably a lot of the value for us is that we can steer the framework roadmap based on the projects we're doing, but I hope somebody else will find it useful as well.
That's interesting! Good luck with the project, text2sql can be tricky.
I did something similar in the past with an abstraction layer and it worked quite well. Basically the LLM was doing function calling rather than SQL generation in our approach - it can work really well if you have a "finite" set of views / tables you want to support.
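A minimal sketch of that abstraction-layer idea, with hypothetical view names: the LLM only ever picks a whitelisted function and supplies arguments, so it never writes raw SQL:

```python
# Sketch: whitelisted, parameterized queries behind "tools". Parameters are
# bound by the DB driver, so the model can't inject arbitrary SQL.
TOOLS = {
    "monthly_report": "SELECT * FROM monthly_report WHERE month = %(month)s",
    "top_customers": "SELECT * FROM top_customers WHERE year = %(year)s LIMIT %(limit)s",
}

def render_query(name: str, params: dict) -> tuple[str, dict]:
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name], params  # feed both to cursor.execute(sql, params)

# With function calling, the model returns something like:
llm_choice = {"name": "monthly_report", "params": {"month": "2024-05"}}
print(render_query(llm_choice["name"], llm_choice["params"]))
```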
paranoidray@reddit
Thanks for teaching me about hybrid search!
I learned something new today!
Hertigan@reddit
Can you elaborate on Factory Patterns for additional search? Doing that right now
alexvazqueza@reddit
But Ragas is more oriented to NLU processing isn’t it? Not like a RAG framework
DavidTech66@reddit
Is the plan to stay open source?
Loud_Picture_1877@reddit (OP)
Yes! ragbits will stay open-source under MIT licence.
capitalizedtime@reddit
What do you use for your frontend on this and what are the core flows for the clients here?
Loud_Picture_1877@reddit (OP)
Hi! We have a React application for the frontend. Right now we're in the process of separating all the communication logic out of it into TypeScript packages (React hooks etc.) - to make it really easy to integrate ragbits with existing frontends.
The majority of clients want some sort of chat interface (either text or voice) - it's getting common to integrate it directly into their existing platforms, websites, or even desktop applications. That's why we treat our frontend as a great tool for early PoCs and then as a starting point to adapt to specific needs.
outthemirror@reddit
This post tells u rag based ChatGPT wrappers do not sell.
Loud_Picture_1877@reddit (OP)
I would say:
"rag based ChatGPT wrappers do not scale"
For a simple use-case or a PoC generic tool may be okay, buuut when your system grows you need to have much more granular control.
We've seen it even in ragbits on our projects - sometimes the default docling document parser we provide in ragbits was enough, but there were cases where we had to extend it to meet problem-specific needs
Acrobatic-Aerie-4468@reddit
Interesting package. Keep up the good work
Porespellar@reddit
A couple questions.
I love Open WebUI as my front end. How hard would it be to integrate this as a RAG pipeline?
What's your recommended chunking strategy for long-document use cases? (chunk size, chunk overlap, top k, embedding model, reranker, etc.)
Loud_Picture_1877@reddit (OP)
Open WebUI looks really good, I'll explore integrating it into ragbits for sure, thanks for the recommendation!
For chunking I recommend keeping chunks longer - it's more important to have full paragraphs / sections, even if they get big. You can summarize the chunks if needed. Modern models are quite good with bigger contexts, so this isn't such an important topic anymore (compared to 2 years ago, for example).
Top k is usually somewhere between 3-10.
Reranker? I recommend going with an LLM log-probs-based one as a starter - it doesn't require involving another component in the architecture (take a look here; there's also a sketch below)
Embedding model: usually something from the big players (OpenAI, Google), along with SPLADE for sparse embeddings. If you want self-hosted, I find the models available through FastEmbed good: https://github.com/qdrant/fastembed
These recommendations may vary from case to case - it's important to build an evaluation dataset for your retrieval and figure out which parameters are best for you :)
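A minimal sketch of that LLM log-prob reranker (shown with the OpenAI client; the model name is an assumption): ask a yes/no relevance question and use the probability the model assigns to "yes" as the score:

```python
# Sketch: rerank chunks by P("yes") to a relevance question.
import math
from openai import OpenAI

client = OpenAI()

def relevance_score(query: str, chunk: str) -> float:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Question: {query}\nPassage: {chunk}\n"
                       "Is the passage relevant to the question? Answer yes or no.",
        }],
        max_tokens=1,
        logprobs=True,
        top_logprobs=5,
    )
    # Look for "yes" among the top candidate first tokens.
    for cand in resp.choices[0].logprobs.content[0].top_logprobs:
        if cand.token.strip().lower() == "yes":
            return math.exp(cand.logprob)
    return 0.0

# chunks.sort(key=lambda c: relevance_score(query, c), reverse=True)
```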
Porespellar@reddit
Thanks for the response, I will look into FastEmbed!
Do you consider a chunk size of 2000 with a chunk overlap of 500 long enough for long-document use cases?
Loud_Picture_1877@reddit (OP)
Yes, 2000 should be enough! But I would also try to find a good stopping point between chunks (section / paragraph / sentence end) rather than fixating on the chunk size. Even if chunks end up smaller, that's okay. I just treat the chunk-size value as something to stay close to when merging / splitting chunks.
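A minimal sketch of that merge-at-natural-boundaries idea (plain Python, not the ragbits chunker): aim for the target size but never cut mid-paragraph:

```python
# Sketch: merge paragraphs up to ~target characters, breaking only at
# paragraph boundaries.
def chunk(text: str, target: int = 2000) -> list[str]:
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for p in paragraphs:
        if current and len(current) + len(p) > target:
            chunks.append(current)   # close the chunk at a clean boundary
            current = p
        else:
            current = f"{current}\n\n{p}" if current else p
    if current:
        chunks.append(current)
    return chunks

doc = ("Section 1. Safety first." + "\n\n") * 3 + "Section 2. Maintenance schedule."
print(len(chunk(doc, target=60)), "chunks")
```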
LienniTa@reddit
sooo how do you handle table extraction? just with visual llms?
Loud_Picture_1877@reddit (OP)
It really depends on the particular dataset.
Overall, I'd say that I'm happy with how docling handles tables - and I would start with that. On one project we ran into quite an odd format for tables, but it was very repeatable across documents, so a custom parser was the best choice (https://ragbits.deepsense.ai/how-to/document_search/ingest-documents/#parsing-documents).
For tables, I would treat multi-modal LLMs as a last resort, especially when you have a lot of numbers - they can be error-prone, but on the other hand they can handle almost anything you throw at them :D
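For reference, the docling starting point is tiny (assuming `pip install docling`; check the docling docs for the table export options in your version). Markdown export preserves table structure, which already embeds reasonably well:

```python
# Sketch: convert a PDF with docling and export to markdown, tables included.
from docling.document_converter import DocumentConverter

result = DocumentConverter().convert("report.pdf")
print(result.document.export_to_markdown())
```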
Amgadoz@reddit
How does docling handle tables?
Glxblt76@reddit
Came to the same conclusion with RAG. Fancy OCR should always be the last resort fallback when everything else fails.
LienniTa@reddit
did you try some sort of automatic cross-verification, where the table is extracted using different methods and the results are compared? so that a human is only needed when the deviation between methods is too high? i only tried it for translation
Loud_Picture_1877@reddit (OP)
That seems like a really good idea! I haven't tried anything like that - usually I was relying either on retrieval or e2e evaluation (like here)
How did it work for you in your translation use-case?
LienniTa@reddit
the task was to translate goods descriptions from english to japanese (and later to chinese). i chunked the descriptions and compared translations made by the google translate api, the chatgpt api, and meta's NLLB. then i sorted by score, so i only had to check suspicious translations. i also calculated the same score between the initial chunks and an english-japanese-english round-trip translation, but that's probably not gonna work for table extraction xD
un_passant@reddit
Do your chunks of retrieved context have IDs, and can one make the LLM cite the chunks used to generate specific sentences (sourced / grounded RAG)?
Loud_Picture_1877@reddit (OP)
Yes! We have IDs and full metadata objects for every chunk (source document, location, etc). You can access this information at any time and build a prompt with it to add citations to the responses :)
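A framework-agnostic sketch of how those IDs end up as citations (not the ragbits API): number the chunks in the prompt and map the numbers back to sources when rendering the answer:

```python
# Sketch: grounded RAG prompt with citable, numbered chunks. Data is illustrative.
chunks = [
    {"id": "doc1#p3", "text": "The warranty covers 24 months."},
    {"id": "doc2#p1", "text": "Claims must be filed within 30 days."},
]

context = "\n".join(f"[{i + 1}] ({c['id']}) {c['text']}" for i, c in enumerate(chunks))
prompt = (
    f"Context:\n{context}\n\n"
    "Answer the question using only the context. "
    "Cite sources as [n] after each sentence.\n"
    "Question: How long is the warranty?"
)
print(prompt)  # map each [n] back to its chunk id when rendering the answer
```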
un_passant@reddit
Great !
I'll be sure to check this out.
Do you have anything to use an LLM as a judge to assess the sourced responses ?
Also, have you tried prompt compression for instance with [LLMLingua](https://llmlingua.com/llmlingua2.html) ?
Thx !
DunklerErpel@reddit
What is your take on Graph-RAG or Light-RAG?
Loud_Picture_1877@reddit (OP)
For the cases we encountered, the additional complexity of extracting and storing entity relations wasn't justified by the potential gains - hybrid search with dense and sparse vectors was good enough. But I'm more than sure that along the way we'll add some sort of graph capabilities to ragbits - we just need a good real-world use-case.
Tomr750@reddit
searching journal articles through their references?
Cheap_Concert168no@reddit
Suspiciously AI generated post. But thanks for open sourcing it. v useful
Loud_Picture_1877@reddit (OP)
I'm much better at coding than writing :) The overall idea for the post is mine, but the wording was done with the help of GPT.
Glad you find this useful!
the_jends@reddit
I'm new to RAG, although I find it very interesting. Since you are working with actual documents, how often do you find the AI hallucinating or misrepresenting the contents of a document? Do you need to give disclaimers to lay users that the LLM may do that from time to time?
Loud_Picture_1877@reddit (OP)
Hi, hope that you will enjoy your RAG journey :D
My key takeaways with RAG hallucinations are:
* make sure to link sources in the final response - then the user can always double-check if needed
* Rerankers are quite good at determining whether chunks returned from the vector DB are actually relevant (I recommend an LLM-based reranker)
* If you haven't found relevant chunks - don't answer! This is the point where LLMs start to get too creative (a minimal version of this guardrail is sketched below)
* Make sure that you have good evaluation for retrieval - it's much easier to evaluate retrieval than the e2e pipeline, and that's where you can most easily improve overall app quality
* Gather user feedback - in ragbits we have a thumbs up/down system. That allows us to catch errors quickly.
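A minimal version of the "don't answer" guardrail from the list above (the threshold is something you tune on your eval set):

```python
# Sketch: refuse when no retrieved chunk clears the relevance threshold.
REFUSAL = "I couldn't find this in the documentation."

def generate_answer(query: str, chunks: list[str]) -> str:  # stub for the sketch
    return f"Answering '{query}' from {len(chunks)} chunks."

def answer_or_refuse(query: str, scored_chunks: list[tuple[float, str]],
                     threshold: float = 0.5) -> str:
    relevant = [c for score, c in scored_chunks if score >= threshold]
    if not relevant:
        return REFUSAL  # better a refusal than a creative hallucination
    return generate_answer(query, relevant)

print(answer_or_refuse("pump pressure?", [(0.2, "irrelevant chunk")]))
```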
keepthepace@reddit
underrated answer!
somehowchris@reddit
What was your biggest scale of docs, and did you use the same setup as ragbits? Currently building an open-source legal RAG/search, and small countries have like billions of PDF pages for general law, so I'm facing some design choices I'm not sure about
indicava@reddit
So a couple of questions (not necessarily related to your library, but cool work, and thanks for open sourcing!):
Have you had any experience with RAG projects over codebases, not only text/formatted data? How did you tackle those? Code is a whole different challenge than text.
Have you encountered a situation where RAG (or any other LLM augmentation method) was just not good enough and you had/wanted to fine-tune a model to meet the business requirements?
Loud_Picture_1877@reddit (OP)
When it comes to the RAG vs fine-tune question: I tend to avoid fine-tuning because it's hard to explain the results, and it requires fine-tuning again on almost every data source update
indicava@reddit
True, fine tuning is not sustainable for continuously updating data.
The Ada project sounds really cool!
I have another question but I appreciate these are commercial projects so I’ll totally understand if you won’t elaborate.
When fine-tuning for Ada code completion/FIM, how did you run your evals to check that the fine-tuned model was outputting legit Ada code?
Loud_Picture_1877@reddit (OP)
Hi! We did one project in the past which was an Ada language copilot. Basically we had to fine-tune a model on enough Ada snippets to make it good :) Here is a one-pager for this project: https://deepsense.ai/case-studies/ai-copilots-impact-on-productivity-in-revolutionizing-ada-language-development/
Other examples of non-text projects we had involved a lot of images, graphs, and heatmaps - we used a multi-modal LLM to reason about them / generate descriptions for embeddings. That approach is available in ragbits with ImageElementEnricher
Impulse33@reddit
Any plans for a llms.txt compilation of the Ragbits documentation for easier loading into context for inference?
IntrepidAbroad@reddit
Nice, thanks for sharing and making open source. I get that sense of frustration/drive that made you do it, because historically all of my software engineering work has been closed source and so I've had to re-create the same over-and-over again. Aiming to follow your lead with my next project and will take a look at potentially using this too.
Loud_Picture_1877@reddit (OP)
Thanks! Good luck with your projects!
If you decide to try ragbits - hit me here - I'll be more than happy to help :))
Ill_Yam_9994@reddit
How does the RAG chunking and search work? The problem I've had at my work trying to build simple RAG solutions is that they will only pull out a sentence or two with no other context, so they often provide irrelevant information.
Does this have support for any more advanced logic for that such as contextual retrieval where the LLM does a pass over each chunk/document and adds context, or graph retrieval? How about filtering documents to attempt to retrieve from based on some LLM logic?
vosegus91@reddit
Is it any good for research? Or just for creating products etc
Loud_Picture_1877@reddit (OP)
We use it for research, but probably because we're really familiar with it :D Ragbits is production / product oriented - best when you need to build an e2e stack with UI, APIs, monitoring, etc.
mzbacd@reddit
Thank you for sharing. I am also developing a native macOS rag app and would really learn a lot from your project and experience.
Impulse33@reddit
Do you use any LLM tools in your own workflow? How much of the ragbits codebase is generated code?
I've been vibe coding some RAG systems and really appreciate the in-depth documentation. Looking into Reciprocal Rank Fusion now instead of manual classification of prompt categories. My main goal is identifying security related prompts and directing to a separate, less chunked, "protected" index. Would Ragbit's hybrid approach of reciprocal rank fusion work well for that use case?
Loud_Picture_1877@reddit (OP)
Some of the team members use Cursor, buuuut we do proper code reviews, quality checks, etc. So the code is definitely not vibe-coded :D
Yes, you can definitely use ragbits to have separate indexes - we even have RRF implemented to mix the results later: https://ragbits.deepsense.ai/how-to/vector_stores/hybrid/#specifying-the-retrieval-strategy-for-a-hybrid-vector-store
Another approach would be to create specific `Elements` for the different categories; I've described this concept here: https://www.reddit.com/r/LocalLLaMA/comments/1l352wk/comment/mvyiwr3/
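For reference, RRF itself is tiny: each ranked list contributes 1 / (k + rank) per document, with k=60 as the usual default. A minimal sketch (not the ragbits implementation):

```python
# Sketch: Reciprocal Rank Fusion over any number of ranked result lists.
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["c2", "c5", "c1"]    # e.g. results from the dense index
sparse = ["c5", "c3", "c2"]   # e.g. results from the BM25 / "protected" index
print(rrf([dense, sparse]))   # documents ranked high in both lists float to the top
```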
Latter_Wind4390@reddit
Good stuff, really excited to dive into the code later! One quick question, how do you evaluate performance on your projects?
I’ve built a few systems like this myself and usually have a test set of (question, answer, chunks) triples that I run some metrics on (precision/recall of chunks, answer/response similarity). But generating a good test set for hundreds of documents is tough.
I collect user feedback but most users don’t bother to leave any.
Loud_Picture_1877@reddit (OP)
u/Latter_Wind4390 thanks! hit me in case of any questions :))
We have evaluation included in the ragbits-evaluate package. Usually we evaluate projects at 2 levels: retrieval and e2e. For retrieval we have metrics like:
- Context Precision, Recall, F1 (rank-unaware)
- Average Precision, Reciprocal Rank, NDCG (rank-aware)
For e2e, LLM-as-a-judge is usually a good choice.
There are some examples of how to do it in our repo: https://github.com/deepsense-ai/ragbits/tree/main/examples/evaluation/document-search
My colleague is working on evaluation quickstart - I'll make sure to post it here when it's published.
parabellum630@reddit
Do you support local faiss indexes? A lot of libraries I have seen just use 3rd-party commercial vector stores like Pinecone.
Loud_Picture_1877@reddit (OP)
Not yet - usually faiss stores aren't sufficient for us, as we need to access the vector DB in a client-server manner.
Here is the list of vector stores that we support: https://ragbits.deepsense.ai/api_reference/core/vector-stores/
I usually recommend people go with either Qdrant or pgvector - you can run both for free as a Docker container :)
Feel free to raise an issue for a faiss store - if it gets traction, we'll be happy to support it
BrilliantArmadillo64@reddit
How does ragbits compare to LlamaIndex?
Loud_Picture_1877@reddit (OP)
Ragbits is a more end-to-end solution for building production-ready, tailored chatbots, LLM workflows, and agentic apps. We focus on accelerating project development, making some parts more opinionated than in LlamaIndex. For instance, things like a consistent interface for LLMs/vector stores, exposing FastAPI endpoints, user interfaces, or OpenTelemetry/Grafana monitoring are features you'll find in Ragbits.
That said, LlamaIndex can be a great complementary library to use alongside Ragbits - for example, to leverage its data extractors or tools :)
de4dee@reddit
Is this a good tool for an education app that also has AI avatar features?
An AI avatar reads the course material and presents it to the user at the user's level of understanding or age. If the user is a kid, it talks differently. The course material stays the same, but the presentation is different thanks to AI.
Loud_Picture_1877@reddit (OP)
Yes! The ragbits-core features may be helpful for you - like managing prompts, connecting to LLMs, and observability. Or you can use ragbits-document-search for querying the course materials with RAG techniques.
I'll be happy to help in case of any trouble!
mayesa@reddit
I’m attempting to extract relevant information from unstructured data, such as PDFs or Word files, to expedite the process of filling out a web form.
Loud_Picture_1877@reddit (OP)
Great! Either a project generated by `uvx create-ragbits-app` or the snippets in our README should do the job! If you have your files on your local disk, you can use LocalFileSource.
More about different document sources here: https://ragbits.deepsense.ai/how-to/sources/load-dataset/
cuckfoders@reddit
Perhaps more of a general question: how would you go about personalized AI assistants - say your own Alexa or Siri at home, but actually decent and able to hold a conversation? How would you curate, store, and retrieve the data? Perhaps I'm overcomplicating this by making different buckets and trying to separate out facts from memories etc. And I guess, how would I use ragbits to accelerate that 😊
Loud_Picture_1877@reddit (OP)
Hi u/cuckfoders, interesting idea!
In ragbits we have something called `Element` - it's basically a type of information that you can store in our knowledge database. The default elements are TextElement and ImageElement, but you can create custom types. In your case it would make sense to create FactElement, MemoryElement, etc. Then you can use a custom `where` query when searching to fetch only the things you want, or treat extracted elements differently after retrieval based on their type.
Here are related docs:
https://ragbits.deepsense.ai/how-to/document_search/ingest-documents/#how-to-ingest-documents
https://ragbits.deepsense.ai/api_reference/document_search/documents/elements/
https://ragbits.deepsense.ai/how-to/document_search/search-documents/#limit-results-with-metadata-based-filtering
Let me know in case of any questions :))
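A framework-agnostic sketch of that typed-element idea (FactElement / MemoryElement here are plain hypothetical dataclasses, not the real ragbits Element API - see the docs above for that):

```python
# Sketch: typed knowledge elements with type-based filtering at query time.
from dataclasses import dataclass

@dataclass
class Element:
    element_type: str  # e.g. "fact" or "memory"
    content: str

store = [
    Element("fact", "User's birthday is May 3rd"),
    Element("memory", "Last week we talked about jazz"),
]

def search(query: str, element_type: str | None = None) -> list[Element]:
    candidates = [e for e in store if element_type is None or e.element_type == element_type]
    # Real retrieval would rank `candidates` by embedding similarity to `query`;
    # keyword matching stands in here so the sketch runs without services.
    return [e for e in candidates if any(w in e.content.lower() for w in query.lower().split())]

print(search("birthday", element_type="fact"))
```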
Swoopley@reddit
How easy would it be to integrate the RAG pipeline into open-webui? Have you done so before? It's the most used UI for companies running LLMs internally
Loud_Picture_1877@reddit (OP)
Hi u/Swoopley! It seems like a very cool idea to integrate with open-webui, I'll add it to our backlog.
We haven't used it yet - primarily because most of the time we build custom UIs integrated into already-existing systems. Until now our focus was on creating basic React components and a UI for testing - having that, we can easily copy the code into a new project and adapt it to specific needs.
productboy@reddit
Re: “already existing systems”; are your customers mostly using web applications that your team integrates ragbits into? Also, does your team integrate ragbits into enterprise software; i.e. Salesforce, Workday, SAP…?
Loud_Picture_1877@reddit (OP)
Yeah, mostly existing web apps, but we've also integrated it into one desktop application for Windows and into a Microsoft Office plugin (Word etc.). For enterprise stuff, we've just finished an agentic project with Workday :)
productboy@reddit
Awesome… link to repo?
Loud_Picture_1877@reddit (OP)
https://github.com/deepsense-ai/ragbits here is the framework that we used.
The mentioned projects were commercial, so the code is not public.