Best platform-agnostic tools/frameworks to vectorize large wikis (not Wikipedia) for RAG?

Posted by Mgeek35@reddit | LocalLLaMA

Hi folks,

I'm working with an LLM company focused on a specialized business use case. Since most LLMs were not trained on our business data, we are scraping the wikis in our line of business and trying to build a vector database from them to use in our RAG pipeline. We want this database to be usable regardless of the RAG framework. One problem I found with things like LlamaIndex (please correct me if I'm wrong) is that they store the data in special objects that are not really usable or transferable outside LlamaIndex.
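
To make "usable regardless of the RAG framework" concrete, here is a minimal sketch of the kind of portable layout I have in mind: each wiki chunk becomes one plain JSON record (id, source URL, text, embedding as a list of floats), written to a JSONL file that any vector store or framework could ingest. Everything named here is an assumption for illustration: `chunk_text`, `toy_embed`, `export_jsonl`, `load_jsonl`, `search`, the record fields, and the `toy-hash-256` label are hypothetical, and `toy_embed` is only a hashed bag-of-words stand-in for a real embedding model.

```python
import hashlib
import json
import math


def chunk_text(text, max_words=200, overlap=20):
    """Split text into overlapping windows of at most max_words words."""
    words = text.split()
    step = max_words - overlap
    chunks, start = [], 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
        start += step
    return chunks


def toy_embed(text, dim=256):
    """Placeholder embedding (hashed bag of words, L2-normalized).

    Assumption: in practice this would be swapped for a real embedding model.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def export_jsonl(pages, path, embed=toy_embed, model_name="toy-hash-256"):
    """Write one framework-neutral JSON record per chunk; return the count."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for page in pages:
            for i, chunk in enumerate(chunk_text(page["text"])):
                record = {
                    "id": f"{page['url']}#chunk-{i}",
                    "source_url": page["url"],
                    "title": page["title"],
                    "chunk_index": i,
                    "text": chunk,
                    "embedding": embed(chunk),
                    "embedding_model": model_name,  # needed to embed queries the same way
                }
                f.write(json.dumps(record) + "\n")
                count += 1
    return count


def load_jsonl(path):
    """Read the records back as plain dicts, no framework objects involved."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def search(records, query, embed=toy_embed, top_k=3):
    """Brute-force cosine search (vectors are unit length, so dot product suffices)."""
    q = embed(query)
    scored = sorted(
        records,
        key=lambda r: sum(a * b for a, b in zip(q, r["embedding"])),
        reverse=True,
    )
    return scored[:top_k]
```

The idea is that the JSONL file is the source of truth, and loading it into a particular vector store or framework becomes a thin import step rather than a lock-in. Recording the embedding model name with each record matters because queries have to be embedded with the same model as the stored chunks.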