Is there an LLM that can assimilate an entire codebase for chatting?
Posted by abbondanzio@reddit | LocalLLaMA | 15 comments
I have a large TypeScript project (a monorepo) and currently use claude-dev to ask questions about my codebase or make changes.
Is there any way to feed my entire codebase to an LLM so that I can chat with it? I had thought of something that indexes the whole codebase and creates embeddings for it.
Claude Projects accepts very few files, the continue.dev API often fails me if I select the whole codebase, and in general both feel too limited.
I would prefer it to be open source. What is the state of the art?
Spirited_Example_341@reddit
resistance is futile
BatsChimera@reddit
Concatenate everything into a single .txt and upload it to NotebookLM; it works for me. I have a script that tags every line with its source file (a "from x.txt" prefix, or whatever the extension is).
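Something like this, roughly (a sketch in Node/TypeScript; the extension filter and the output file name are just placeholders, adapt them to your repo):

```typescript
// Concatenate a repo into one .txt for NotebookLM, prefixing each line
// with the file it came from. Extensions and output name are placeholders.
import { readdirSync, readFileSync, statSync, writeFileSync } from "fs";
import { join, extname } from "path";

const INCLUDE = new Set([".ts", ".tsx", ".js", ".json", ".md"]);

function walk(dir: string, files: string[] = []): string[] {
  for (const entry of readdirSync(dir)) {
    if (entry === "node_modules" || entry === ".git") continue;
    const full = join(dir, entry);
    if (statSync(full).isDirectory()) walk(full, files);
    else if (INCLUDE.has(extname(entry))) files.push(full);
  }
  return files;
}

const lines: string[] = [];
for (const file of walk(process.cwd())) {
  for (const line of readFileSync(file, "utf8").split("\n")) {
    lines.push(`(from ${file}) ${line}`);
  }
}
writeFileSync("codebase.txt", lines.join("\n"));
```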
shaman-warrior@reddit
How large is your codebase?
IdeaEchoChamber@reddit
Try taking a look at the Gemini APIs; they have a very large context window. Or just use Cursor.
guyomes@reddit
[Aider](https://aider.chat) is a nice open source alternative. It handles the project context by feeding a [repository map](https://aider.chat/docs/repomap.html) to the LLM, and it adds specific files to the chat context when they are needed to answer a question. Moreover, it can be connected to all the LLMs supported by [LiteLLM](https://github.com/BerriAI/litellm), including local models.
GradatimRecovery@reddit
Prove your concept on NotebookLM before building your own RAG
PermanentLiminality@reddit
The default context on continue.dev is only 8k. Try turning it up, and make sure the model or your local LLM backend can actually support the larger context.
asankhs@reddit
To fit the entire codebase into the prompt, the model needs a context length larger than the full repo; except for fairly small projects, that won't be the case. If your repo is small enough, you can try using one of the Gemini models. We do that at https://docs.codes/, where we take entire library sources and generate documentation from them using Gemini 1.5 Flash.
If the repo is larger, you have two options. One is to embed the repo in a vector store using one of the embedding models and then chat with it. We have an implementation of that which you can try for free at https://www.patched.codes/; we support multiple repos, and even document stores, that you can query and chat with from a single interface.
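Conceptually, the embed-and-retrieve flow looks something like the sketch below (just the general idea, not the patched.codes implementation; it assumes the OpenAI Node SDK and the text-embedding-3-small model):

```typescript
// Rough sketch of embed-and-retrieve over code chunks.
import OpenAI from "openai";

const client = new OpenAI();

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// chunks: pieces of source files, e.g. one function or ~50 lines each
async function buildIndex(chunks: string[]) {
  const res = await client.embeddings.create({
    model: "text-embedding-3-small",
    input: chunks,
  });
  return res.data.map((d, i) => ({ text: chunks[i], vector: d.embedding }));
}

async function topK(
  index: { text: string; vector: number[] }[],
  query: string,
  k = 5,
) {
  const q = (await client.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  })).data[0].embedding;
  return index
    .map((e) => ({ ...e, score: cosine(q, e.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
// Feed the top-k chunks to the chat model as context for the question.
```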
The other option is to use a short-term memory and let the LLM access it during inference/chat. I recently implemented short-term memory via a memory plugin in optillm (https://github.com/codelion/optillm), which is an open-source optimizing inference proxy. Using memory, we got some very good results on large-context benchmarks like FRAMES (https://huggingface.co/datasets/google/frames-benchmark). The benchmark doesn't include code retrieval though, so YMMV.
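Since optillm exposes an OpenAI-compatible endpoint, you just point a normal client at the proxy. A rough sketch; the port and the model-name prefix below are illustrative, see the optillm README for the exact way to enable the memory plugin:

```typescript
// Point an OpenAI-compatible client at a local optillm proxy.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:8000/v1", // assumed local optillm endpoint
  apiKey: process.env.OPENAI_API_KEY,
});

const reply = await client.chat.completions.create({
  model: "memory-gpt-4o-mini", // illustrative slug for the memory plugin
  messages: [
    { role: "system", content: "You answer questions about the attached codebase." },
    { role: "user", content: "Where is the auth middleware configured?" },
  ],
});
console.log(reply.choices[0].message.content);
```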
tallesl@reddit
Embedding source code files for later retrieval with natural language can be tricky: human language does not quite match programming language. I saw a presentation by Codeium where they mitigate this problem by generating a human-like description of the code (with an LLM, of course) and adding it alongside.
Not a local solution, but a quick way to get this is to put the entire codebase in the context and then chat with it. Gemini's huge context size shines here; combine it with how cheap the Flash model is and you may have a winning combination.
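If you go the full-context route, it's only a few lines with the @google/generative-ai Node SDK (a minimal sketch; the model name and prompt framing are placeholders):

```typescript
// Stuff the whole (concatenated) codebase into Gemini's context and ask away.
import { GoogleGenerativeAI } from "@google/generative-ai";
import { readFileSync } from "fs";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

// e.g. the codebase.txt produced by the concatenation script above
const codebase = readFileSync("codebase.txt", "utf8");

const result = await model.generateContent([
  "You are answering questions about the following TypeScript monorepo.",
  codebase,
  "Question: where is the API client configured?",
]);
console.log(result.response.text());
```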
justinrlloyd@reddit
Check out the Seagoat project on GitHub.
Everlier@reddit
continue.dev is among the best of what you can get in open source.
If your codebase is still relatively small, use a tool like repopack to pack it into a single file for full in-context editing.
how_now_brown_cow@reddit
This, plus an Ollama backend with Qwen2.5 or DeepSeek if your GPU can handle it. I use Continue daily.
caphohotain@reddit
I find Cursor's @codebase is very good!
Mundane_Ad8936@reddit
Gemini can do 2M tokens. The main issue is that you can't expect any model to pay attention to all of the code; it will miss things. Your prompting tactics are key.
DangKilla@reddit
I don't think this area is fully solved yet. I've seen papers trying to solve this. Large contexts were the simplest attempt but hallucinations occur. RAG doesn't fully solve it. Synthetic data might work, not sure.
The current paradigm is to load a file or two and work on them that way. Try aider from the shell, or the Cursor IDE.