Local AI coding assistant that runs fully offline (Gemma 4, codebase-aware)
Posted by andres_garrido@reddit | LocalLLaMA | 11 comments
I’ve been experimenting with running a local coding assistant on Gemma 4 26B, focused on understanding full codebases instead of single-file prompts.
Main idea:
- build a project map (files, symbols, structure)
- run a planning step to decide which files matter
- then retrieve full files + semantic chunks before answering
Goal is to avoid the usual “chat with files” limitation and make it reason about structure first.
Runs fully local (llama.cpp, GGUF), no network calls during inference.
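The project-map step above can be sketched roughly like this: parse each file and record its top-level symbols and imports. This is a minimal illustration using Python's `ast` module; the names (`build_project_map`, `FileInfo`) and the path-keyed input format are my own assumptions, not the OP's actual implementation.

```python
# Illustrative sketch of the "project map" step: parse each Python file
# and record its top-level symbols and imports. Names here are assumed,
# not taken from the OP's tool.
import ast
from dataclasses import dataclass, field

@dataclass
class FileInfo:
    path: str
    symbols: list = field(default_factory=list)   # top-level defs/classes
    imports: list = field(default_factory=list)   # imported module names

def build_project_map(sources: dict) -> dict:
    """sources maps file path -> source text; returns path -> FileInfo."""
    project = {}
    for path, text in sources.items():
        info = FileInfo(path=path)
        for node in ast.parse(text).body:
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                                 ast.ClassDef)):
                info.symbols.append(node.name)
            elif isinstance(node, ast.Import):
                info.imports.extend(a.name for a in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                info.imports.append(node.module)
        project[path] = info
    return project

repo = {
    "app.py": "import util\n\ndef main():\n    return util.helper()\n",
    "util.py": "def helper():\n    return 42\n",
}
pmap = build_project_map(repo)
print(pmap["app.py"].symbols)   # ['main']
print(pmap["app.py"].imports)   # ['util']
```

A real version would walk the directory tree and use a multi-language parser, but the shape of the map (file → symbols + imports) is the part that matters for the planning step.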
Curious if others here are doing something similar or handling codebase reasoning differently with local models.
andres_garrido@reddit (OP)
One thing I didn’t expect: even when the model has enough context, answers degrade a lot if it picks slightly wrong files or misses part of the dependency chain.
It feels less like a “model capability” problem and more like a “context selection” problem.
benevbright@reddit
You can try https://www.npmjs.com/package/ai-agent-test with your local llm
andres_garrido@reddit (OP)
Appreciate it. I’m less focused on agent style task execution for now and more on improving codebase understanding first.
The main problem I’m trying to solve is file selection and retrieval quality before the model answers. Once that part is solid, testing agent workflows on top makes more sense.
benevbright@reddit
Ok, but an agent is also very good for understanding a codebase.
andres_garrido@reddit (OP)
Yeah that’s fair, agents can definitely help, especially for exploring or iterating across files.
The issue I’ve seen is they still depend a lot on what context they pick at each step. If retrieval isn’t solid, the agent just amplifies the same problem across multiple steps.
That’s why I’m trying to get the selection/retrieval part more reliable first, then layer agent behavior on top.
benevbright@reddit
I don't know why my reply wasn't applied, but yeah, I see what you mean. I think it's important and an area with room to improve.
benevbright@reddit
I see. Yeah, right. Things like "find references" on a variable, which have always been in editors for human use, may not be available to the model or used in the same way. Agreed.
synw_@reddit
How do you do this? I'm looking for something like this that could deliver a lightweight, condensed knowledge map of a codebase, even a big one. I've seen some libraries that attempt this but haven't found anything that does the job well yet. What would you guys recommend?
andres_garrido@reddit (OP)
I’m doing it in layers instead of trying to summarize the whole repo at once.
First I build a structural map from the codebase: files, symbols, imports, references, directory layout. Then I use that map in a planning step to decide which files are likely relevant for the question. After that I retrieve a mix of full files and smaller semantic chunks.
What helped me most was stopping the model from seeing the repo as plain text. Once it has some structure first, retrieval gets much better, especially on bigger projects.
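The planning/selection layer could look something like this: seed the selection with files whose symbols relate to the question, then walk the import graph so the dependency chain isn't missed. The keyword matching below is a stand-in for the OP's LLM planning step, and the map format and names are assumptions for illustration.

```python
# Sketch of the file-selection step: seed with files whose symbols appear
# in the question, then follow imports so dependencies come along too.
# The keyword seeding is a stand-in for an LLM planning step.
def select_files(project_map, question, max_files=10):
    words = set(question.lower().replace("?", "").split())
    selected = [p for p, info in project_map.items()
                if any(s.lower() in words for s in info["symbols"])]
    frontier = list(selected)
    while frontier and len(selected) < max_files:
        for mod in project_map[frontier.pop()]["imports"]:
            dep = mod + ".py"                # naive module -> path mapping
            if dep in project_map and dep not in selected:
                selected.append(dep)
                frontier.append(dep)
    return selected

project_map = {
    "app.py":       {"symbols": ["main"],   "imports": ["util"]},
    "util.py":      {"symbols": ["helper"], "imports": ["config"]},
    "config.py":    {"symbols": ["load"],   "imports": []},
    "unrelated.py": {"symbols": ["misc"],   "imports": []},
}
print(select_files(project_map, "Why does main fail?"))
# ['app.py', 'util.py', 'config.py']
```

The point of the transitive walk is exactly the failure mode mentioned earlier in the thread: a seed file alone isn't enough if part of its dependency chain is left out of context.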
I haven’t found a single library that solves it cleanly end to end yet. Most things I’ve seen are either too embedding-heavy or too shallow structurally.
total-context64k@reddit
Are you using repomap for this? The code intelligence and file operations tools in my coding harness might be useful to you.
andres_garrido@reddit (OP)
I’m not using repomap directly right now, but yeah, that’s pretty close to what I mean.
What I’ve been trying to figure out is what happens after the map is built. Just having it helps, but the real difference for me is using that structure to decide which files to pull before the model answers.
I’ll check your harness out, especially the code intelligence part.