Best setup for coding with agents hosting a local model

Posted by Mario__10@reddit | LocalLLaMA | View on Reddit | 6 comments

I’ve been experimenting with local LLMs for coding and I’m trying to understand what setups people are actually using in practice. My workflow has mostly been VSCode + GitHub Copilot, which works great because the chat can modify files, apply edits, and interact with the project directly. I’m happy with that since I mostly write all my code myself, but I want to try hosting something on my own PC, like the Gemma and Qwen models. My PC has a 4070 Super and 32GB of RAM.
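For context on what actually fits on a 4070 Super (12 GB VRAM), here's a rough back-of-the-envelope sketch. The ~0.5 bytes per parameter figure is an assumption for Q4-style quantization, the overhead number is a guess, and the model tags are just illustrative examples, not recommendations:

```python
# Rough rule of thumb (an assumption, not a benchmark): a Q4-quantized model
# needs about 0.5 bytes per parameter for weights, plus some overhead for the
# KV cache and runtime. A 4070 Super has 12 GB of VRAM.

def approx_vram_gb(params_billion: float, bytes_per_param: float = 0.5,
                   overhead_gb: float = 1.5) -> float:
    """Very rough estimate of VRAM needed for a quantized model, in GB."""
    return params_billion * bytes_per_param + overhead_gb

VRAM_GB = 12  # RTX 4070 Super

# Example model sizes (illustrative tags only).
for name, size_b in [("qwen2.5-coder:7b", 7),
                     ("qwen2.5-coder:14b", 14),
                     ("gemma-27b", 27)]:
    need = approx_vram_gb(size_b)
    fit = "fits in VRAM" if need <= VRAM_GB else "spills over into system RAM"
    print(f"{name}: ~{need:.1f} GB -> {fit}")
```

The takeaway under these assumptions: 7B and 14B coder models should run fully on the GPU, while 27B-class models would be partially offloaded to your 32GB of RAM and run much slower.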

Recently I started looking into running models locally (Ollama, basically). I found that you can technically connect local models to some VSCode extensions, but the experience feels very different.
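In case it helps anyone debugging the same setup: under the hood these extensions just talk HTTP to the local server, and Ollama exposes an OpenAI-compatible endpoint at `http://localhost:11434/v1`. A minimal sketch of that call (the model tag is just an example of something you'd have pulled):

```python
# Minimal sketch of talking to a local Ollama server through its
# OpenAI-compatible chat endpoint. Assumes `ollama serve` is running and
# a model like "qwen2.5-coder:7b" (example tag) has been pulled.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat payload that Ollama accepts."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

payload = build_chat_request("qwen2.5-coder:7b", "Write a Python hello world.")

try:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        reply = json.load(resp)
        print(reply["choices"][0]["message"]["content"])
except OSError as e:
    # Server not running -- the payload above is still the shape that matters.
    print("Ollama not reachable:", e)
```

Any tool that speaks the OpenAI API can be pointed at that base URL, which is why so many editor extensions can use local models at all.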

I tried hosting a model locally and adding it to GitHub Copilot as the generator, but it only works in something like an “ask mode”: it doesn’t actually edit files, apply patches, or run commands in the project the way Copilot does with its API models.
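For anyone else confused by the gap: in “ask mode” the model only returns text, while agent mode has a loop around the model that parses proposed edits out of the reply and applies them to files. A toy sketch of that “apply the edit” half (the search/replace format here is made up for illustration, not any extension’s real protocol):

```python
# Toy illustration of the agent-side step that ask-mode lacks: take a
# model-proposed search/replace edit and apply it to a file on disk.
# The edit format is hypothetical, just to show the mechanics.
from pathlib import Path

def apply_edit(path: Path, search: str, replace: str) -> bool:
    """Replace the first occurrence of `search` in the file; return success."""
    text = path.read_text()
    if search not in text:
        return False  # a real agent would re-prompt the model here
    path.write_text(text.replace(search, replace, 1))
    return True

# Demo on a throwaway file.
f = Path("demo_agent_edit.py")
f.write_text("def greet():\n    print('helo')\n")
ok = apply_edit(f, "print('helo')", "print('hello')")
print(ok, f.read_text())
f.unlink()  # clean up the demo file
```

Real agents add a lot on top (running tests, retrying failed matches, asking for confirmation), but this is the basic capability that plain chat integrations don’t have.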

So I’m curious what people are doing for real local coding workflows.

Part of the problem is that I’m not used to “Claude Code” and similar terminal tools yet. What would you recommend?