Current local models that work well as coding agents
Posted by yehiaserag@reddit | LocalLLaMA | View on Reddit | 7 comments
So I've been using Copilot and Windsurf for work, and they do actually help. Since I use local LLMs a lot, I was facing the question:
Can any local model rival the tools I'm currently paying for in terms of quality?
I have a 3080 Ti with 12 GB of VRAM and 128 GB of DDR4 RAM.
Anything that I can use locally?
abnormal_human@reddit
Nothing even remotely close. That's a 4.5yo gaming GPU. Maybe with four RTX6000s you could run a model large enough to play in the space, but it won't be Claude Code or Codex level at all.
I do a ton of AI enablement work for a decent sized software development org. I've seen all the tools and how they play out in actual people's hands. Codex and Claude Code are the ones worth using. Windsurf is 6mos behind and struggles for hours with tasks that Claude Code can one-shot in a few minutes. We did an in-depth eval, talked to their team, etc. Not good enough. Cursor is maybe 3mos behind. Copilot is worth turning on as a creature comfort.
I think we are also outgrowing the IDE + single-agent model. I've shifted away from using a single "agentic" tool towards hands-on managing small teams of agents working in parallel on non-overlapping tasks and coordinating via git, documentation, etc., the way people do. My next frontier is frontend QA, since once you get the rest of it dialed in, that seems to be the bottleneck. I need to figure out MCPs/automation to get the agent to click-test its own work.
NNN_Throwaway2@reddit
No.
Small models struggle to even consistently complete tool calls or supply diffs in the format required by these agentic editors, let alone actually write usable code.
Yes, you can technically write code with the help of small models, and yes, you can augment using MCP for stuff like web search or RAG, but it definitely won't rival what you can accomplish with large cloud models or even larger local models.
chisleu@reddit
Not really. Qwen 3 Coder 30b is the best coding model that you could run some quant of, but it wouldn't be nearly the performance you are used to.
yehiaserag@reddit (OP)
But what about the quality? I remember using Qwen Coder about a year ago; is this a newer iteration?
chisleu@reddit
I use Qwen 3 Coder 30b for local stuff with some success. You can't rely on the model's knowledge for hallucination-free vibe coding. You have to load the context with all the necessary information. I use MCPs to accomplish this.
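The idea above can be sketched in a few lines: retrieve relevant documentation first, then prepend it to the prompt so the small model doesn't have to recall APIs from memory. This is a toy illustration only; the `DOCS` snippets and the naive keyword scoring are made up, and a real setup would expose the lookup as an MCP tool backed by a proper index.

```python
# Toy sketch of "loading the context yourself" instead of trusting a
# small model's built-in knowledge: fetch doc snippets relevant to the
# task, then put them in front of the coding prompt.

DOCS = {
    "requests.get": "requests.get(url, params=None, **kwargs) returns a Response",
    "json.loads": "json.loads(s) parses a json string into python objects",
    "pathlib.Path.glob": "Path.glob(pattern) yields paths matching a pattern",
}

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank doc snippets by naive keyword overlap with the query."""
    words = set(query.lower().split())
    scored = sorted(
        DOCS.items(),
        key=lambda kv: -len(words & set(kv[1].lower().split())),
    )
    return [f"{name}: {text}" for name, text in scored[:k]]

def build_prompt(task: str) -> str:
    """Prepend retrieved documentation so the model need not recall APIs."""
    context = "\n".join(retrieve(task))
    return f"Reference docs:\n{context}\n\nTask: {task}"

prompt = build_prompt("parse a json string into python objects")
```

In practice you'd swap the keyword overlap for embeddings and serve `retrieve` as an MCP tool the agent can call on its own.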
No-Consequence-1779@reddit
There are a couple of larger versions of this as well, e.g. 53b. Currently Qwen is the best, or top 3, depending on your opinion.
q5sys@reddit
What do you mean, you use an MCP server for extra context while coding?
I understand using an MCP server to perform tasks, but just to provide context for coding? Do you have a link that explains that? Like, do you have an MCP server that's operating as a RAG for, say... Python coding documentation... or... something else?