Can I use Cursor Agent (or similar) with a local LLM setup (8B / 13B)?
Posted by BudgetPurple3002@reddit | LocalLLaMA | 8 comments
Hey everyone, I want to set up a local LLM (running 8B and possibly 13B parameter models). I was wondering if tools like Cursor Agent (or other AI coding agents) can work directly with my local setup, or if they require cloud-based APIs only.
Basically:
Is it possible to connect Cursor (or any similar coding agent) to a local model?
If not Cursor specifically, are there any good agent frameworks that can plug into local models for tasks like code generation and project automation?
Would appreciate any guidance from folks who’ve tried this. 🙏
grabber4321@reddit
VSCode RooCode with Qwen2.5-Coder-7B (search for Cline)
Arkonias@reddit
Cursor doesn't really work well with local models.
Use Roo Code or Cline in VSCode with your localhost server of choice.
FrozenBuffalo25@reddit
I use Cursor with Qwen Coder and it works just fine. Used it before with Phi4 when I had less VRAM and it worked too.
What’s the problem?
BudgetPurple3002@reddit (OP)
Appreciate the clarification! Just to double-check, if I run Roo Code or Cline against a local API (Ollama / LM Studio etc.), the agent-style functionality (code generation + iterative updates from a prompt) still works fine, correct?
ResidentPositive4122@reddit
Eh... debatable. Small models aren't really that strong: they suffer from context degradation quickly and give subpar results overall. They might work (as in, generate tool calls etc.) but will probably get confused pretty quickly. Just because they mimic agentic behaviour doesn't mean they're actually good at it.
We've had good success with bigger models though. Devstral is surprisingly good for its size. GLM has a ~100B model that's decent at agentic stuff. Qwen has some models too, but it's highly dependent on what your needs are.
Cline/Roo/Kilo all have prompt-editing capabilities. Make sure to edit the prompts and use the minimum necessary for your tool needs. If you use the out-of-the-box prompts, you'll notice the very first query is already in the tens of thousands of tokens. That's too much for small models. You'll have to experiment with what you actually need in the prompt, and with how well the small models adhere to it. Trim a lot, and only add parts back as you notice the model failing.
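A quick way to sanity-check whether a trimmed prompt will even fit a small model is a rough token estimate. This is a minimal sketch using the common ~4-characters-per-token heuristic (an approximation, not the model's real tokenizer); the context-window and reserve numbers are assumptions you'd adjust for your model:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # An approximation only, not the model's actual tokenizer.
    return max(1, len(text) // 4)

def fits_context(prompt: str, context_window: int = 8192, reserve: int = 2048) -> bool:
    # Leave `reserve` tokens free for the model's reply and tool output.
    return estimate_tokens(prompt) <= context_window - reserve

# Example: a heavily trimmed system prompt easily fits an 8K context.
trimmed = "You are a coding assistant. Use the provided tools to edit files."
print(estimate_tokens(trimmed), fits_context(trimmed))
```

If your exported agent prompt estimates at 20K+ tokens against an 8K window, that alone explains the confusion small models show.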
mearyu_@reddit
yes, see GIF https://media.githubusercontent.com/media/RooCodeInc/Roo-Code/main/src/assets/docs/demo.gif
Blizado@reddit
You can use the OpenAI API setting inside Cursor and change the API URL to a local LLM with an OpenAI-compatible API. But it looks like some Cursor features stop working when you use a custom API; I got a warning about it right away, so I haven't even tested it yet.
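For anyone wondering what "OpenAI-compatible API" means concretely: most local servers (LM Studio, Ollama, llama.cpp server) expose a `/v1/chat/completions` endpoint with the same request shape. A stdlib-only sketch of building that request, assuming LM Studio's default port 1234 and a hypothetical model name (swap in whatever your server reports):

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    # OpenAI-compatible chat-completions payload; most local OpenAI-style
    # servers accept this shape at /v1/chat/completions.
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Local servers typically ignore the API key, but the header
            # keeps OpenAI client libraries happy.
            "Authorization": "Bearer not-needed",
        },
        method="POST",
    )

req = build_chat_request("http://localhost:1234",
                         "qwen2.5-coder-7b-instruct",  # assumed model id
                         "Write a hello world in Python.")
# resp = urllib.request.urlopen(req)  # uncomment with a local server running
```

Tools like Cline/Roo do essentially this under the hood; you just point their base URL at your localhost port (1234 for LM Studio, 11434 for Ollama by default).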
Electronic-Ad2520@reddit
Cline works fine with a local setup. I use Qwen3 Coder with the LM Studio API. But you can run Grok Code Fast for free in Cline right now. It's not Claude, but for small tasks it's enough.