VS Code extension
Posted by MK_L@reddit | LocalLLaMA | 18 comments
Which coding agent extension are most of you finding works best with LM Studio as the local server? 🤔
I'm running Qwen 3.6 27B.
I've used Cline and Continue mostly.
I haven't checked out all the options, but I'm looking for something that looks and feels like Codex (for me, that has been Cline).
I'm currently working on writing my own so it can be LM Studio specific, with all of the API calls coded in (something Cline is missing for me).
OsmanthusBloom@reddit
I've used Roo Code for a while and sort of liked it. But now that its development has been discontinued, I'd be curious about other options for VS Code. I haven't tried Cline.
Ideally I'd like a tool that has a minimal system prompt like Pi Coding Agent, but that's obviously terminal based.
_derv@reddit
I made a VS Code extension specifically for pi that uses a native chat UI. Check it out if you want: https://vscode-pi.dev
OsmanthusBloom@reddit
Very cool, thanks!
MK_L@reddit (OP)
I don't use the terminal option very much, but Cline ships with a CLI as well. (Not shilling Cline, just letting you know.)
BigYoSpeck@reddit
I'm currently happy with Cline, using Qwen 3.6 27B for planning and 35B for act.
I wouldn't bother with LM Studio when you're making API calls, though. Just run the llama.cpp server in model router mode to hot-swap between models.
MK_L@reddit (OP)
Can you explain the benefits of llama.cpp a bit more? I just used LM Studio because of the model-switch API (JIT loader). I found it easier to switch between different models/settings for testing.
Totally open to others.
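Here's roughly what I mean by the model-switch API, as a minimal sketch against LM Studio's OpenAI-compatible endpoint (default port 1234). The model id is a placeholder; requesting a model that isn't loaded yet is what triggers the JIT load:

```python
import requests

# LM Studio's OpenAI-compatible server (default port; adjust if you changed it).
BASE_URL = "http://localhost:1234/v1"

payload = {
    # Placeholder id: use whatever GET /v1/models reports on your machine.
    # Naming a model that isn't loaded makes the JIT loader load it first.
    "model": "qwen-3.6-27b",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}

# Generous timeout: the first request may block while the model loads.
resp = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=300)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```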
BigYoSpeck@reddit
LM Studio is fine for quickly grabbing and testing models. But it comes with memory overhead compared to just running llama.cpp (which is what LM Studio uses underneath), it won't be as up to date as llama.cpp on its own, and it doesn't give you full access to the wealth of configuration options for truly optimising your setup.
Once you have llama.cpp configured in model router mode, it lets you switch between models from other clients.
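And switching from the client side is just an ordinary API call. A rough sketch, assuming the server's OpenAI-compatible endpoint on the default port and placeholder model ids (check GET /v1/models for your real ones); in router mode the `model` field in each request is what selects, and loads, the model:

```python
import requests

BASE_URL = "http://localhost:8080/v1"  # llama-server's OpenAI-compatible API

# See what the server knows about; the ids used below are placeholders.
models = requests.get(f"{BASE_URL}/models", timeout=30).json()
print([m["id"] for m in models["data"]])

def ask(model_id: str, prompt: str) -> str:
    """One chat call; the model field picks which model serves it."""
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        json={"model": model_id, "messages": [{"role": "user", "content": prompt}]},
        timeout=600,  # the first call to a model may wait on it being loaded
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

# Hot-swap by changing the model id between calls (placeholder ids).
print(ask("qwen-27b-planner", "Outline the refactor."))
print(ask("qwen-35b-coder", "Now implement step 1."))
```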
MK_L@reddit (OP)
Oh, that tracks. Originally I was using LM Studio to quickly test models before I ran any in vLLM. I just got lazy and ran them right out of LM Studio after a while.
Sounds like I need to make the switch to llama.cpp.
suicidaleggroll@reddit
I use opencode, but I use it in VSCodium, not VS Code (a clean build from source without all of MS's telemetry). Continue.dev was a buggy mess when I tried it; Roo Code worked well but is now dead.
gladfelter@reddit
I've been using Continue with Qwen 3.6 27B on a single 3090. It performed poorly until I got the context up to 128K. There are a few bugs, like pasting bash commands without the newline to execute them; a fix is in the works for that. It sometimes sends too much data when doing a conversation summary, and the server reports a JSON error due to only reading part of the message; I'm trying to adjust settings to fix that. And I've been having trouble with file creation commands, but I think that's another case of the extension sending too much data to the model, since it happens at the end of long conversations.
I was able to create a high quality pull request to refactor a bunch of agent skills in someone's repository using this. Only a little manual tweaking was required.
I'm on WSL, for what it's worth. I may use pi for agentic tasks and have Continue focused on shorter tasks.
MK_L@reddit (OP)
Which server are you running on WSL? I use WSL for vLLM but haven't tried hosting others on there.
gladfelter@reddit
I have a recent build of llama.cpp. It wasn't a decision made with expert knowledge of all alternatives, though.
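For what it's worth, the launch is nothing fancy; roughly the below, with the model path and offload value as placeholders (wrapped in Python purely for illustration, it's really just a shell command). The `-c 131072` is the 128K context that fixed the poor results I mentioned:

```python
import subprocess

# Placeholder path and values; the flags themselves are standard llama-server ones.
subprocess.run([
    "llama-server",
    "-m", "/models/qwen-3.6-27b.gguf",  # hypothetical model path
    "-c", "131072",                     # 128K context window
    "-ngl", "99",                       # GPU layer offload (tune for your card)
    "--host", "127.0.0.1",
    "--port", "8080",
], check=True)
```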
Due-Function-4877@reddit
I went back to Cline after trying both Continue and the "new" Kilo again. It's a shame about Roo closing down.
I can't chime in on other backends. I prefer basic llama.cpp and its forks.
MK_L@reddit (OP)
It seems most here are recommending llama.cpp. I'm going to switch and try that out as my backend.
MK_L@reddit (OP)
I might have to switch to llama.cpp; it sounds like everyone is saying it's better than LM Studio. I found LM Studio easier for testing different models, with the "recommended for your hardware" label and the context slider; that made trying something out easier. I switched from vLLM (which I do still use), but the setup and changing models there is much more involved. vLLM always seems to be at least 10% faster when it's running the same model, though.
Craftkorb@reddit
I use Continue, and overall I like it. But I haven't tried the many other options.
MK_L@reddit (OP)
Which server are you hosting the model with?
nicholas_the_furious@reddit
I recently jumped in and tried the opencode extension. It works in a terminal, so at first it feels a bit different from a normal Copilot-style extension, but once you put that terminal window in the right-hand pane it basically feels just like one of those.
Slightly steeper learning curve, but not really that bad, and I've found it totally worth it.