VSCode and agent integration

Posted by loudsound-org@reddit | LocalLLaMA | View on Reddit | 2 comments

I've been using VSCode with GitHub Copilot for a bit (free tier) and I'm looking to try running models locally after running into all of the limits with GHCP. I'd like as close an experience as possible, with both code autocomplete and chat integration. I know GHCP can use local models, but I think I'd still run into session limits and such. If there's a way around that, then maybe sticking with it would be best.

A few things about my setup that may make a difference. I'm running the model (primarily Qwen 3.6 35B, but I'd like the ability to switch to 27B and other models on the fly) on my Windows PC with llama.cpp. My local Linux server hosts all of my code and dev environments, and I primarily use my Windows laptop with VSCode over an SSH workspace into the server (which works fine with GHCP and any agentic tooling). I also plan to set up Hermes for non-coding use (on the Linux server), again using the Windows PC's models (the server only has a 1060 6GB GPU... I'm looking at doing embeddings and such on it once I figure that out!).
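For context, this is roughly how I'm launching llama.cpp's server on the Windows PC (the model path, port, and context size here are placeholders, not my exact setup):

```shell
# llama-server (bundled with llama.cpp) exposes an OpenAI-compatible API
# under /v1, which is what most VSCode extensions can point at.
# --host 0.0.0.0 makes it reachable from the laptop and the Linux server.
llama-server --model C:\models\qwen.gguf --host 0.0.0.0 --port 8080 --ctx-size 16384
```

Swapping models "on the fly" currently means restarting this with a different --model, as far as I can tell.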

So with that setup, what's the best integration with VSCode? The Hermes extension, using Hermes for coding as well? Continue pointed directly at my llama.cpp? Cline pointed at either Hermes (is that even possible?) or llama.cpp? Running pi.dev alongside Hermes and somehow integrating that (though it seems pi is mostly for CLI dev?)? Some other option? Appreciate any advice!
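If Continue ends up being the answer, my assumption is I'd point it at llama.cpp's OpenAI-compatible endpoint with something like this in its config (the IP, model name, and title are placeholders, and I haven't verified the exact schema against current Continue docs):

```json
{
  "models": [
    {
      "title": "Qwen (local llama.cpp)",
      "provider": "openai",
      "model": "qwen",
      "apiBase": "http://192.168.1.x:8080/v1"
    }
  ]
}
```

The same apiBase trick should in principle work for any extension that accepts an OpenAI-compatible provider, which I'd guess includes Cline.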