VS Code's new "Agents window" lets you use local AI models. Still requires an Internet connection and a Github Copilot plan (because we can't have nice things)
Posted by _wsgeorge@reddit | LocalLLaMA | View on Reddit | 71 comments
At first I was excited to see this, but I guess I'll wait till someone figures out what people actually want
Parley_DE@reddit
How can I select Codex as the active agent instead of Copilot or Claude in the VS Code Agents window?
ArtfulGenie69@reddit
Like a year ago this was all possible and now they have ripped it out and are trying to sell it back to you. I remember getting qwen2.5 running in vs code then like the next day they had gutted it and there was no way to run the local models. Fuck Microsoft
Savantskie1@reddit
In base vscode they had an extension for copilot chat that had allowed me to use local models. I switched to insiders a while ago, and that feature disappeared there and in base vscode. It pissed me off so much that I ditched it for opencode
Alan_Silva_TI@reddit
I used Copilot for almost 3 years (I have 35 payments registered on my GitHub account), but I decided to cancel my subscription as soon as they announced the move to token-based billing.
Now, I mostly use CODEX (the app on my personal PC and a CLI on my work PC) for my professional work (and a little bit on my personal projects). I also use OpenRouter+ with free models, and occasionally paid models (when I want to do complex things), to fuel my Hermes agent.
Otherwise, I use PI Code with local models to code the tools I developed for use with these same local models.
I'm still using VS Code, but I believe it's too little, too late for them. It was an amazing tool back in 2023/2024, but beyond their IDE integration, they offer absolutely nothing else compared to the current stack of paid, free, and local coding tools.
ccarlyon@reddit
I'm in a similar boat to you. With this being the final month for Premium request-based billing, I'm taking the time to explore if locally-hosted models are a viable option for me going forward. I just checked my projected costs for token-based billing and they are 5X what I am currently paying.
Alan_Silva_TI@reddit
The reality of model size is that almost any model is useful, just in different ways.
SOTA models require less human intervention. When you give them an instruction, they either have enough built-in knowledge or can simply go online and read the documentation. They are very good at trying things on their own without needing new prompts; they create their own hypotheses and test them against the code to figure out what the issue is.
With smaller local models, you will have to step in and collaborate from time to time. Their capacity to figure things out on their own exists, but it is orders of magnitude lower.
So, you will need a pretty good understanding of how to properly build an implementation plan and use Spec-Driven Development and Test-Driven Development in order to get them working on new features for big codebases.
They are much easier to handle on greenfield projects, though.
So, the TL;DR is:
You can be a senior pair programmer to a SOTA model, or you can even be its junior, LOL. But you must be the senior programmer to your local model, so from time to time you will have to give it some direction.
Thrumpwart@reddit
Roo Code works well.
harglblarg@reddit
Discontinued, sadly. I’ve switched to Cline.
ccarlyon@reddit
I believe the original team has moved on to Roomote, however the project seems to have been handed over to a different team.
This is what the 3.53.0 release notes said:
Curious what your experience is like on Cline coming from Roo Code though? Anything you found missing?
Thrumpwart@reddit
Roo Code is taking over!
Thin_Pollution8843@reddit
Tbh after using Zed for a few weeks I can’t go back to VS Code. Zed has less functionality and flexibility for now BUT this thing is so blazing fast! Working with it is so fast, easy, and smooth
wombweed@reddit
Zed is awesome compared to vscode just for code tasks in general, but I have had a lot of trouble hooking it up to a local agent, especially for next-edit predictions/completions. Is there a trick to it I’m missing? I’d really like a native IDE that seamlessly plugs into my local infra so I can vibe code without internet or whatever, Zed seems to hold the most promise overall but in terms of usability for agentic workflows I feel like Roo Code comes out ahead.
Thin_Pollution8843@reddit
I’m using it with opencode. 0 issues. But I haven’t tried connecting it for autocomplete without some harness
wombweed@reddit
Yeah opencode is great but I like also having the middle ground provided by a traditional IDE so I can make manual code changes if it’s faster.
NeedToLieDown@reddit
I understand the need... But honestly try to completely give in to the agentic workflow and you'll see it's pretty awesome.
I know you prefer writing code by hand every now and then (I do too). But I started just telling a coding agent what to do even if I need to do something super simple like rename a var, or delete a single comment, or change a constant from 42 to 69.
Yes, it'd probably take me 5 seconds at most to do by hand, and a coding agent will take much longer, but the agent can also run your typical lint + test loop while you go handle something else.
It's crazy but these days I don't want a code editor anymore... I just want a really good UI to track running agents (and let me run whichever agent flavor of the week I want), and a good code review UI.
wombweed@reddit
I totally agree, I’ve been writing essentially no code and just delegating everything to agents for the past few months. Just looking for a balance for the few situations where it makes sense for me to edit manually.
Luigi311@reddit
I switched to Zed too just for the performance. Haven’t used any of the AI stuff. It’s literally just because VS Code is so slow nowadays on my old hardware.
Thin_Pollution8843@reddit
VS Code is slow not only on old hardware tbh
youcloudsofdoom@reddit
I just tried it and DAMN is this thing fast in comparison to vs code's chat....
YouAsk-IAnswer@reddit
Zed is the GOAT
CulturalKing5623@reddit
It looks like the only thing Zed is missing for me is a way to easily integrate with databases, like VS Code's Snowflake extension. Being able to query and code in the same IDE is basically my workflow. Do you know of a way to do that?
Icy-Roll-4044@reddit
Who uses VS Code in the era of Cursor and Antigravity🥀🥀
ea_man@reddit
Nobody, we use vscodium
Icy-Roll-4044@reddit
Cursor better
Mickenfox@reddit
Why would those be any better?
Icy-Roll-4044@reddit
Yeah
celsowm@reddit
Fuck you Microsoft!
Miriel_z@reddit
Best of both wolds: using local LLMs, and paid subscription? Sign me up!🤣
Shawnj2@reddit
The LLM runs on your computer, but a paid plan is required, it won’t work without an internet connection, and all data sent to/from the LLM (thinking included) is sent to Microslop for future model training
FreeSammiches@reddit
VS code was free. They have to cover that development cost somehow. /s
Mickenfox@reddit
What is the /s for?
davl3232@reddit
All of the bad things, none of the benefits. Just pay to train ms models.
xienze@reddit
You laugh but there are people on this sub who love having Claude come up with plans and having local models implement them. Which I think is utterly bizarre. Just have Claude do the whole thing at that point.
Equivalent-Costumes@reddit
Bruh, Claude is super expensive to use. No point in blowing money on Claude writing basic code. Planning with Claude and writing code with local LLMs is especially good if you don't have a ton of VRAM to run an LLM powerful enough for planning. A 27B Qwen model can write code nearly perfectly but loses the thread on planning really quickly.
Moist-Length1766@reddit
don't need to pay
https://ibb.co/yc7f8dd5
Sarashana@reddit
Me too! I love to pay somebody for using things they didn't make.
Jump3r97@reddit
So they didn't make the software you are using?
Luke_Bavarious@reddit
At this point in time... yes?
techlatest_net@reddit
Yeah, the "local but still needs internet + subscription" part kinda defeats the purpose. Hope someone forks the idea into something actually offline-first. For now, I'll stick with my current setup.
clinthent@reddit
I have been using the Kilo VS Code extension with LM Studio running local LLMs within VS Code with no issues on my Mac. The Cline extension is pretty good too.
SkyFeistyLlama8@reddit
That's why I use Continue.dev in VS Code.
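For anyone curious what a fully-local Continue setup can look like: a rough sketch using Ollama for both chat and embeddings, in the shape of Continue's older config.json format (newer versions use a YAML config, so check the current docs; the model names here are examples, not a prescribed setup):

```json
{
  "models": [
    {
      "title": "Local Qwen (via Ollama)",
      "provider": "ollama",
      "model": "qwen2.5-coder:7b"
    }
  ],
  "embeddingsProvider": {
    "provider": "ollama",
    "model": "nomic-embed-text"
  }
}
```

With both the chat model and the embeddings provider pointed at a local server, codebase indexing and search stay on your machine too, which is the part Copilot currently keeps in the cloud.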
Pleasant-Shallot-707@reddit
Just use zed
Ell2509@reddit
Talk about going the wrong way.
This will appeal to about 0.2% of the local AI community. Mostly because it is no longer properly local.
SethMatrix@reddit
Just use Cline if you’re wanting to use a local model
RoomyRoots@reddit
I think Theia could do something like that.
simotune@reddit
If offline use still depends on Copilot auth, this is local inference, not a local stack. That distinction matters more than the marketing.
Fun_Employment6042@reddit
Love that I need a paid cloud subscription and constant internet to "use my local model". Truly the future of offline computing.
DonnaPollson@reddit
That’s the weirdest possible bundle: local inference for privacy and latency, but cloud entitlement for permission to use it. If they want this to matter, the UX has to degrade gracefully offline and treat Copilot as optional, not as the license server for your own GPU.
DonnaPollson@reddit
conjuncts@reddit
Right after they took away Claude Sonnet 4.6 for students. Seems like they've fallen on hard times
chocofoxy@reddit
You can use an extension called OAI Compatible that lets you link local models with Copilot chat. I use it, it’s pretty good
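These "OAI compatible" bridges all work the same way: you point them at a base URL and they speak the standard OpenAI chat-completions shape. A minimal sketch of the request such a bridge sends (the URL and model name are examples for a typical local setup, not specific to any one extension):

```python
import json


def build_chat_request(base_url: str, model: str, prompt: str) -> dict:
    """Build the standard OpenAI-style chat completion request that any
    OpenAI-compatible local server (LM Studio, Ollama, llama.cpp server)
    accepts. base_url and model depend on your local setup."""
    return {
        "url": f"{base_url.rstrip('/')}/v1/chat/completions",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": False,
        }),
    }


# Example: LM Studio serves this API on localhost by default
req = build_chat_request("http://localhost:1234", "qwen2.5-coder-7b", "hello")
print(req["url"])  # http://localhost:1234/v1/chat/completions
```

Because the wire format is the same everywhere, swapping LM Studio for Ollama or llama.cpp is just a base-URL change in the extension's settings.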
Fast-Satisfaction482@reddit
Copilot uses cloud-based embeddings for semantic search, even when running the main model locally.
SangersSequence@reddit
I mean yeah, but it's still absolutely bullshit, since it could easily have been designed to do that locally as well.
Fast-Satisfaction482@reddit
There are other harnesses that work fully locally.
ai-christianson@reddit
yeah, this is the key distinction. local model support is not the same thing as a local agent stack. if semantic search, auth, tool routing, or telemetry still depends on a cloud service, then it is really just local inference inside a cloud-controlled product. useful maybe, but not the same category as something you can run and trust offline.
grandFossFusion@reddit
Imagine using VS Code unironically
dibis54986@reddit
Use vscodium
phein4242@reddit
Remember, these products need to be monetized. Zed does not come with this encumbrance. It's not perfect tho, but let's be real here, is an IDE ever perfect? ;-)
Apart-Medium6539@reddit
nahhh
kiwibonga@reddit
"might change in a future release"
Please keep the change and paywall it so that this mediocre watered down IDE finally dies.
dto_lurker@reddit
They don’t let you use auth tokens, do they? That’s why I use Cline currently. You can use Copilot free
jake_that_dude@reddit
the annoying part is the semantic index, not the chat model. Copilot still wants GitHub in the loop for auth/search state, so local model ends up meaning local sampler, not local agent.
if you actually need airplane-mode local, use Continue/Cline with Ollama plus a local embedding model like
nomic-embed-text. then kill Wi-Fi and run a repo search before trusting it.
pmttyji@reddit
What about VSCodium? Hope it solves the issue
yeah-ok@reddit
One can pray and hope - the fork would really come into its own then (it's already my daily driver, but I bet it would attract an even larger audience!)
ForsookComparison@reddit
They avoid built-in telemetry in the base product, but if you choose to use a feature/extension with must-use-cloud components, the VSCodium mission doesn't include reinventing those as on-prem products.
Scared-Tip7914@reddit
Insane combo 😂 i propose that they tax local models by the token for the privilege of passing your local data through their precious servers
CulturalKing5623@reddit
You joke but GitHub already does something similar to this, charging people for self-hosted runners since the traffic still runs through their infrastructure.
Due-Function-4877@reddit
It's appropriate that Microsoft's spyware window would be named the agents window.
I just use Cline.
Great_Guidance_8448@reddit
Or just use Cline
MiserableSet5311@reddit
The Pi Agent VS Code extension is working well for me atm. It can sit beside Codex AI tools and you can switch when you run out of tokens.
Eyelbee@reddit
This isn't new