VS Code's new "Agents window" lets you use local AI models. Still requires an Internet connection and a Github Copilot plan (because we can't have nice things)

[-]

Parley_DE@reddit

How can I select Codex as the active agent instead of Copilot or Claude in the VS Code Agents window?

[-]

Like a year ago this was all possible and now they have ripped it out and are trying to sell it back to you. I remember getting qwen2.5 running in vs code then like the next day they had gutted it and there was no way to run the local models. Fuck Microsoft

[-]

Savantskie1@reddit

In base vscode they had an extension for copilot chat that had allowed me to use local models. I switched to insiders a while ago, and that feature disappeared there and in base vscode. It pissed me off so much that I ditched it for opencode

[-]

Alan_Silva_TI@reddit

I used Copilot for almost 3 years (I have 35 payments registered on my GitHub account), but I decided to cancel my subscription as soon as they announced the move to token-based billing.

Now, I mostly use CODEX (the app on my personal PC and a CLI on my work PC) for my professional work (and a little bit on my personal projects). I also use OpenRouter+ with free models, and occasionally paid models (when I want to do complex things), to fuel my Hermes agent.

Otherwise, I use PI Code with local models to code the tools I developed for use with these same local models.

I'm still using VS Code, but I believe it's too little, too late for them. It was an amazing tool back in 2023/2024, but beyond their IDE integration, they offer absolutely nothing else compared to the current stack of paid, free, and local coding tools.

[-]

ccarlyon@reddit

I'm in a similar boat to you. With this being the final month for Premium request-based billing, I'm taking the time to explore if locally-hosted models are a viable option for me going forward. I just checked my projected costs for token-based billing and they are 5X what I am currently paying.

[-]

Alan_Silva_TI@reddit

The reality of model size is that almost anything is useful.

SOTA models require less human intervention. When you give them an instruction, they either have enough built-in knowledge or can simply go online and read the documentation. They are very good at trying things on their own without needing new prompts; they create their own hypotheses and test them against the code to figure out what the issue is.

With smaller local models, you will have to step in and collaborate from time to time. Their capacity to figure things out on their own exists, but it is orders of magnitude lower.

So, you will need a pretty good understanding of how to properly build an implementation plan and use Spec-Driven Development and Test-Driven Development in order to get them working on new features for big codebases.

They are much easier to handle on greenfield projects, though.

So, the TL; DR is:

You can be a senior pair programmer to a SOTA model, or you can even be its junior, LOL. But you must be the senior programmer to your local model so from time to time you will have to give it some direction.

[-]

Thrumpwart@reddit

Roo Code works well.

[-]

harglblarg@reddit

Discontinued, sadly. I’ve switched to Cline.

[-]

ccarlyon@reddit

I believe the original team has moved on to Roomote, however the project seems to have been handed over to a different team.

This is what the 3.53.0 release notes said:

You may have seen the recent announcement that Roo Code hit 3 million installs and the original team is going all-in on Roomote. We know that news was hard for a lot of you. This plugin means a lot to us and to you, and we hear you. The good news: a community team has stepped up to carry Roo Code forward, and we're working with them on an official handoff so the plugin you rely on keeps getting maintained and improved.

Curious what your experience is like on Cline coming from Roo Code though? Anything you found missing?

[-]

Thrumpwart@reddit

Zoo Code is taking over!

[-]

Thin_Pollution8843@reddit

Tbh after using Zed for a few weeks I can’t go back to VS code. Zed has less functionality and flexibility for now BUT this thing so blazing fast! It’s working so fast and easy and smooth

[-]

wombweed@reddit

Zed is awesome compared to vscode just for code tasks in general, but I have had a lot of trouble hooking it up to a local agent, especially for next-edit predictions/completions. Is there a trick to it I’m missing? I’d really like a native IDE that seamlessly plugs into my local infra so I can vibe code without internet or whatever, Zed seems to hold the most promise overall but in terms of usability for agentic workflows I feel like Roo Code comes out ahead.

[-]

Thin_Pollution8843@reddit

I’m using with opencode. 0 issues. But I haven’t trying it to connect for autocomplete without some harness

[-]

wombweed@reddit

Yeah opencode is great but I like also having the middle ground provided by a traditional IDE so I can make manual code changes if it’s faster.

[-]

NeedToLieDown@reddit

I understand the need... But honestly try to completely give in to the agentic workflow and you'll see it's pretty awesome.

I know you prefer writing code by hand every now and then (I do too). But I started just telling a coding agent what do do even if I need to do something super simple loke rename a var, or delete a single comment, or change a constant from 42 to 69.

Yes, it'd probably take me 5 seconds at most to do by hand, and a coding agent will take much longer, but the agent can also run your typical lint + test loop while you go handle something else.

It's crazy but these days I don't want a code editor anymore... I just want a really good UI to track running agents (and let me run whichever agent flavor of the week I want), and a good code review UI.

[-]

wombweed@reddit

I totally agree, I’ve been writing essentially no code and just delegating everything to agents for the past few months. Just looking for a balance for the few situations where it makes sense for me to edit manually.

[-]

Luigi311@reddit

I switched to zed too just for the performance. Haven’t used any of the AI stuff. It’s literally just because vs code is so slow now days on my old hardware.

[-]

Thin_Pollution8843@reddit

VS code slow not only on old hardware tbh

[-]

youcloudsofdoom@reddit

I just tried it and DAMN is this thing fast in comparison to vs code's chat....

[-]

YouAsk-IAnswer@reddit

Zed is the GOAT

[-]

CulturalKing5623@reddit

It looks like the only thing it looks like Zed is missing for me is a way to easily integrate with databases like VS Code snowflake extension. Being able to query and code in the same IDE is basically my workflow. Do you know of a way to do that?

[-]

Icy-Roll-4044@reddit

Who uses vs code in the era cursor and antigravity🥀🥀

[-]

ea_man@reddit

Nobody, we use vscodium

[-]

Icy-Roll-4044@reddit

Cursor better

[-]

Mickenfox@reddit

Why would those be any better?

[-]

Icy-Roll-4044@reddit

Yeah

[-]

celsowm@reddit

Fuck you Microsoft!

[-]

Miriel_z@reddit

Best of both wolds: using local LLMs, and paid subscription? Sign me up!🤣

[-]

Shawnj2@reddit

The LLM runs on your computer but a paid plan is required it won’t work without an internet connection and all data sent to/from the LLM and thinking is sent to Microslop for future model training

[-]

FreeSammiches@reddit

VS code was free. They have to cover that development cost somehow. /s

[-]

Mickenfox@reddit

What is the /s for?

[-]

davl3232@reddit

All of the bad things, none of the benefits. Just pay to train ms models.

[-]

xienze@reddit

You laugh but there's people on this sub who love having Claude come up with plans and having local models implement them. Which is think is utterly bizarre. Just have Claude do the whole thing at that point.

[-]

Equivalent-Costumes@reddit

Bruh, Claude is super expensive to use. No points in blowing money on Claude writing basic codes. Plan with Claude and write code with local LLMs is especially good if you don't have a ton of VRAM to run an LLM powerful enough for planning. A 27B Qwen model can write code nearly perfectly but lost the thread on planning really quickly.

[-]

Moist-Length1766@reddit

don't need to pay

https://ibb.co/yc7f8dd5

[-]

Sarashana@reddit

Me too! I love to pay somebody for using things they didn't make.

[-]

Jump3r97@reddit

So they didn't make the software you are using?

[-]

Luke_Bavarious@reddit

At this point in time... yes?

[-]

techlatest_net@reddit

Yeah, the "local but still needs internet + subscription" part kinda defeats the purpose. Hope someone forks the idea into something actually offline-first. For now, I'll stick with my current setup.

[-]

clinthent@reddit

I have been using the Kilo vscode extension with LM studio running local LLMs within vs code with no issues in my Mac. Cline extension is also pretty good too.

[-]

SkyFeistyLlama8@reddit

That's why I use Continue.dev in VS Code.

[-]

Pleasant-Shallot-707@reddit

Just use zed

[-]

Ell2509@reddit

Talk about going the wrong way.

This will appeal to about 0.2% of the local ai community. Mostly because it is no longer properly local.

[-]

SethMatrix@reddit

Just use Cline if you’re wanting to use a local model

[-]

RoomyRoots@reddit

I think Theia could do something like that.

[-]

simotune@reddit

If offline use still depends on Copilot auth, this is local inference, not a local stack. That distinction matters more than the marketing.

[-]

Fun_Employment6042@reddit

Love that I need a paid cloud subscription and constant internet to "use my local model". Truly the future of offline computing.

[-]

DonnaPollson@reddit

That’s the weirdest possible bundle: local inference for privacy and latency, but cloud entitlement for permission to use it. If they want this to matter, the UX has to degrade gracefully offline and treat Copilot as optional, not as the license server for your own GPU.

[-]

DonnaPollson@reddit

That’s the weirdest possible bundle: local inference for privacy and latency, but cloud entitlement for permission to use it. If they want this to matter, the UX has to degrade gracefully offline and treat Copilot as optional, not as the license server for your own GPU.

[-]

conjuncts@reddit

Right after they took away Claude Sonnet 4.6 for students. Seems like they've fallen on hard times

[-]

chocofoxy@reddit

You can use an extension called OAI compatible that let you link local model with copilot chat i use it it’s pretty good

[-]

Fast-Satisfaction482@reddit

Co-pilot uses cloud based embeddings for semantic search. Even when doing the main model locally.

[-]

SangersSequence@reddit

I mean yeah, but still absolutely bullshit that it could easily have been designed to do locally as well.

[-]

Fast-Satisfaction482@reddit

There are other harnesses that work fully locally.

[-]

ai-christianson@reddit

yeah, this is the key distinction. local model support is not the same thing as a local agent stack. if semantic search, auth, tool routing, or telemetry still depends on a cloud service, then it is really just local inference inside a cloud-controlled product. useful maybe, but not the same category as something you can run and trust offline.

[-]

grandFossFusion@reddit

Imagine using vs unironically

[-]

dibis54986@reddit

Use vscodium

[-]

phein4242@reddit

Remember, these products need to be monetized. Zed does not come with this encumberment. Its not perfect tho, but lets be real here, is an IDE ever perfect? ;-)

[-]

Apart-Medium6539@reddit

nahhh

[-]

kiwibonga@reddit

"might change in a future release"

Please keep the change and paywall it so that this mediocre watered down IDE finally dies.

[-]

dto_lurker@reddit

They dont let you use auth tokens do they? Thats why I use cline currwntly. You can use copilot free

[-]

jake_that_dude@reddit

the annoying part is the semantic index, not the chat model. Copilot still wants GitHub in the loop for auth/search state, so local model ends up meaning local sampler, not local agent.

if you actually need airplane-mode local, use Continue/Cline with Ollama plus a local embedding model like nomic-embed-text. then kill Wi-Fi and run a repo search before trusting it.

[-]