Closest replacement for Claude + Claude Code? (got banned, no explanation)
Posted by antoniocorvas@reddit | LocalLLaMA | View on Reddit | 103 comments
I was using Claude Pro + Claude Code pretty heavily (terminal workflow, file access, etc.) and my account just got banned with zero explanation.
From what I’m seeing, this isn’t that uncommon — people getting flagged without clear reasons or support responses — so I’m trying to move on and rebuild my setup.
What I’m looking for is something that actually matches BOTH sides of what Claude gave me:
1. Claude-level reasoning / writing
- strong long-form thinking
- structured outputs (planning, creative work, etc.)
2. Claude Code-style workflow
- terminal / CLI interaction
- ability to work with local files or repos
- feels like an “agent” that can execute tasks, not just chat
I’ve tried ChatGPT (even the $20 Plus + Codex), and while it’s good, it doesn’t have the same feel or workflow — especially on the terminal / agent side.
My actual use case:
- lesson planning + building slides/materials (high school teaching)
- content creation + branding (IG, captions, concepts)
- DJ + music workflow (set planning, ideas, organization)
- working out of an Obsidian vault synced via GitHub
- occasionally generating visuals (images, HTML mockups) and analyzing screenshots
Ideally also:
- works with an Obsidian vault or local knowledge base
- stable (no sketchy plugins or risk of getting banned again)
- okay with paid tools (~$20/mo range)
For people who were actually using Claude + Claude Code:
what are you using now that comes closest in real workflows?
Not looking for theoretical answers, more interested in setups you’re actually using day-to-day.
DeepBlue96@reddit
Cloud-based: QwenCode and a GitHub Copilot sub, I guess.
Local setup:
As a VSCode plugin: Roo Code; as a CLI: QwenCode.
Local models: Qwen 3.5-35B-A3B or Qwen 3.6-35B-A3B; for a small system, maybe Qwen 3.5 9B.
rainbyte@reddit
Everybody is suggesting the biggest frontier models available or accounts on other cloud providers...
But, in case you are interested in going local (this is r/localllama), which hardware do you have? Do you have a gpu? We can recommend you a model compatible with your hardware.
If you have a gpu you can run a model locally and have some level of independence from cloud models.
troop99@reddit
And if the hardware is just a gaming setup with a GTX 5090 Ti, what would be a good model?
popiazaza@reddit
Qwen 3.6
micseydel@reddit
I recently rejoined the sub, and it's wild seeing all the discussion about stuff that isn't local. I was just thinking about leaving when I saw your comment. If anybody knows of places with an actual focus on local models, I'd love to join.
ttkciar@reddit
Right now the closest model to Claude Opus is GLM-5.1, which is slightly more competent than Sonnet for codegen but slightly less than Opus.
IDoButtStuffs@reddit
What sort of hardware would even be required to run this lol
lostnuclues@reddit
10 to 12k USD: buy 10 Intel Arc B70s. At Q4 it will fit completely in VRAM.
kitanokikori@reddit
You'd use it via OpenRouter and OpenCode, you wouldn't build a rig for this yourself
pulse77@reddit
Prepare 50,000 USD to 100,000 USD...
Kranvagen@reddit
One server graphics card is about 30,000 USD.
sk1kn1ght@reddit
What do you mean? It's a subscription for 99% of the people using it, like all the other SOTA models.
IDoButtStuffs@reddit
You can run it locally
https://huggingface.co/zai-org/GLM-5.1
sk1kn1ght@reddit
Oh yes I know. I am just lacking about 600k in equipment.
Due-Project-7507@reddit
GLM-5 runs in NVFP4 on 6 RTX Pro 6000 Blackwells with a combination of tensor and pipeline parallelism. The problem is that the code paths for this in SGLang and vLLM are not really stable; only a few people use this configuration and report/fix bugs for it. Last February it did not run with vLLM at all, and with SGLang I had quality problems. I don't know if these bugs are fixed now, because at the moment we need the RTX Pro 6000 GPUs for a project, so I cannot test it.
Superb_Onion8227@reddit
Do you have any idea of the Wh/token you reach on that setup (at ~0 and 10,000t)?
GLM-5 can't optimize itself yet either, haha? I feel like you could have a channel of people with similar setups who just share code.
wie_witzig@reddit
About 10 B200s connected with NVLink for the model, hundreds of GB of RAM, and a distributed inference stack.
Karyo_Ten@reddit
8x RTX Pro 6000, so instead of leasing for a Tesla ...
ninjainvasion@reddit
is there any way to use both Claude Opus along with GLM 5.1 in Claude Code?
rgar132@reddit
go-llm-proxy makes it pretty painless if you want a native-type experience with web search and server tooling in Claude Code. I've been running the CC harness with MiniMax M2.7 locally and I'm pretty happy with it. The best setup without Anthropic is probably GLM-5.1 as Opus and MiniMax as Sonnet or Haiku, and you'll get a lot done (use their config generator to try it).
hellobritishcolumbia@reddit
There are easy wrappers like the one from Ollama if you want to try it out. I use LiteLLM personally and it’s solid for standardizing across different API formats from each provider.
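As a rough illustration of the wrapper idea: a LiteLLM proxy can sit in front of any provider and expose one OpenAI-compatible endpoint. This is a sketch, not a recipe from the thread; the model string and port are placeholder assumptions.

```shell
# Minimal sketch (assumes `pip install 'litellm[proxy]'` is done).
# One command exposes a provider behind an OpenAI-compatible endpoint.
litellm --model openrouter/z-ai/glm-5.1 --port 4000
# clients then point at http://localhost:4000 instead of the provider's API
```

Swapping providers later then only means changing the `--model` argument, not your client tooling.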
ZireaelStargaze@reddit
There is a --model flag that lets you define an extra third-party model. Or if you use a proxy like LiteLLM for routing models, you can have as many as you want.
Quango2009@reddit
Try GitHub Copilot if local LLMs don't work out. You can still access Sonnet, Opus, etc., and may find it cheaper. However, there are still limits, so depending on your usage you might still have problems.
localizeatp@reddit
Anthropic's target audience is whole FAANG companies; they don't care about us anymore.
ServiceOver4447@reddit
They never liked subsidizing all plans that aren't enterprise.
EenyMeanyMineyMoo@reddit
They lose money on every price plan at every tier. With the recent belt-tightening over there it wouldn't surprise me if some bans are just their most expensive users.
MoffKalast@reddit
Can't they just double the price or something? They are already undercutting OpenAI.
YouAreRight007@reddit
Banning a user without clear reasoning is akin to theft. Even casinos are polite enough to ask you to leave and let you cash out.
martinerous@reddit
Yeah, they'd better introduce throttling instead of stupidly banning everyone for "too much use".
Statcat2017@reddit
You will find yourself being throttled after like two prompts.
Shot-Buffalo-2603@reddit
Yeah, most people don't realize it because it's tucked away behind API calls and generally aligns with the cost of other SaaS subscriptions, but you're remotely spinning up like $100k of GPUs for 3 hours a day for $20 a month 😭 this is basically stealing rn
inebriated_me@reddit
Real talk: why not just open a new account under a different email or something?
YouAreRight007@reddit
I agree. Do a quick cost benefit analysis considering how many months it takes to be banned and make the call.
getpodapp@reddit
They're blocking third party harnesses at the level of the system prompt. I was using 'opencode-claude-auth' which is meant to emulate claude code at the harness level but I still got soft-blocked.
pilibitti@reddit
It is not the email, it is the card you pay with. All your cards are tied to the same stable identity. Paid services know who the customer is unless you use someone else's card.
Kodix@reddit
So that he gets banned and loses money again? And what then, do it again?
I'm genuinely unsure why you think that's a good long-term solution.
Kranvagen@reddit
GLM 5.1
Cultural_Meeting_240@reddit
For your use case honestly you might not even need local models. Gemini 2.5 Pro is free right now and the reasoning is genuinely close to Claude. For the agentic coding side, Aider or Kilo Code with any strong API model gives you that terminal workflow with local file access. Pair Gemini API with Aider and you basically rebuild your whole setup for cheap.
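For reference, the Aider + Gemini pairing described above comes down to a couple of commands. A hedged sketch, assuming `pip install aider-chat` and an API key from Google AI Studio; the vault path and file name are made-up examples:

```shell
# Aider + Gemini sketch (paths and key are hypothetical placeholders).
export GEMINI_API_KEY="your-key-here"
cd ~/obsidian-vault                        # any git repo works, e.g. the synced vault
aider --model gemini/gemini-2.5-pro lesson-plans/week-12.md
```

Aider then edits the named files in place and commits changes to git, which fits the GitHub-synced vault workflow.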
deaglanh@reddit
When you say Gemini 2.5 Pro is free now... where? How?
That was without a doubt my favourite Google model, so I'd love to know how to get it for free.
homak666@reddit
Through their API, or just in Google AI Studio.
Practical-Trick3332@reddit
Had no idea; I thought you needed AI Pro and could only use it with their new model. Trying this when I get home. 🙏🏻
kiilkk@reddit
It's free, but you often run into outage errors, especially since the (mis-)use with openclaw.
Practical-Trick3332@reddit
I really only want to use a cloud model every once in a while to get a powerful cloud model to review the work of my tiny, stupid local ones, and look for bugs/errors I missed. 😅 Thanks for the heads-up.
jazir55@reddit
The default model is actually 3.0 flash now, which is better than 2.5 pro at coding.
redblood252@reddit
Is it doable to use gemini 2.5 pro with claude code? Is it better than glm 5.1?
Potential-Leg-639@reddit
Of course not
arsenale@reddit
There's an appeal process, and people regularly have their bans reversed; I saw one yesterday on x.com:
https://x.com/DrMarcNunes/status/2045729225173508296
Zealousideal-Check77@reddit
Personally, I love kimi, kimi k2.6 coding preview has been a good model for me. Is it as good as opus 4.6? No. But if you know what you are doing and not a vibe coder who just prompts and sleeps then it is insanely good, and the token quota is insanely generous for the $19 plan.
evia89@reddit
my take https://old.reddit.com/r/CLine/comments/1soscdr/agentic_ai_coder/ogw0i75/
WorthBathroom3268@reddit
I still haven't found a true 1:1 replacement for the Claude + Claude Code combo. The closest setups usually feel assembled rather than integrated: one tool for reasoning, another for terminal/repo work. If stability matters most, I'd optimize for the most boring reliable workflow you can trust with your local files, even if it feels slightly less magical.
marrabld@reddit
Opencode + github copilot + claude
YouAreRight007@reddit
You could just sign up again. Use a virtual card assuming your bank supports it.
I don't use Claude. Codex has been sufficient for my needs.
floridianfisher@reddit
Anthropic is nuts. They cut me off for no reason as well.
LumpyWelds@reddit
Did you routinely run out of tokens? I'm looking for a pattern.
Worried-Squirrel2023@reddit
got cut off once too, no warning no explanation. ended up running opencode with a mix of models depending on the task. GLM 5.1 for the reasoning heavy stuff, qwen 3.6 locally for anything that doesn't need frontier quality. it's actually a better setup than relying on one provider because you're never fully locked out again. the obsidian + github workflow you described works fine with opencode since it just reads local files directly.
Odd_Crab1224@reddit
Codex subscription with OpenCode. And if some other more attractive subscription/model appears you can easily switch without changing your dev setup
Lostinanidlemind@reddit
I personally couldn't find a CLI that did everything I wanted, so I built my own. It's a work in progress, but I can manage it myself and use API and local models when needed.
weiyong1024@reddit
Got burned by the same vendor lock-in problem recently, OpenAI added Cloudflare protection that killed Codex OAuth access overnight so my whole agent setup broke. Ended up switching to a multi-provider approach where each agent runs in its own Docker container through ClawFleet (github.com/clawfleet/ClawFleet) and I can swap providers per instance, OpenAI API for one, Google AI Studio free tier for another. Never depending on a single vendor's policy decisions again.
gagsgupta@reddit
Noob question: what's the best way to generate PPT/slides from Claude Code or any LLM? Any workflow/tool/prompt to get this done?
kesor@reddit
Use Amazon Kiro, it has the same claude models, and there is Kiro CLI that is quite good. Basic usage is free with an account with their builders thing, and for proper usage you'd want an AWS account with Q subscription (or whatever they renamed it to these days) that is also very cheap in comparison to other Claude model solutions.
Aromatic_Ad_7557@reddit
I'm not sure how many tokens you spent. I use Kiro; Opus is available there too, but not that many tokens.
quanhua92@reddit
I use Claude Code with GLM 5.1. I bought the yearly coding plan from z.ai last year, so it was cheap back then. Now, it's competitive, but it's getting expensive quickly. Qwen also has a coding plan, but it doesn't seem easy to purchase. You can also check Ollama Pro plan.
YaboiCucc@reddit
Is it good? I bought the lite coding plan in December for Xmas, but 4.7 was really shit: slow, and sometimes babbling Chinese. Has it changed? Does the plan matter?
quanhua92@reddit
It's not Opus-level, but you can iterate in plan mode until you find the right path. GLM 5.1 is pretty slow; I prefer glm-5-turbo. 4.7 isn't as good as 5, but you can use it or the Air model for the explorer agent.
So, yes. Make sure you do the planning, and GLM can do the job fine.
anyesh@reddit
I use Claude Code with Qwen 3.6. I've been using it on my personal projects, and it's pretty good for a local LLM with the goodness of the Claude Code CLI.
Brief-Persimmon-7037@reddit
OpenCode as the harness with OpenRouter as the gateway to different models, maybe?
Brief-Persimmon-7037@reddit
For simple tasks MiniMax M2.7 is quite capable at a fraction of Claude's cost.
For complex things (coding) I found Gemini 3.1 Pro very close to Opus, but I'm not sure how good or bad it is for your workflow. Also, I currently hate Google for their failed Antigravity integration.
SkillLevelAsia@reddit
OpenCode + GLM 5.1 is what I am testing. Seems about sonnet quality for my tasks.
rushBblat@reddit
what hardware would be used to even run that beast?
SkillLevelAsia@reddit
I am just using z.ai, no local hardware for that.
getpodapp@reddit
basically unfeasible to run yourself. use ollama cloud.
rushBblat@reddit
Yes, I also think it would take at least a 50k investment to make it run well.
ghostopera@reddit
I've been using OpenCode with GitHub Copilot as my model provider. (OpenCode can use just about anything as a model provider.)
OpenCode is very similar to the Claude Code as a harness, and with Copilot I have access to Opus 4.6, GPT 5.4, and etc.
I've also had a pretty good experience with OpenCode + Qwen 3.6 35B with LM Studio (local) as my provider on my 7900XTX.
Work pays for the Copilot account, so for doing personal stuff I've been using Qwen 3.6, occasionally moving to GPT5.4 on ChatGPT when I am needing a frontier model.
I'm really happy with the combination!
spidLL@reddit
I bet you know what you did.
lol-its-funny@reddit
If you still want to use them, best make another email address. Two can play that game.
If you think they deserve the 🖕, I've had good luck with OpenAI GPT5.4 extra-high. On the local llama side, that level isn't available, but Gemma 4 if you're space-constrained, or the Qwen 3.5+ MoEs, are strong, last I checked.
xXG0DLessXx@reddit
New account still works, but for how long? They already started doing identity verification. It’s only a matter of time before they ban the “person” rather than the account. Another reason to be against identity verification.
coder903@reddit
Codex will work but has a different feel to it. I’ve been building an OpenCode version using Kimi K2.5 as orchestrator/coder, GLM5.1 as planner/reviewer, QWEN3.6 Plus as explorer/researcher. OpenCode will take some configuration and tweaking. So my plan in your situation would be Codex but use it also to help you with configuring your OpenCode system. Then you will have a decent backup should something happen with Codex and also a little peace of mind. One more thing — Don’t do what I did at first and use GLM5.1 for everything. Although good. It will eat you up in costs and be way too slow.
Savantskie1@reddit
The reason you got banned is that they thought you were trying to distill from Claude. So instead of messaging you, they just banned you. Same old thing: you got use out of their stuff, but you didn't use it the way they wanted (in your case, education instead of strictly code), so they banned you to get rid of your "training" dataset. (I understand it most likely wasn't one.)
Statcat2017@reddit
I think they've just started culling users who cost more than they pay.
Those sob stories about people using $27k of compute on a $200 sub didn't come from nowhere.
I don't think they can just rate-limit you, because that would expose just how expensive LLMs are.
Texasaudiovideoguy@reddit
Use a diff email, refresh your wan ip. Problem solved. Done it twice
FederalAnalysis420@reddit
With Cline and OpenRouter I think you could kind of rebuild Claude Code in VSCode. OpenRouter still gives you Sonnet/Opus even if Anthropic banned you, and Cline handles the agentic file/terminal stuff the same way.
kiilkk@reddit
Ollama: launch Claude with glm5.1:cloud ($20 per month), very similar to your original setup. You can try Ollama's launch of Hermes as an alternative on the same account.
ReasonableBenefit47@reddit
Use Kimi, it's much better.
atharvat80@reddit
Opencode + GitHub Copilot, you can customise the config so that it uses Opus for planning/orchestrating and Sonnet for executing. Should give you pretty similar experience.
candreacchio@reddit
I've also been using it as:
- a personal assistant
- a social media / brand manager
Just remember, Claude Code is only supposed to be for coding purposes, not personal-assistant purposes?
LaRosarito@reddit
I use Codex, and it comes with a Solo Builder mode, and I'm super happy with it.
face_throne@reddit
For writing, Gemini; for reasoning, Codex 5.2 (non-Codex). Not local models, but you can use their paid plans.
voitiksde@reddit
I've tried to replace my Claude subscription with open-weight models, but as many have said, for me even GLM 5.1 wasn't close enough to compete. I enjoyed using GLM for planning and Qwen 3.5 to execute on the Ollama Pro plan, but I needed to babysit them much more than Claude (or even GPT). I'd recommend checking out Codex (GPT models don't feel like Claude, but for me they're the smartest for programming and reasoning) coupled with GitHub Copilot. Copilot is pay-per-request, so it's fine for implementing big specs, and you can switch between Claude / GPT (and others) just to test them out.
I personally switched from Claude Code, and I use Claude / GPT (GPT sub + GitHub Copilot), which costs $60 per month (saving $140 vs. Claude), and I could use it for development for a full month. Now there is Opus 4.7 with a higher multiplier on request usage, but 4.6 / 4.5 or Sonnet is still affordable there imo.
AndreasWolff@reddit
OpenCode Go? For GLM 5.1 + Zen for API access to Claude?
sergeialmazov@reddit
It’s always good to have an alternative plan
ThankThePhoenicians_@reddit
Try the GitHub Copilot CLI. Claude models are available via the subscription, as well as OpenAI models. You can also bring your own key/models that you host locally/anywhere else.
zdy132@reddit
It even supports openrouter, very easy to switch models with that.
Negative-Walrus-7490@reddit
Use crickcoder, fully integrated into VSCode (it's not an extension). It has Qwen 3.5 for free and, as big models, GLM 5.1 and Opus 3.7. Download it here: https://crickcoder.com/download
redditorialy_retard@reddit
Claude Code is open source now. Use Codex or Qwen 3.6 on those.
Last_Fig_5166@reddit
Right, what do you drink before commenting?
vhthc@reddit
The code of Claude Code was leaked and reimplemented in Python and Rust. It also shows how shitty the code quality and prompts of Claude Code are, which is no surprise when you look at CLI terminal benchmarks, where it sits at the bottom of the list.
Codex and Junie are better CLIs; Junie you can use with Anthropic too.
leonbollerup@reddit
Claude Code: the best replacements are warp.dev and OpenCode.
Claude (the AI interface) can be replaced by most things, but the actual best chat interface is Open WebUI. It's magic-good compared to everything else.
cyberspacecowboy@reddit
OpenCode up front, github copilot as provider in the back. Pick any model you like
leo-k7v@reddit
Anyone playing with llama-agent?
Responsible_Buy_7999@reddit
Cursor
unique-moi@reddit
You can keep right on using the Claude Code CLI: the desktop app can be used as the CLI front end to a non-Anthropic LLM. The two things necessary are that you set the environment variables (to give it the right URL and model name, and unset the API key) and that the URL speaks the Anthropic API (via a vLLM or oMLX model runner, or a LiteLLM proxy). Ask Google how to do it. You could, for example, point your Claude Code CLI at an OpenRouter subscription and use paid or free models, including Opus & Sonnet if you want.
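A minimal sketch of that environment-variable setup. The URL, token, and model name are placeholders; substitute whatever Anthropic-compatible endpoint you actually run:

```shell
# Placeholder values throughout; your proxy/runner defines the real ones.
unset ANTHROPIC_API_KEY                              # drop the real Anthropic key
export ANTHROPIC_BASE_URL="http://localhost:4000"    # e.g. a LiteLLM proxy or vLLM server
export ANTHROPIC_AUTH_TOKEN="local-placeholder-key"  # whatever key the proxy expects
export ANTHROPIC_MODEL="glm-5.1"                     # a model name the endpoint serves
# then launch `claude` as usual; the CLI now talks to your endpoint
```

Because it's just environment variables, you can keep a couple of shell aliases, one per backend, and switch providers per terminal session.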
blakok14@reddit
Opencode is the best open-source alternative; you can plug in tons of AI models, including local ones, and configure MCP, skills, etc. In fact, I like it more.
BidWestern1056@reddit
npcsh/incognide with ollama cloud
https://github.com/npc-worldwide/npcsh
https://github.com/npc-worldwide/npcpy