Closest replacement for Claude + Claude Code? (got banned, no explanation)
Posted by antoniocorvas@reddit | LocalLLaMA | View on Reddit | 103 comments
I was using Claude Pro + Claude Code pretty heavily (terminal workflow, file access, etc.) and my account just got banned with zero explanation.
From what I’m seeing, this isn’t that uncommon — people getting flagged without clear reasons or support responses — so I’m trying to move on and rebuild my setup.
What I’m looking for is something that actually matches BOTH sides of what Claude gave me:
1. Claude-level reasoning / writing
- strong long-form thinking
- structured outputs (planning, creative work, etc.)
2. Claude Code-style workflow
- terminal / CLI interaction
- ability to work with local files or repos
- feels like an “agent” that can execute tasks, not just chat
I’ve tried ChatGPT (even the $20 Plus + Codex), and while it’s good, it doesn’t have the same feel or workflow — especially on the terminal / agent side.
My actual use case:
- lesson planning + building slides/materials (high school teaching)
- content creation + branding (IG, captions, concepts)
- DJ + music workflow (set planning, ideas, organization)
- working out of an Obsidian vault synced via GitHub
- occasionally generating visuals (images, HTML mockups) and analyzing screenshots
Ideally also:
- works with an Obsidian vault or local knowledge base
- stable (no sketchy plugins or risk of getting banned again)
- okay with paid tools (~$20/mo range)
For people who were actually using Claude + Claude Code:
what are you using now that comes closest in real workflows?
Not looking for theoretical answers, more interested in setups you’re actually using day-to-day.
DeepBlue96@reddit
Cloud-based: QwenCode and a GitHub Copilot sub, I guess.
Local setup:
As a VSCode plugin: Roo Code; as a CLI: QwenCode.
Local models: Qwen 3.5-35B-A3B or Qwen 3.6-35B-A3B; for a small system, maybe Qwen 3.5 9B.
rainbyte@reddit
Everybody is suggesting the biggest frontier models available or accounts on other cloud providers...
But, in case you are interested in going local (this is r/localllama), which hardware do you have? Do you have a gpu? We can recommend you a model compatible with your hardware.
If you have a gpu you can run a model locally and have some level of independence from cloud models.
troop99@reddit
And if the hardware is just a gaming setup with a GTX 5090 Ti, what would be a good model?
popiazaza@reddit
Qwen 3.6
micseydel@reddit
I recently rejoined the sub, and it's wild seeing all the discussion about stuff that isn't local. I was just thinking about leaving when I saw your comment. If anybody knows of places with an actual focus on local models, I'd love to join.
ttkciar@reddit
Right now the closest model to Claude Opus is GLM-5.1, which is slightly more competent than Sonnet for codegen but slightly less than Opus.
IDoButtStuffs@reddit
What sort of hardware would even be required to run this lol
lostnuclues@reddit
10 to 12k USD: buy 10 Intel Arc B70s. At Q4 it will fit completely in VRAM.
kitanokikori@reddit
You'd use it via OpenRouter and OpenCode, you wouldn't build a rig for this yourself
pulse77@reddit
Prepare 50,000 USD to 100,000 USD...
Kranvagen@reddit
One server graphics card is about 30,000 USD.
sk1kn1ght@reddit
What do you mean? It's a subscription for 99% of the people using it, like all the other SOTA models.
IDoButtStuffs@reddit
You can run it locally
https://huggingface.co/zai-org/GLM-5.1
sk1kn1ght@reddit
Oh yes I know. I am just lacking about 600k in equipment.
Due-Project-7507@reddit
GLM-5 runs in NVFP4 on 6 RTX Pro 6000 Blackwells with a combination of tensor and pipeline parallelism. The problem is that the code paths for this in SGLang and vLLM are not really stable; only a few people use this configuration and report/fix bugs for it. Last February it did not run with vLLM at all, and with SGLang I had quality problems. I don't know if these bugs are fixed now, because at the moment we need the RTX Pro 6000 GPUs for a project, so I cannot test it.
Superb_Onion8227@reddit
Do you have any idea of the Wh/token you reach on that setup (at ~0 and 10,000t)?
GLM-5 can't optimize itself yet either, haha? I feel like you could have a channel of people with similar setups who just share code.
wie_witzig@reddit
About 10 B200s connected with NVLink for the model, hundreds of GB of RAM, and a distributed inference stack.
Karyo_Ten@reddit
8x RTX Pro 6000, so instead of leasing for a Tesla ...
ninjainvasion@reddit
is there any way to use both Claude Opus along with GLM 5.1 in Claude Code?
rgar132@reddit
go-llm-proxy makes it pretty painless if you want a native-type experience with web search and server tooling in Claude Code. I've been running the CC harness with MiniMax M2.7 locally and I'm pretty happy with it. The best setup without Anthropic is probably GLM-5.1 as Opus and MiniMax as Sonnet or Haiku, and you'll get a lot done (use their config generator to try it).
hellobritishcolumbia@reddit
There are easy wrappers like the one from Ollama if you want to try it out. I use LiteLLM personally and it’s solid for standardizing across different API formats from each provider.
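As a rough illustration of the wrapper idea: a LiteLLM proxy can sit in front of any provider and expose one OpenAI-compatible endpoint. This is a sketch, not a recipe from the thread; the model string and port are placeholder assumptions.

```shell
# Minimal sketch (assumes `pip install 'litellm[proxy]'` is done).
# One command exposes a provider behind an OpenAI-compatible endpoint.
litellm --model openrouter/z-ai/glm-5.1 --port 4000
# clients then point at http://localhost:4000 instead of the provider's API
```

Swapping providers later then only means changing the `--model` argument, not your client tooling.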
ZireaelStargaze@reddit
There is a --model flag that lets you define an extra third-party model. Or if you use a proxy like LiteLLM for routing models, you can have as many as you want.
Quango2009@reddit
Try GitHub Copilot if local LLMs don't work out. You can still access Sonnet, Opus, etc., and may find it cheaper. However, there are still limits, so depending on your usage you might still have problems.
localizeatp@reddit
Anthropic's target audience is whole FAANG companies; they don't care about us anymore.
ServiceOver4447@reddit
They never liked subsidizing all plans that aren't enterprise.
EenyMeanyMineyMoo@reddit
They lose money on every price plan at every tier. With the recent belt-tightening over there it wouldn't surprise me if some bans are just their most expensive users.
MoffKalast@reddit
Can't they just double the price or something? They are already undercutting OpenAI.
YouAreRight007@reddit
Banning a user without clear reasoning is akin to theft. Even casinos are polite enough to ask you to leave and let you cash out.
martinerous@reddit
Yeah, they'd better introduce throttling instead of stupidly banning everyone for "too much use".
Statcat2017@reddit
You will find yourself being throttled after like two prompts.
Shot-Buffalo-2603@reddit
Yeah, most people don't realize it because it's tucked away behind API calls and generally aligns with the cost of other SaaS subscriptions, but you're remotely spinning up like $100k of GPUs for 3 hours a day for $20 a month 😭 this is basically stealing rn
inebriated_me@reddit
Real talk: why not just open a new account under a different email or something?
YouAreRight007@reddit
I agree. Do a quick cost benefit analysis considering how many months it takes to be banned and make the call.
getpodapp@reddit
They're blocking third party harnesses at the level of the system prompt. I was using 'opencode-claude-auth' which is meant to emulate claude code at the harness level but I still got soft-blocked.
pilibitti@reddit
It is not the email, it is the card you pay with. All your cards are tied to the same stable identity. Paid services know who the customer is unless you use someone else's card.
Kodix@reddit
So that he gets banned and loses money again? And what then, do it again?
I'm genuinely unsure why you think that's a good long-term solution.
Kranvagen@reddit
GLM 5.1
Cultural_Meeting_240@reddit
For your use case honestly you might not even need local models. Gemini 2.5 Pro is free right now and the reasoning is genuinely close to Claude. For the agentic coding side, Aider or Kilo Code with any strong API model gives you that terminal workflow with local file access. Pair Gemini API with Aider and you basically rebuild your whole setup for cheap.
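For reference, the Aider + Gemini pairing described above comes down to a couple of commands. A hedged sketch, assuming `pip install aider-chat` and an API key from Google AI Studio; the vault path and file name are made-up examples:

```shell
# Aider + Gemini sketch (paths and key are hypothetical placeholders).
export GEMINI_API_KEY="your-key-here"
cd ~/obsidian-vault                        # any git repo works, e.g. the synced vault
aider --model gemini/gemini-2.5-pro lesson-plans/week-12.md
```

Aider then edits the named files in place and commits changes to git, which fits the GitHub-synced vault workflow.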
deaglanh@reddit
When you say Gemini 2.5 Pro is free now... where? How?
That was without a doubt my favourite Google model, so I'd love to know how to get it for free.
homak666@reddit
Through their API, or just in Google AI Studio.
Practical-Trick3332@reddit
Had no idea; I thought you needed AI Pro and could only use it with their new model. Trying this when I get home. 🙏🏻
kiilkk@reddit
It's free, but you often run into outage errors, especially since the (mis-)use with openclaw.
Practical-Trick3332@reddit
I really only want to use a cloud model every once in a while to get a powerful cloud model to review the work of my tiny, stupid local ones, and look for bugs/errors I missed. 😅 Thanks for the heads-up.
jazir55@reddit
The default model is actually 3.0 flash now, which is better than 2.5 pro at coding.
redblood252@reddit
Is it doable to use gemini 2.5 pro with claude code? Is it better than glm 5.1?
Potential-Leg-639@reddit
Of course not
arsenale@reddit
There's an appeal process, and people regularly have their bans reversed; I saw one yesterday on x.com:
https://x.com/DrMarcNunes/status/2045729225173508296
Zealousideal-Check77@reddit
Personally, I love kimi, kimi k2.6 coding preview has been a good model for me. Is it as good as opus 4.6? No. But if you know what you are doing and not a vibe coder who just prompts and sleeps then it is insanely good, and the token quota is insanely generous for the $19 plan.
evia89@reddit
my take https://old.reddit.com/r/CLine/comments/1soscdr/agentic_ai_coder/ogw0i75/
WorthBathroom3268@reddit
I still haven't found a true 1:1 replacement for the Claude + Claude Code combo. The closest setups usually feel assembled rather than integrated: one tool for reasoning, another for terminal/repo work. If stability matters most, I'd optimize for the most boring reliable workflow you can trust with your local files, even if it feels slightly less magical.
marrabld@reddit
Opencode + github copilot + claude
YouAreRight007@reddit
You could just sign up again. Use a virtual card assuming your bank supports it.
I don't use Claude. Codex has been sufficient for my needs.
floridianfisher@reddit
Anthropic is nuts. They cut me off for no reason as well.
LumpyWelds@reddit
Did you routinely run out of tokens? I'm looking for a pattern.
Worried-Squirrel2023@reddit
got cut off once too, no warning no explanation. ended up running opencode with a mix of models depending on the task. GLM 5.1 for the reasoning heavy stuff, qwen 3.6 locally for anything that doesn't need frontier quality. it's actually a better setup than relying on one provider because you're never fully locked out again. the obsidian + github workflow you described works fine with opencode since it just reads local files directly.
Odd_Crab1224@reddit
Codex subscription with OpenCode. And if some other more attractive subscription/model appears you can easily switch without changing your dev setup
Lostinanidlemind@reddit
I personally couldn't find a CLI that did everything I wanted, so I built my own. It's a work in progress, but I can manage it myself and use API and local models when needed.
weiyong1024@reddit
Got burned by the same vendor lock-in problem recently, OpenAI added Cloudflare protection that killed Codex OAuth access overnight so my whole agent setup broke. Ended up switching to a multi-provider approach where each agent runs in its own Docker container through ClawFleet (github.com/clawfleet/ClawFleet) and I can swap providers per instance, OpenAI API for one, Google AI Studio free tier for another. Never depending on a single vendor's policy decisions again.
gagsgupta@reddit
Noob question: what's the best way to generate PPT/slides from Claude Code or any LLM? Any workflow/tool/prompt to get this done?
kesor@reddit
Use Amazon Kiro, it has the same claude models, and there is Kiro CLI that is quite good. Basic usage is free with an account with their builders thing, and for proper usage you'd want an AWS account with Q subscription (or whatever they renamed it to these days) that is also very cheap in comparison to other Claude model solutions.
Aromatic_Ad_7557@reddit
I'm not sure how many tokens you spent. I use Kiro; Opus is available there too, but not that many tokens.
quanhua92@reddit
I use Claude Code with GLM 5.1. I bought the yearly coding plan from z.ai last year, so it was cheap back then. Now, it's competitive, but it's getting expensive quickly. Qwen also has a coding plan, but it doesn't seem easy to purchase. You can also check Ollama Pro plan.
YaboiCucc@reddit
Is it good? I bought the lite coding plan in December for Xmas, but 4.7 was really shit: slow, and sometimes babbling Chinese. Has it changed? Does the plan matter?
quanhua92@reddit
It's not Opus-level, but you can iterate in plan mode until you find the right path. GLM 5.1 is pretty slow; I prefer glm-5-turbo. 4.7 isn't as good as 5, but you can use it or the Air model for the explorer agent.
So, yes. Make sure you do the planning, and GLM can do the job fine.
anyesh@reddit
I use Claude Code with Qwen 3.6. I've been using it on my personal projects, and it's pretty good for a local LLM with the goodness of the Claude Code CLI.
Brief-Persimmon-7037@reddit
OpenCode as the harness with OpenRouter as the gateway to different models, maybe?
Brief-Persimmon-7037@reddit
For simple tasks MiniMax M2.7 is quite capable at a fraction of Claude's cost.
For complex things (coding) I found Gemini 3.1 Pro very close to Opus, but I'm not sure how good or bad it is for your workflow. Also, I currently hate Google for their failed Antigravity integration.
SkillLevelAsia@reddit
OpenCode + GLM 5.1 is what I am testing. Seems about sonnet quality for my tasks.
rushBblat@reddit
what hardware would be used to even run that beast?
SkillLevelAsia@reddit
I am just using z.ai, no local hardware for that.
getpodapp@reddit
basically unfeasible to run yourself. use ollama cloud.
rushBblat@reddit
Yes, I also think it would take at least a 50k investment to make it run well.
ghostopera@reddit
I've been using OpenCode with GitHub Copilot as my model provider. (OpenCode can use just about anything as a model provider.)
OpenCode is very similar to the Claude Code as a harness, and with Copilot I have access to Opus 4.6, GPT 5.4, and etc.
I've also had a pretty good experience with OpenCode + Qwen 3.6 35B with LM Studio (local) as my provider on my 7900XTX.
Work pays for the Copilot account, so for doing personal stuff I've been using Qwen 3.6, occasionally moving to GPT5.4 on ChatGPT when I am needing a frontier model.
I'm really happy with the combination!
spidLL@reddit
I bet you know what you did.
lol-its-funny@reddit
If you still want to use them, best make another email address. Two can play that game.
If you think they deserve the 🖕, I've had good luck with OpenAI GPT5.4 extra-high. On the local llama side, that level isn't available, but Gemma 4 if you're space-constrained, or the Qwen 3.5+ MoEs, are strong, last I checked.
xXG0DLessXx@reddit
New account still works, but for how long? They already started doing identity verification. It’s only a matter of time before they ban the “person” rather than the account. Another reason to be against identity verification.
coder903@reddit
Codex will work but has a different feel to it. I’ve been building an OpenCode version using Kimi K2.5 as orchestrator/coder, GLM5.1 as planner/reviewer, QWEN3.6 Plus as explorer/researcher. OpenCode will take some configuration and tweaking. So my plan in your situation would be Codex but use it also to help you with configuring your OpenCode system. Then you will have a decent backup should something happen with Codex and also a little peace of mind. One more thing — Don’t do what I did at first and use GLM5.1 for everything. Although good. It will eat you up in costs and be way too slow.
Savantskie1@reddit
The reason you got banned is that they thought you were trying to distill from Claude. So instead of messaging you, they just banned you. Same old thing: you got use out of their stuff, but you didn't use it the way they wanted (in your case, education instead of strictly code), so they banned you to get rid of your "training" dataset. (I understand it most likely wasn't one.)
Statcat2017@reddit
I think they've just started culling users who cost more than they pay.
Those sob stories about people using $27k of compute on a $200 sub didn't come from nowhere.
I don't think they can just rate-limit you, because that would expose just how expensive LLMs are.
Texasaudiovideoguy@reddit
Use a diff email, refresh your wan ip. Problem solved. Done it twice
FederalAnalysis420@reddit
With Cline and OpenRouter I think you could kind of rebuild Claude Code in VSCode. OpenRouter still gives you Sonnet/Opus even if Anthropic banned you, and Cline handles the agentic file/terminal stuff the same way.
kiilkk@reddit
Ollama: launch Claude with glm5.1:cloud ($20 per month), very similar to your original setup. You can try Ollama's launch of Hermes as an alternative on the same account.
ReasonableBenefit47@reddit
Use Kimi, it's much better.
atharvat80@reddit
Opencode + GitHub Copilot, you can customise the config so that it uses Opus for planning/orchestrating and Sonnet for executing. Should give you pretty similar experience.
candreacchio@reddit
I've also been using it as:
- a personal assistant
- a social media / brand manager
Just remember, Claude Code is only supposed to be for coding purposes, not personal-assistant purposes?
LaRosarito@reddit
I use Codex, and it comes with a Solo Builder mode, and I'm super happy with it.
face_throne@reddit
For writing, Gemini; for reasoning, Codex 5.2 (non-Codex). Not local models, but you can use their paid plans.
voitiksde@reddit
I've tried to replace my Claude subscription with open-weight models, but as many have said, for me even GLM 5.1 wasn't close enough to compete. I enjoyed using GLM for planning and Qwen 3.5 to execute on the Ollama Pro plan, but I needed to babysit them much more than Claude (or even GPT). I'd recommend checking out Codex (GPT models don't feel like Claude, but for me they're the smartest for programming and reasoning) coupled with GitHub Copilot. Copilot is pay-per-request, so it's fine for implementing big specs, and you can switch between Claude / GPT (and others) just to test them out.
I personally switched from Claude Code, and I use Claude / GPT (GPT sub + GitHub Copilot), which costs $60 per month (saving $140 vs. Claude), and I could use it for development for a full month. Now there is Opus 4.7 with a higher multiplier on request usage, but 4.6 / 4.5 or Sonnet is still affordable there imo.
AndreasWolff@reddit
OpenCode Go? For GLM 5.1 + Zen for API access to Claude?
sergeialmazov@reddit
It’s always good to have an alternative plan
ThankThePhoenicians_@reddit
Try the GitHub Copilot CLI. Claude models are available via the subscription, as well as OpenAI models. You can also bring your own key/models that you host locally/anywhere else.
zdy132@reddit
It even supports openrouter, very easy to switch models with that.
Negative-Walrus-7490@reddit
Use crickcoder, fully integrated into VSCode (it's not an extension). It has Qwen 3.5 for free and, as big models, GLM 5.1 and Opus 3.7. Download it here: https://crickcoder.com/download
redditorialy_retard@reddit
Claude Code is open source now. Use Codex or Qwen 3.6 on those.
Last_Fig_5166@reddit
Right, what do you drink before commenting?
vhthc@reddit
The code of Claude Code was leaked and reimplemented in Python and Rust. It also shows how shitty the code quality and prompts of Claude Code are, which is no surprise when you look at CLI terminal benchmarks, where it sits at the bottom of the list.
Codex and Junie are better CLIs; Junie you can use with Anthropic too.
leonbollerup@reddit
Claude Code: the best replacements are warp.dev and OpenCode.
Claude (the AI interface) can be replaced by most things, but the actual best chat interface is Open WebUI. It's magic-good compared to everything else.
cyberspacecowboy@reddit
OpenCode up front, github copilot as provider in the back. Pick any model you like
leo-k7v@reddit
Anyone playing with llama-agent?
Responsible_Buy_7999@reddit
Cursor
unique-moi@reddit
You can keep right on using the Claude Code CLI: the desktop app can be used as the CLI front end to a non-Anthropic LLM. The two things necessary are that you set the environment variables (to give it the right URL and model name, and unset the API key) and that the URL speaks the Anthropic API (via a vLLM or oMLX model runner, or a LiteLLM proxy). Ask Google how to do it. You could, for example, point your Claude Code CLI at an OpenRouter subscription and use paid or free models, including Opus & Sonnet if you want.
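A minimal sketch of that environment-variable setup. The URL, token, and model name are placeholders; substitute whatever Anthropic-compatible endpoint you actually run:

```shell
# Placeholder values throughout; your proxy/runner defines the real ones.
unset ANTHROPIC_API_KEY                              # drop the real Anthropic key
export ANTHROPIC_BASE_URL="http://localhost:4000"    # e.g. a LiteLLM proxy or vLLM server
export ANTHROPIC_AUTH_TOKEN="local-placeholder-key"  # whatever key the proxy expects
export ANTHROPIC_MODEL="glm-5.1"                     # a model name the endpoint serves
# then launch `claude` as usual; the CLI now talks to your endpoint
```

Because it's just environment variables, you can keep a couple of shell aliases, one per backend, and switch providers per terminal session.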
blakok14@reddit
Opencode is the best open-source alternative; you can plug in tons of AI models, including local ones, and configure MCP, skills, etc. In fact, I like it more.
BidWestern1056@reddit
npcsh/incognide with ollama cloud
https://github.com/npc-worldwide/npcsh
https://github.com/npc-worldwide/npcpy