Running Claude Code + Codex + a local 35B as a three-tier agent stack. Why I stopped betting on one provider.
Posted by Joozio@reddit | LocalLLaMA | 3 comments
Two months ago I simplified to Claude Max 20x only. $200/month, one CLI, one model. For my autonomous agent it covered everything.
Last week I renewed Codex Pro at another $200/month after Opus 4.7 made my default workflow untenable. Now I run three tiers:
- Claude Code: tuned skills, Anthropic prompt caching (a 5x cost saver when it hits), familiar hooks
- Codex (GPT-5.4): better web search, deeper codebase maps, generous usage ceiling
- Local 35B on Mac Mini M4: classify / route / small-glue tasks, cheap preprocessing
My agent has a dynamic switcher. Same memory, same skills, same routing. Only the harness + model flip. Research-heavy work and architectural refactors go to Codex. Agent-tuned automation glue stays on Claude Code. Classification and routing go to the local 35B.
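A minimal sketch of what a switcher like this could look like, assuming a simple lookup from task kind to tier. The task labels, the `AgentContext` fields, and the fallback to Claude Code for unknown task kinds are all hypothetical; this is not the design from the write-up, and it builds a job description without calling any provider.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    CLAUDE_CODE = "claude-code"
    CODEX = "codex"
    LOCAL_35B = "local-35b"


# Hypothetical task labels: the post names the categories, not the identifiers.
ROUTES = {
    "research": Tier.CODEX,
    "architectural_refactor": Tier.CODEX,
    "automation_glue": Tier.CLAUDE_CODE,
    "classify": Tier.LOCAL_35B,
    "route": Tier.LOCAL_35B,
}


@dataclass(frozen=True)
class AgentContext:
    """State shared by every tier; only the harness and model change."""
    memory_path: str
    skills: tuple[str, ...]


def pick_tier(task_kind: str, default: Tier = Tier.CLAUDE_CODE) -> Tier:
    """Return the tier for a task kind, falling back to an assumed default."""
    return ROUTES.get(task_kind, default)


def dispatch(task_kind: str, prompt: str, ctx: AgentContext) -> dict:
    """Build a job description for the chosen tier (no provider is called)."""
    return {
        "tier": pick_tier(task_kind).value,
        "prompt": prompt,
        "memory_path": ctx.memory_path,
        "skills": list(ctx.skills),
    }
```

Keeping memory and skills in one context object is what lets the same job move between tiers: the routing table decides where it runs, and nothing else about the job changes.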
The reason for the pivot off Claude-only: Opus 4.7 at default effort is measurably lazier than 4.6. AMD's Senior Director of AI (Stella Laurenzo) filed GitHub #42796 with her team's 6,852-session analysis. The Read:Edit ratio dropped 70%, and API requests rose 80x for worse output. The honest caveat: 4.7 at max reasoning still delivers, but max burns 3-4x more tokens, so my weekly ceiling arrives on Tuesday instead of Friday.
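The Tuesday-instead-of-Friday figure can be sanity-checked with back-of-the-envelope arithmetic. This sketch assumes a fixed weekly token budget, even daily usage, a week that starts on Monday, and every task running at max effort; none of those are stated in the post, and `days_until_ceiling` is a hypothetical helper.

```python
def days_until_ceiling(baseline_days: float, burn_multiplier: float) -> float:
    """Days a fixed budget lasts when token burn scales by burn_multiplier."""
    if burn_multiplier <= 0:
        raise ValueError("burn_multiplier must be positive")
    return baseline_days / burn_multiplier


# Assumption: at default effort the budget lasts Monday through Friday.
BASELINE_DAYS = 5.0

for multiplier in (3.0, 4.0):
    days = days_until_ceiling(BASELINE_DAYS, multiplier)
    print(f"{multiplier:.0f}x burn: budget lasts {days:.2f} days")
# 3x burn: budget lasts 1.67 days
# 4x burn: budget lasts 1.25 days
```

Under those assumptions the budget runs out partway through the second working day, which matches hitting the ceiling on Tuesday.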
Local LLMs are my cheap third lane, not a substitute for the frontier. A 35B at Q4 is good enough for classification and small glue, not for architectural refactors.
Net: $300/month in subscriptions plus negligible local cost. Roughly 2x the weekly throughput I had a month ago.
Anyone else running a multi-provider stack with a local preprocessing tier? Curious what splits you have settled on.
Full write-up and the switcher design: https://thoughts.jock.pl/p/opus-4-7-codex-comeback-2026
LocalLLaMA-ModTeam@reddit
Rule 3 - Minimal value post. slop
ps5cfw@reddit
this is local only and you are a goddamn ai slop bot.
OneSlash137@reddit
I know you all think that’s cool and all? But that doesn’t make things better. Each agent is another point of failure…
All it does is make hallucinations exponentially worse.