What agentic CLI do you use for local models?

Posted by siegevjorn@reddit | LocalLLaMA | View on Reddit | 12 comments

Title says it all: are there any notable differences among them? I know Claude Code is the industry standard, OpenCode is probably the most popular open-source project, and there's also Crush from Charm. Can gemini-cli and Claude Code run against local models? My plan is to spin up a llama.cpp server and point the CLI at its endpoint.
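For the llama.cpp route, here's a minimal sketch of what I have in mind. The model file, port, and context size are placeholders, and whether a given CLI honors the standard OpenAI env vars is an assumption that probably varies per tool:

```shell
# Minimal sketch, assuming a llama.cpp build that includes llama-server.
# The model path and port are placeholders.
llama-server -m ./models/my-model.gguf --port 8080 --ctx-size 32768

# llama-server exposes an OpenAI-compatible API under /v1, so a CLI that
# reads the usual OpenAI environment variables could, in principle, use it:
export OPENAI_BASE_URL="http://localhost:8080/v1"
export OPENAI_API_KEY="none"   # llama-server doesn't require a real key by default
```

Some CLIs want this in a provider config file instead of env vars, so the exact wiring depends on the tool.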

Also, has anyone had luck with open-weight models for agentic tasks? How do Qwen3.5 / Gemma4 compare to Sonnet? Is gpt-oss-120b still the balance king, or has it been overtaken by Qwen3.5 / Gemma4? And I wonder whether 10-20 tok/s is enough for running agents.

Finally, for those of you who use both Claude and local models: what sorts of tasks do you hand off to the local models?