Your experiences in the wild with Kimi K2.6 vs. other open source models
Posted by itsstroom@reddit | LocalLLaMA | View on Reddit | 7 comments
I use Kimi K2.6 through Opencode Go and it tends to reason too long about trivial tasks and burns tokens like there's no tomorrow. Is it just me, or does this model shine in benchmarks but isn't actually that good after all? I still use GLM5 for daily tasks in my homelab and it works really well.
DependentBat5432@reddit
Kimi burns tokens on simple stuff but handles the hard ones surprisingly well. I use it for heavy tasks and let cheaper models handle the rest; saves a ton for me.
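The split described above (heavy model for hard tasks, cheap model for everything else) can be sketched as a tiny router. Everything here is hypothetical: the model names, the keyword heuristic, and the threshold are illustrative stand-ins, not a real API.

```python
# Minimal sketch of the heavy-vs-cheap routing idea.
# Model names and the difficulty heuristic are hypothetical, not real endpoints.

HEAVY_MODEL = "kimi-k2.6"  # expensive, strong on hard tasks
CHEAP_MODEL = "glm-5"      # cheap, fine for routine work

def estimate_difficulty(task: str) -> int:
    """Crude heuristic: count keywords that suggest a hard task."""
    hard_markers = ("refactor", "debug", "architecture", "race condition", "prove")
    return sum(marker in task.lower() for marker in hard_markers)

def route(task: str, threshold: int = 1) -> str:
    """Send tasks at or above the difficulty threshold to the heavy model."""
    return HEAVY_MODEL if estimate_difficulty(task) >= threshold else CHEAP_MODEL
```

In practice the heuristic is the hard part; people also use a cheap model itself as the classifier, but a keyword filter like this is enough to stop trivial renames from hitting the expensive model.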
Academic-Map268@reddit
Kimi K2.6 agent is INSANE for deep research.
It thinks for 15 minutes, does hundreds of searches and comes back with nice charts and graphs.
Lost-Health-8675@reddit
I downloaded it. Now it sits on my hard drive, waiting for better times when I'll have a machine that can run it.
Joozio@reddit
Token burn on trivial tasks is something I keep hitting with closed models too. Opus 4.7 fires roughly 80x more requests than 4.6 for similar work, according to AMD's GitHub analysis, and the new tokenizer adds about 35 percent overhead on identical files. Same shape as what you're seeing with K2.6, just a different bill. GLM5 has been my local classifier of choice for a few weeks; agree it punches above its weight. Haven't run K2.6 long enough to call it, though.
korino11@reddit
Kimi doesn't burn tokens at all. On the Coding plan you pay per API call... feel the difference!
Unable-Jelly6228@reddit
GLM 5.1 to build, Kimi to debug. That's what I use.
RepulsiveRaisin7@reddit
Tried it briefly with Ollama Cloud and was also not impressed. Overthinks and fucks up just the same as GLM.