The more I use it, the more I'm impressed

Posted by ComfyUser48@reddit | LocalLLaMA | View on Reddit | 94 comments

Qwen 3.6 27b vs Codex GPT 5.5 / Claude Opus 4.7

My local llm discovered a bug that they both missed

And it turns out it's critical

GPT 5.5 and Claude both stood their ground and didn't give up until the end - they claimed to be right all along.

I told my Qwen to provide detailed proof of his arguments, brought the evidance to both of them, and only then came their admission.

Qwen 3.6 27b thinks a lot. That can be both a good and a bad thing. In this case, the long thinking actually discovered a bug neither of the frontier models couldn't find.

GPT 5.5 is FAST. Really fast. But in reality as I found out, it comes with a big tradeoff.

[GPT 5.5 admission](

[Claude Opus 4.7 admission](