qwen3.6 performance jump is real, just make sure you have it properly configured

Posted by onil_gova@reddit | LocalLLaMA | View on Reddit | 300 comments


I've been running workloads I typically only trust Opus and Codex with, and I can confirm 3.6 is genuinely capable. It's not at the level of those models, of course, but it clearly crosses the barrier of usefulness, and the speed is impressive: on an M5 Max with 128 GB I'm seeing roughly 3K tok/s prompt processing and 100 tok/s generation.

Just ensure you have `preserve_thinking` turned on. Check out details [here](https://www.reddit.com/r/LocalLLaMA/s/oy3jLNbSkB).
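As a minimal sketch of what "turning it on" might look like, here is a hypothetical request payload for an OpenAI-compatible local server, with `preserve_thinking` passed as an extra body field. The field name comes from the post; whether your server accepts it at this level (versus a launch flag or chat-template option) is an assumption, so check the linked thread for your setup:

```python
import json

# Hypothetical sketch: pass `preserve_thinking` as an extra field on an
# OpenAI-compatible /v1/chat/completions request body. The exact placement
# and server support are assumptions, not a documented API.
payload = {
    "model": "qwen3.6",
    "messages": [
        {"role": "user", "content": "Refactor this function to be iterative."}
    ],
    # Keep prior reasoning traces in multi-turn context (flag name from the post)
    "preserve_thinking": True,
}

print(json.dumps(payload, indent=2))
```

The point of the flag is that the model's earlier thinking blocks stay in the conversation context across turns instead of being stripped, which the linked post credits for the quality difference.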