Qwen models for coding, using qwen-code - my experience

Posted by Undici77@reddit | LocalLLaMA

Hi all,

For more than three months I've been using the Qwen Code CLI and Qwen models for my daily coding (C and C++ in the embedded world), and they are pretty good for easy tasks.

My setup is:

- MacBook Pro M4 Max, 128 GB
- LM Studio or oMLX
- Qwen‑Code
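For anyone wanting to try a similar setup: Qwen Code talks to any OpenAI-compatible endpoint via environment variables, and LM Studio's local server exposes one (by default on port 1234). Roughly like this — the model identifier is just a placeholder for whatever you've actually loaded:

```shell
# Start LM Studio's local server first (OpenAI-compatible API,
# by default at http://localhost:1234/v1), then point Qwen Code at it:
export OPENAI_BASE_URL="http://localhost:1234/v1"
export OPENAI_API_KEY="lm-studio"        # any non-empty string works for a local server
export OPENAI_MODEL="qwen3-coder-30b"    # placeholder: use the model ID LM Studio shows

qwen   # launch Qwen Code; requests now go to the local server
```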

I started with Qwen3‑Coder‑30B, then switched to Qwen‑Coder‑Next‑80B, and now I'm trying the new 3.5 and 3.6 models (from 27B to 122B).

What drives me crazy is that on paper 3.5/3.6 should be better than 3 (30B and 80B Next), but this is absolutely not true! In single‑shot scenarios they sometimes come out ahead (mostly on HTML benchmarks), but for long, difficult tasks, especially when using the MCP tools available in the Qwen Code CLI, Qwen 3 works better than Qwen 3.5/3.6.

In general, Qwen‑3 uses the MCP tools more effectively than Qwen‑3.5/3.6, which often fall into an infinite thinking loop.

I've tried different MLX quantizations (4/8/16-bit, oQ formats, Unsloth) with various parameter settings, but nothing helps!

This is very strange and unexpected! Has anyone else experienced the same issue?