Experience with Qwen3.5-122B and Qwen3.6

Posted by Impossible_Car_3745@reddit | LocalLLaMA

I manage an on-premise LLM for my team on 2 x RTX Pro 6000.

I have switched from Qwen3.5-122B -> Qwen3.6-35B-A3B -> Qwen3.6-27B (today :) )

And the Qwen team does not lie about their benchmarks. My experience matched what they published.

1) performance: definitely Qwen3.5-122B < Qwen3.6-35B < Qwen3.6-27B
I have not tested its full knowledge base, and I do not clearly remember how good the old Opus was, but for my task requests Qwen3.6-27B was solid. It's very good.

2) speed and context with MTP & 2 x RTX Pro 6000 & FP8

- Qwen3.6-35B-A3B: 512k x 11 & 280 tps

- Qwen3.6-27B: 320k x 6 & 110 tps
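
For anyone who wants to compare the two setups side by side, here is a rough sketch of the arithmetic. It assumes "512k x 11" means a 512k-token context window with room for about 11 full-length requests in the KV cache, and that "k" is 1024 tokens. Both are my reading of the numbers, not something the figures state. The dictionary keys and the helper name are made up for illustration.

```python
# Assumption: "<N>k x <M>" = context window of N*1024 tokens with room for
# about M full-length requests in the KV cache. This is an interpretation.
K = 1024

configs = {
    "Qwen3.6-35B-A3B": {"context_k": 512, "concurrency": 11, "tps": 280},
    "Qwen3.6-27B": {"context_k": 320, "concurrency": 6, "tps": 110},
}


def total_kv_tokens(cfg):
    """Total tokens the KV cache would hold under the assumption above."""
    return cfg["context_k"] * K * cfg["concurrency"]


for name, cfg in configs.items():
    millions = total_kv_tokens(cfg) / 1e6
    print(f"{name}: ~{millions:.2f}M cached tokens, {cfg['tps']} tps")

m35 = configs["Qwen3.6-35B-A3B"]
m27 = configs["Qwen3.6-27B"]

# How much more cache capacity and speed the 35B-A3B setup has than the 27B one.
capacity_ratio = total_kv_tokens(m35) / total_kv_tokens(m27)
speed_ratio = m35["tps"] / m27["tps"]
print(f"capacity ratio: {capacity_ratio:.2f}x, speed ratio: {speed_ratio:.2f}x")
```

Under that reading, the 35B-A3B setup has a bit under 3x the cache capacity and about 2.5x the tps of the 27B setup on the same hardware, so the switch to 27B is a trade of throughput for quality.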