Qwen3.5 27B refuses to stop thinking

Posted by liftheavyscheisse@reddit | LocalLLaMA | 30 comments

I've tried --chat-template-kwargs '{"enable_thinking": false}' and its successor --reasoning off in llama-server. Both work for other models (I've used them successfully with several Qwen and Nemotron models), but neither has any effect on Qwen3.5 27B.
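For reference, the invocations look roughly like this (the model filename is a placeholder for whatever quant you're running; flags are as described above):

```shell
# Older approach: pass a kwarg through to the chat template
llama-server -m ./Qwen3.5-27B.gguf \
  --chat-template-kwargs '{"enable_thinking": false}'

# Newer approach: the dedicated reasoning flag
llama-server -m ./Qwen3.5-27B.gguf --reasoning off
```

Same result either way: the model reasons anyway.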

It just thinks anyway (without emitting an opening `<think>` tag, but it still closes its reasoning with `</think>`).

Anybody else have this problem / know how to solve it?

llama.cpp b8295