The definitive Qwen 3.5 Jinja template
Posted by ex-arman68@reddit | LocalLLaMA | 30 comments
I’ve been doing a pretty thorough deep dive into the Qwen 3.5 templating logic to properly fix the lingering tool calling bugs. People here have done some really brilliant groundwork; templates from folks like @pneuny and @ellary were absolute lifesavers early on. But I realised that a lot of them rely on forced prompt injections, or accidentally hallucinate the XML formatting (Qwen is actually trained natively on pure <think> tags, not the /* syntax some older templates fall back to).
So after many hours of researching and testing all the known problems with the official Qwen template, I carefully wrote the best template I could. It respects the native XML schema exactly, dynamically maps the newer 'developer' role strings from modern API clients, and safely caches empty tool parameters.
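For anyone curious what the 'developer' role mapping means in practice, here is a minimal Python sketch of the idea (the actual fix lives in the Jinja template; the function and message shapes below are hypothetical illustrations, not code from the template):

```python
# Hypothetical Python analogue of the role-mapping step: newer API clients
# send a "developer" role that the model was not trained on, so it gets
# remapped to "system" before the prompt is rendered.
def normalize_roles(messages):
    """Remap 'developer' messages to 'system' so the model sees a known role."""
    return [
        {**msg, "role": "system"} if msg.get("role") == "developer" else msg
        for msg in messages
    ]

messages = [
    {"role": "developer", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi"},
]
print(normalize_roles(messages)[0]["role"])  # -> system
```

In the Jinja template itself this is just a conditional on `message.role`, but the effect is the same: the rendered prompt never contains a role string the model has not seen.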
Just as a side note for anyone specifically using LM Studio: the backend throws an error over Python |items dict iterators, and the regex parser completely borks if the model merely ponders a tool call inside its thoughts. I’ve integrated targeted fixes for this into the Jinja too. If you write <|think_off|> anywhere inside your prompt (either system or user), the template invisibly scrubs the tag and hard-disables thinking for that turn, completely bypassing the infinite loop tool bug.
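The <|think_off|> behaviour can be sketched in Python like this (a hypothetical illustration of the logic; the real implementation is Jinja string handling inside the template):

```python
# Hypothetical sketch of the <|think_off|> handling described above:
# the tag is stripped from the message text and a per-turn flag
# disables thinking for that turn.
THINK_OFF = "<|think_off|>"

def scrub_think_off(content):
    """Return (cleaned_content, thinking_enabled) for one message."""
    if THINK_OFF in content:
        return content.replace(THINK_OFF, "").strip(), False
    return content, True

cleaned, thinking = scrub_think_off("Summarize this file. <|think_off|>")
print(cleaned, thinking)  # -> Summarize this file. False
```

The key point is that the tag never reaches the model: it only flips the template's internal thinking flag for that turn.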
I'm hoping the architecture here is solid enough that it should still be valid for the soon-to-be-released Qwen 3.6 models. Let me know if you run into any weird behaviour.
You can get the template from here:
onil_gova@reddit
Consider incorporating the chat template fix that avoids empty historical <think> blocks to prevent cache invalidation in long-running agentic workloads.
https://huggingface.co/Qwen/Qwen3.5-122B-A10B/discussions/22
https://www.reddit.com/r/LocalLLaMA/comments/1sg076h/i_tracked_a_major_cache_reuse_issue_down_to_qwen/
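The gist of the fix in those links: when historical assistant turns keep empty <think></think> blocks, the rendered prompt prefix changes between turns and the server re-prefills the whole history instead of reusing its KV cache. A minimal Python sketch of the idea (hypothetical names; the actual fix is a one-line change in the Jinja template):

```python
import re

# Hypothetical sketch: drop empty <think></think> blocks from *historical*
# assistant messages so the rendered prompt prefix stays byte-identical
# turn over turn, letting the server reuse its KV cache instead of
# re-prefilling the whole conversation.
EMPTY_THINK = re.compile(r"<think>\s*</think>\s*")

def clean_history(messages):
    cleaned = []
    for msg in messages:
        if msg.get("role") == "assistant":
            msg = {**msg, "content": EMPTY_THINK.sub("", msg["content"])}
        cleaned.append(msg)
    return cleaned

history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": "<think>\n\n</think>\nHello!"},
]
print(clean_history(history)[1]["content"])  # -> Hello!
```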
ezyz@reddit
Wouldn't this change the history in a way that's subtly different from what the model saw during chat training?
onil_gova@reddit
Not at all. Same behavior without the unnecessary context invalidation.
PlasticMaterial9681@reddit
The test results were excellent. Thank you.
ex-arman68@reddit (OP)
Thank you for the information. I have incorporated the fix in the jinja template, and updated the HF file. This only required a small change to 1 line. Readme will be updated soon.
Please re-download.
phhusson@reddit
Not so definitive eh. It's software, it's okay to say it's forever evolving
ex-arman68@reddit (OP)
:-D very true
Imaginary-Unit-3267@reddit
Maybe this is a stupid question, but, how do I actually use this in llama.cpp?
Plenty_Bug4945@reddit
- --temp
- "0.6"
- --top-p
- "0.8"
- --top-k
- "20"
- --min-p
- "0.05"
- --repeat-penalty
- "1.0"
- --chat-template-file
- /templates/chat_template.jinja
Imaginary-Unit-3267@reddit
Ahhh! My problem was that I was passing --chat-template rather than --chat-template-file. :face_palm: Thank you!!!
milpster@reddit
Haha me too, thank you both for clearing that up.
Imaginary-Unit-3267@reddit
Update: I tried this template and it just outputs tool calls directly to me instead of actually calling the tools. Qwen3.5-35B-A3B, at least, is still broken with this template.
Direct_Technician812@reddit
Good, I run Qwopus 27B v3 iq4_xs + llama.cpp server + Opencode, and it works very smoothly without errors. Tool calling is perfect. Thanks.
--chat-template-file chat_template.jinja
ElSrJuez@reddit
Is this for general cpp or specifically for LM Studio?
ex-arman68@reddit (OP)
It is a must for LM Studio users, but it also applies to all backends.
hilycker@reddit
I'm still wondering why Qwen devs won't bother fixing all this at the source..
CATLLM@reddit
I think you might just have saved Qwen3.5 family of models! Thank you for your work!
daniele-bruneo@reddit
This! Finally...
Thanks
Mkengine@reddit
Does it only work for MLX?
ex-arman68@reddit (OP)
No, the Jinja chat template is the same regardless of the model format.
Imaginary-Unit-3267@reddit
I tried this template and it just outputs tool calls directly to me instead of actually calling the tools. Qwen3.5-35B-A3B, at least, is still broken with this template.
ex-arman68@reddit (OP)
Great to know! Thanks for the report.
Dazzling_Equipment_9@reddit
Thank you for sharing. I have seen so many templates, and no one has ever integrated and described them as clearly as you have.
SaroTaz@reddit
Is it also possible to use it with smaller Qwen3.5 models?
ayylmaonade@reddit
Yes. They all use the same template, so it should be fine.
One-Replacement-37@reddit
Is this to fix Qwen stopping randomly after a couple of turns in Opencode, and having to send it a "go" or empty prompt to have it keep going? Thanks a lot~
VicemanPro@reddit
This has been my biggest issue too.
Icy-Degree6161@reddit
Thanks, I'll check it out
soyalemujica@reddit
This template works with Qwen3-Coder-next?
ex-arman68@reddit (OP)
Not sure; I think that one is based on Qwen 3, not 3.5, which has some changes that require a different template.