Note the new recommended sampling parameters for Qwen3.6 27B
Posted by Thrumpwart@reddit | LocalLLaMA | 23 comments
Taken from their Hugging Face page:
We recommend using the following sampling parameters for generation:
Thinking mode for general tasks: temperature=1.0, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0
Thinking mode for precise coding tasks (e.g. WebDev): temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0
Instruct (or non-thinking) mode: temperature=0.7, top_p=0.80, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
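The three recommended modes can be sketched as request payloads. A hypothetical example, assuming an OpenAI-compatible endpoint (as served by llama.cpp's server or vLLM); `min_p` and `repetition_penalty` are common extension fields, so check your backend's docs for the exact names it accepts:

```python
# Recommended sampling settings from the thread, expressed as
# hypothetical payload fragments for an OpenAI-compatible API.

THINKING_GENERAL = {
    "temperature": 1.0,
    "top_p": 0.95,
    "top_k": 20,
    "min_p": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.0,
}

# Precise coding tasks only lower the temperature.
THINKING_CODING = {**THINKING_GENERAL, "temperature": 0.6}

INSTRUCT = {
    "temperature": 0.7,
    "top_p": 0.80,
    "top_k": 20,
    "min_p": 0.0,
    "presence_penalty": 1.5,
    "repetition_penalty": 1.0,
}
```

Note that only the instruct mode keeps a nonzero presence penalty; both thinking modes set it to 0.0.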
These are different from 3.5 so I thought I would draw your attention to them.
GregoryfromtheHood@reddit
Very glad they're recommending 0.0 presence penalty now for thinking. The old 1.5 and even 1.1 were giving me so many issues.
Caffdy@reddit
what is the difference though? what does presence penalty do in the first place?
GregoryfromtheHood@reddit
It's supposed to punish repetition, so it should help with looping. But I guess because the model wants to repeat some tokens, when it can't, it goes into a loop instead. That's my guess anyway.
david-deeeds@reddit
isn't "repetition penalty" the one that punishes repetition?
Shadowfita@reddit
They both do. Two different parameters with a similar outcome that target different parts of the model, is my understanding.
HiddenoO@reddit
Unless something has recently changed, they're the same idea, except that presence penalty is boolean ("Has the token already appeared? If yes, apply penalty x."), whereas repetition penalty is numerical ("Count how often the token has already appeared and apply penalty x for each occurrence.").
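The distinction described above can be sketched in a toy logit-adjustment function. For context: in the OpenAI API the count-scaled variant is called `frequency_penalty`, while HF-style `repetition_penalty` is actually applied multiplicatively to logits; this sketch only illustrates the flat-vs-per-occurrence contrast the comment draws:

```python
from collections import Counter

def apply_penalties(logits, generated, presence_penalty=0.0, frequency_penalty=0.0):
    """Toy illustration: presence penalty subtracts a flat amount from any
    token that has appeared at least once; the count-based variant
    subtracts once per occurrence."""
    counts = Counter(generated)
    out = dict(logits)
    for tok, n in counts.items():
        if tok in out:
            out[tok] -= presence_penalty       # flat: fires once per token
            out[tok] -= frequency_penalty * n  # scales with occurrence count
    return out

logits = {"the": 2.0, "cat": 1.5, "sat": 1.0}
history = ["the", "the", "cat"]

print(apply_penalties(logits, history, presence_penalty=0.5))
# "the" and "cat" each drop by 0.5, regardless of how often they appeared
print(apply_penalties(logits, history, frequency_penalty=0.5))
# "the" drops by 1.0 (two occurrences), "cat" by 0.5 (one occurrence)
```

This also hints at why an aggressive penalty can cause loops: once the exact token the model needs is heavily penalized, it may emit a near-miss and retry.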
Shadowfita@reddit
Ahhh! That would make sense. Thanks for that distinction.
david-deeeds@reddit
Thanks!
LeonidasTMT@reddit
For the MoE model I had to turn up the presence penalty because otherwise it would go into loops: either the same line repeating except for a final word, or a larger loop of repeating logic.
Shadowfita@reddit
Yes, me too. I was finding that if I gave an agent a task and provided it with an ID, for example, it would seemingly exhaust its "limit" of repetition on the actual value I wanted it to include in the final output, and would change it slightly, making it incorrect.
kroggens@reddit
why not temperature=0.0 for coding?
DefNattyBoii@reddit
so you can reroll the dice on a shit diff, vibecoding goes brr
Evening_Ad6637@reddit
I think the recommended params are not very good. I've experimented and found these params work better:
HiddenoO@reddit
Some of these frankly make little sense. E.g., the presence penalty becomes fairly pointless if the repeat penalty is ten times as high, since the latter also applies on the first occurrence.
How did you obtain these?
jwpbe@reddit
the combination of top k and a min p of 20% is insane
llitz@reddit
Interesting boost. I have been tweaking slightly as well and went in a similar direction, but I am still bothered by the behavior. I guess I will eventually arrive at your parameters.
LinkSea8324@reddit
You might want to add the preserve_thinking param
Dany0@reddit
yes u/evening_ad6637 please retest with thinking preserved
kaisurniwurer@reddit
There are likely sampling issues in llama.cpp.
Change the temperature to an extreme value and the output stays the same. It's likely not a "Qwen" or "new models" problem, since I saw the same result with Mistral Small.
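As a sanity check on the claim above, here is a minimal sketch of what temperature *should* do to the next-token distribution: logits are divided by T before the softmax, so an extreme temperature should visibly flatten or sharpen the probabilities. If a backend's outputs are identical across extreme temperatures, the parameter is likely not being applied:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/T before softmax. Higher T flattens the
    distribution; T -> 0 approaches greedy argmax."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.1)   # near-deterministic
high = softmax_with_temperature(logits, 2.0)  # much flatter

# If temperature is actually applied, these should differ markedly:
print(low[0], high[0])
```

With T=0.1 the top token takes essentially all the probability mass, while at T=2.0 it gets roughly half, so samples across many generations should look very different between the two settings.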
Safe-Thanks-4242@reddit
Same as what Unsloth already shared, I think 🤔
LinkSea8324@reddit
Agentic coding counts as "precise coding tasks", right?
FinBenton@reddit
That is exactly the same for coding as the old model.
Ok-Measurement-1575@reddit
They look identical to me? Unless you mean the repeat stuff?
I deleted that and noticed no ill effects tbh.