ClankerCore@reddit
You’re an interesting fellow. I follow.
AurumDaemonHD@reddit (OP)
❤️
Ok_Top9254@reddit
You shouldn't touch repetition_penalty and instead use DRY. The penalty lobotomizes the model and might still not work.
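For context on why `repetition_penalty` can "lobotomize" output: the classic transform (from the CTRL paper, as implemented in e.g. Hugging Face's `RepetitionPenaltyLogitsProcessor`) hits *every* previously generated token equally, including function words and code syntax, whereas DRY only penalizes tokens that would extend an already-repeating sequence. A minimal sketch of the classic version (illustrative helper, not any library's actual code):

```python
def apply_repetition_penalty(logits, generated_ids, penalty):
    """Classic repetition penalty: logits of tokens already generated
    are divided by `penalty` if positive, multiplied if negative,
    making them less likely to be sampled again -- regardless of
    whether repeating them would actually be degenerate."""
    out = list(logits)
    for tok in set(generated_ids):
        if out[tok] > 0:
            out[tok] /= penalty
        else:
            out[tok] *= penalty
    return out

logits = [2.0, -1.0, 0.5]

# penalty = 1.0 is a no-op: dividing/multiplying by 1 changes nothing.
assert apply_repetition_penalty(logits, [0, 1], 1.0) == logits

# an aggressive value damps every repeated token, needed or not
penalized = apply_repetition_penalty(logits, [0, 1], 1.5)
assert penalized[0] < logits[0]  # positive logit shrunk (2.0 -> ~1.33)
assert penalized[1] < logits[1]  # negative logit pushed lower (-1.0 -> -1.5)
```

This indiscriminate damping is why high values degrade grammar and code: the model gets punished for reusing "the", brackets, or variable names it legitimately needs.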
MoffKalast@reddit
POV: You are using a IQ2_XXS quant.
Busy-Group-3597@reddit
This is what I normally do… am I doing it wrong? Never faced a problem, though.
NandaVegg@reddit
You're not doing it wrong, though it is needed for base models and SFT-only models. Almost all instruct models have some form of repetition penalty baked in (such as an n-gram repetition penalty) from their post-training RL pipeline. A little repetition penalty may also help reduce thinking loops if needed.
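The n-gram repetition check mentioned above can be sketched as a simple scan over generated token ids; a hypothetical helper of the kind an RL pipeline might use to penalize or early-stop degenerate loops (not any specific lab's implementation):

```python
def has_repeated_ngram(token_ids, n=4):
    """Return True if any n-gram of length n occurs more than once
    in token_ids. Post-training pipelines can use a check like this
    as a reward penalty for repetitive (looping) generations."""
    seen = set()
    for i in range(len(token_ids) - n + 1):
        gram = tuple(token_ids[i:i + n])
        if gram in seen:
            return True
        seen.add(gram)
    return False

assert has_repeated_ngram([1, 2, 3, 4, 1, 2, 3, 4], n=4)      # loop detected
assert not has_repeated_ngram([1, 2, 3, 4, 5], n=4)            # no repeats
```

Because the model is trained against this signal rather than having its logits distorted at inference time, an instruct model usually needs no user-side penalty at all.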
ayylmaonade@reddit
You're probably good. Most models (especially reasoning models) recommend a repeat penalty of 1.0. If you're concerned, you can likely just search for the model you're using and find the officially recommended repeat penalty, or if that's not available, Unsloth may have it in their docs.
LagOps91@reddit
no, that's the right way to do it. rep pen can really damage model outputs if it's set to overly aggressive values. unless there is a lot of repetition, leaving it at 1 is correct.