Samplers in llama.cpp
Posted by kaisurniwurer@reddit | LocalLLaMA | View on Reddit | 4 comments
I often play with samplers and text templates in llama.cpp, but recently I found that newer models are very repetitive in their output. I chalked it up to stricter training and moved on.
Now I decided to give gemma 4 a go, and the 26B A4B was looping, so I started by checking samplers since I often run with weirder settings, but no matter what I changed, the output did not change.
Even at extreme values, like temp 1000 with no other samplers, the output stays coherent, which it should not be, no matter what.
Is it me, or are samplers somewhat broken?
llama-impersonator@reddit
gemma seems unusually confident in the top output. that is probably why changing the temperature doesn't have much effect.
kaisurniwurer@reddit (OP)
That's exactly what I assumed was happening before, but at extremely high temperature and no culling, the output should be pretty much random gibberish. Instead it produces pretty much the same response, almost as if temperature were ignored and set to 0.
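For anyone following along, the expectation here is just how temperature scaling works: dividing the logits by a huge temperature flattens the distribution toward uniform, so sampling should produce near-random tokens. A minimal sketch (toy logits, not actual model output):

```python
import math

def softmax_with_temp(logits, temp):
    # Divide logits by temperature, then apply softmax.
    scaled = [l / temp for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [10.0, 5.0, 1.0, 0.5]
# At temp 1.0 the top token dominates; at temp 1000 the
# probabilities are nearly uniform, so sampling from them
# should look like random gibberish.
print(softmax_with_temp(logits, 1.0))
print(softmax_with_temp(logits, 1000.0))
```

If the output stays identical at temp 1000, something upstream must be collapsing the distribution before sampling.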
Velocita84@reddit
Llama.cpp sets top p and min p defaults if you don't
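This would explain the behavior: a min-p-style cutoff culls everything far below the top token, and if that culling runs before temperature in the sampler chain (the order depends on the configured chain), cranking temperature afterwards only redistributes mass among the survivors. A sketch of that culling, with toy numbers and an assumed min_p of 0.05 (roughly llama.cpp's documented default, if I recall):

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def min_p_filter(probs, min_p):
    # Drop tokens whose probability is below min_p * max(probs).
    cutoff = min_p * max(probs)
    return [p if p >= cutoff else 0.0 for p in probs]

logits = [12.0, 4.0, 3.0, 1.0]  # a very confident model
probs = softmax(logits)
kept = min_p_filter(probs, 0.05)
# With a dominant top token, only one candidate survives the
# cutoff, so any temperature applied afterwards is sampling
# from a single token.
print(sum(1 for p in kept if p > 0))
```

Combined with gemma being unusually confident, this would make temperature look like a no-op even at absurd values.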
kaisurniwurer@reddit (OP)
I generally do set it.
I also tried setting minP to 0 and topP to 1 directly in the GUI, with seemingly no effect.