how to preserve gemma 4 thinking trace

Posted by Qwoctopussy@reddit | LocalLLaMA

How can I keep the thinking trace from being discarded?

Setup: llama.cpp (b8858) serving Gemma 4 31B (UD-Q6_K_XL), with an (almost) vanilla pi harness.

I have a few other flags set on llama-server, nothing relevant. Adding --jinja and --chat-template-kwargs '{"preserve_thinking": true}' didn't seem to change anything.
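
One thing worth ruling out is quoting: if the JSON reaches llama-server with typographic quotes instead of plain ASCII ones, it will not parse as JSON. Below is a minimal sketch that builds the argument with `json.dumps` and shell-quotes it, so the quoting is correct by construction. The model filename is a hypothetical placeholder, and whether the Gemma 4 chat template actually reads a `preserve_thinking` key is an assumption taken from the flags above, not something confirmed here.

```python
import json
import shlex

# Hypothetical model path; substitute the GGUF file you actually serve.
MODEL_PATH = "gemma-4-31b-UD-Q6_K_XL.gguf"


def build_llama_server_args(model_path, template_kwargs):
    """Return an argv list for llama-server with JSON-encoded template kwargs."""
    # json.dumps always emits plain ASCII double quotes and lowercase true/false.
    kwargs_json = json.dumps(template_kwargs)
    return [
        "llama-server",
        "-m", model_path,
        "--jinja",
        "--chat-template-kwargs", kwargs_json,
    ]


# "preserve_thinking" is the key tried in the post; the template may ignore it.
args = build_llama_server_args(MODEL_PATH, {"preserve_thinking": True})

# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
command = shlex.join(args)
print(command)
```

If the printed command still has no visible effect, the quoting is at least not the cause, and the remaining suspects are the chat template itself or the harness dropping the reasoning from earlier turns before it resends the conversation.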