Context checkpoint erasure in llama.cpp?
Posted by SimilarWarthog8393@reddit | LocalLLaMA | View on Reddit | 7 comments
Has anyone been able to solve or mitigate context checkpoints being erased during single-user inference, specifically when function calling is part of the chat history? I've been using Qwen 3.5 35B A3B for some time (now using 3.6), tested in Cherry Studio & Open WebUI, and in every case checkpoints get erased between prompts within the same chat session. Is this because tool call content is not being passed back? I thought it could also be the CoT content not being preserved, but even with preserve_thinking: true for Qwen 3.6 I get the same issue.
I use 128 checkpoints and 16 GiB of cache RAM, so I'm not running out of checkpoints or memory. Suggestions would be appreciated (:
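For context, a sketch of the kind of llama-server invocation involved. The flag names and model path below are assumptions based on recent llama.cpp builds, not a confirmed setup; check `llama-server --help` on your version:

```shell
# Hedged sketch of a llama-server launch with explicit checkpoint settings.
# Model path and flag names are assumptions; verify against your build.
llama-server \
  -m ./qwen-model.gguf \
  --ctx-checkpoints 128 \
  --cache-ram 16384 \
  --jinja
# --ctx-checkpoints: number of context checkpoints kept per slot
# --cache-ram: host-memory cache budget in MiB (16384 = 16 GiB)
# --jinja: apply the model's chat template (needed for tool calling)
```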
erazortt@reddit
I have seen the same behavior when I just use the llama.cpp frontend. Perhaps you can add your experience to this issue:
https://github.com/ggml-org/llama.cpp/issues/21903
SimilarWarthog8393@reddit (OP)
I see aldehir's explanation, I'll take some time to test with tool calls in instruct mode (no CoT) to reproduce per your comments, thanks for sharing 🙏🏽
Awwtifishal@reddit
Make sure Open WebUI is not trashing your context because of title generation, tag generation, or any of these other shenanigans.
SimilarWarthog8393@reddit (OP)
Thanks for the suggestion, I've got all of that disabled 👍🏽
Awwtifishal@reddit
You can also select a different model to handle some of those tasks, like the title generation.
Local-Cardiologist-5@reddit
Experiment with the number of context checkpoints. I have mine at 20.
Awwtifishal@reddit
Note that the default is 32, so 12 or 20 would make things worse, if anything.