Does 'preserve_thinking' work with openwebui?

ayylmaonade@reddit

So, I just found this thread because I noticed Qwen's performance had been a little more spotty than usual, then tracked it down to an issue with preserve_thinking. Turns out it affects any version of Open-WebUI past 0.9.2. I ended up pulling 0.9.2, and I'll be staying on it for the foreseeable future.

HuskyTheSniffer@reddit

Even without preserve_thinking, afaik Open WebUI always injects the thinking from previous turns. See the [Open WebUI docs](https://docs.openwebui.com/features/chat-conversations/chat-features/reasoning-models/#configuration--behavior).

sterby92@reddit (OP)

Hm, in my tests it doesn't seem to work 🤔

BankjaPrameth@reddit

Just tested with Qwen 3.5 397B, which has no preserve\_thinking support, and it works: https://preview.redd.it/c6u7hb53in0h1.jpeg?width=1206&format=pjpg&auto=webp&s=80e158a866f90a9f8f8b351d757161a194a8e673

HuskyTheSniffer@reddit

Hmm, I tried with gpt oss. I asked it to output the full reasoning of earlier turns exactly, word for word, and it was able to do that.

Synthetic451@reddit

I tried the two number test in OpenWebUI and it did not work without adding preserve\_thinking to chat\_template\_kwargs.
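
For anyone following along, here is a minimal sketch of what a request body carrying that flag might look like for an OpenAI-compatible `/v1/chat/completions` endpoint. The `build_payload` helper is hypothetical, the model name is borrowed from the log further down the thread, and whether the flag has any effect depends on the model's chat template. This is an illustration, not the exact payload Open WebUI produces.

```python
import json

def build_payload(messages, model="qwen3.6-35b-a3b", preserve_thinking=True):
    """Build a chat-completions request body (hypothetical helper).

    chat_template_kwargs is passed through to the chat template;
    preserve_thinking is the template flag discussed in this thread.
    """
    return {
        "model": model,
        "stream": False,
        "messages": messages,
        "chat_template_kwargs": {"preserve_thinking": preserve_thinking},
    }

payload = build_payload([{"role": "user", "content": "hmm"}])
print(json.dumps(payload, indent=2))
```

Logging the body your backend actually receives and comparing it with this shape is a quick way to confirm the flag arrives at all.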

sterby92@reddit (OP)

For me it also did not work with preserve\_thinking in chat\_template\_kwargs. But in the native llama.cpp WebUI it worked...

nickless07@reddit

Your content looks more like the result of a high temperature than anything 'preserved'. Maybe it will tell you it's locking in a number after 10 more turns too. Can you log the incoming tokens? For me it works as expected, with all reasoning content sent back to the model each turn. I even wrote a script to strip the CoT because it bloated the context too much. [https://docs.openwebui.com/features/chat-conversations/chat-features/reasoning-models](https://docs.openwebui.com/features/chat-conversations/chat-features/reasoning-models)
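
The script itself is not shown in the thread. As a rough sketch of the idea, assuming the reasoning arrives inline in the assistant `content` in one of the two formats seen in the logs below (`<think>` tags or a `<details type="reasoning">` block), stripping it could look like this. The function name and patterns are illustrative assumptions.

```python
import re

# Patterns for the two inline reasoning formats seen in this thread's logs.
THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)
DETAILS_RE = re.compile(r'<details type="reasoning".*?</details>\s*', re.DOTALL)

def strip_reasoning(messages):
    """Return a copy of messages with reasoning removed from assistant turns."""
    cleaned = []
    for msg in messages:
        msg = dict(msg)  # shallow copy so the caller's list is untouched
        if msg.get("role") == "assistant" and isinstance(msg.get("content"), str):
            text = THINK_RE.sub("", msg["content"])
            text = DETAILS_RE.sub("", text)
            msg["content"] = text.strip()
        cleaned.append(msg)
    return cleaned

history = [
    {"role": "user", "content": "pick a number"},
    {"role": "assistant", "content": "<think>I'll pick 7.</think>Done, I have one."},
]
print(strip_reasoning(history)[1]["content"])
```

Note that this is the opposite of what preserve_thinking is for: it trades the model's memory of its earlier reasoning for a smaller context.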

sterby92@reddit (OP)

It works 100% of the time with the same configuration in the llama.cpp web interface. What version of openwebui are you running? It might have broken recently.

nickless07@reddit

0.92, and the full thing gets sent:

Received request: POST to /v1/chat/completions with body { "stream": true, "model": "qwen3.6-35b-a3b", "messages": \[ { "role": "user", "content": "hmm do you know whow to deal with \\"8197a522-c63f-4681-8ab0-58c558af5ef9\\" ?" }, { "role": "assistant", "content": "<think>The user is asking about a specific ID: \\"81... <Truncated in logs> ...not have this ID. Let's check.\\nI will call</think>" }, { "role": "user", "content": "hmm" } \], "tools": \[

Let me update and test again.

sterby92@reddit (OP)

It was resolved, see the updated post. TLDR: the provider type of the connection needs to be changed to llama.cpp to support the recent changes.

nickless07@reddit

Yeah, thanks. Hehe i wouldn't have noticed it for quite some time too.

nickless07@reddit

Looks a bit different now.

{ "role": "user", "content": "\[08/05/2026, Friday, 05:09:28 PM\]\\nhmm do you know whow to deal with \\"8197a522-c63f-4681-8ab0-58c558af5ef9\\" ?" }, { "role": "assistant", "content": "<details type=\\"reasoning\\" done=\\"false\\">\\n<summary>T... <Truncated in logs> ... ID. Let\'s check.\\n\> I will call\\n</details>" }, { "role": "user", "content": "\[11/05/2026, Monday, 06:35:26 PM\]\\nhmm" } \], "tools": \[

I stopped generation mid-thinking, so "role": "assistant" only contains CoT, no finished reply. However, the full content still gets sent. Perhaps a parsing error? https://preview.redd.it/lonfbtu5ej0h1.png?width=2391&format=png&auto=webp&s=597f2ebe859645418f75d1427a304797a649d6cd
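
The two logs differ in how the assistant turn carries its reasoning: `<think>` tags in the first, a `<details type="reasoning">` block in the second. When reading such logs, a small helper along these lines can make the difference explicit. It is only a sketch: `reasoning_format` is a made-up name, and the `reasoning_content` field is the one mentioned elsewhere in this thread, not something visible in these two logs.

```python
def reasoning_format(message):
    """Classify how reasoning is carried in a single chat message."""
    if message.get("reasoning_content"):
        return "reasoning_content field"
    content = message.get("content")
    if not isinstance(content, str):
        return "none"  # e.g. multimodal content lists are not inspected here
    if "<think>" in content:
        return "inline <think> tags"
    if '<details type="reasoning"' in content:
        return "inline <details> block"
    return "none"

before = {"role": "assistant", "content": "<think>Let's check.</think>"}
after = {
    "role": "assistant",
    "content": '<details type="reasoning" done="false">\n> Let\'s check.\n</details>',
}
print(reasoning_format(before))
print(reasoning_format(after))
```

If a chat template only looks for one of these shapes, reasoning sent in another shape would reach the model as ordinary text, or not as reasoning at all, which may be what the "parsing error" guess is pointing at.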

AdamLangePL@reddit

I have forked openwebui and added some features like context compaction and a progress bar with usage and tps speed :) Let me check preserve thinking.

Medium_Chemist_4032@reddit

Some heroes don't wear capes - they eat pierogi

WyattTheSkid@reddit

Polish moment

NoStage9115@reddit

https://preview.redd.it/eerchbaqdj0h1.jpeg?width=1280&format=pjpg&auto=webp&s=9a9d5cfaefd266ef4df445c39e0db2d0a1578b13

AdamLangePL@reddit

Yup!

apetersson@reddit

Thank you for your service! Do you have a link to your fork? Is OWU main kinda slow with rolling out such obvious contributions? My video-input-capable models have been waiting to be usable since forever because OWU doesn't simply pass the file through.

AdamLangePL@reddit

It was for private use, a bit vibe coded (no time for a deep dive), but it works. I will share it later today :)

AdamLangePL@reddit

https://preview.redd.it/bm8874hwui0h1.png?width=1211&format=png&auto=webp&s=3c6bb7c7d169a9e254d9888c7186fb6fe213e5b1 Progress and details above the chat box :)

sterby92@reddit (OP)

Thanks a lot! Would be great to know 😃

TechSwag@reddit

After messing about, I think I see what happened. There was a change that lets you specify what kind of provider type a connection is. Apparently llama.cpp (among others) handles reasoning differently than Open WebUI's "default". You have to switch the provider type to `llama.cpp` so Open WebUI sends the `reasoning_content` back to llama.cpp properly. [[docs](https://docs.openwebui.com/features/chat-conversations/chat-features/reasoning-models/#path-2--reasoning-captured-into-a-structured-output-array)] After swapping, it looks to work now.
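
To make the difference concrete, here is a minimal sketch of the two message shapes involved: reasoning embedded inline in `content` versus carried in a separate `reasoning_content` field. This is not Open WebUI's actual code. `split_reasoning` is a hypothetical helper, and it assumes the inline form uses `<think>` tags.

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(message):
    """Move inline <think> text into a separate reasoning_content field."""
    content = message.get("content")
    if message.get("role") != "assistant" or not isinstance(content, str):
        return dict(message)
    thoughts = [t.strip() for t in THINK_RE.findall(content)]
    if not thoughts:
        return dict(message)
    out = dict(message)
    out["reasoning_content"] = "\n".join(thoughts)
    out["content"] = THINK_RE.sub("", content).strip()
    return out

inline = {"role": "assistant", "content": "<think>Locking in 42.</think>Ready."}
print(split_reasoning(inline))
```

With the second shape, the backend can decide for itself how to render earlier reasoning into the prompt, which is presumably where a template flag like preserve_thinking comes into play.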

Synthetic451@reddit

Nice, this worked for me as well. Is this settable via an environment variable?

sterby92@reddit (OP)

This is the SOLUTION! 🥳 🥳 Thanks a lot! I will add it to the post!

Synthetic451@reddit

It used to work just a week ago. Something broke in the latest update.

sterby92@reddit (OP)

Looks like it... :/ I hope it's an easy fix 😃

Synthetic451@reddit

Yeah I just tested with the llama.cpp server Web UI and that worked every time. So something definitely broke in OpenWebUI because it used to work reliably there too.

TechSwag@reddit

It did. I just tested it now and it seems to not be working anymore, not sure if it's an Open WebUI or llama.cpp issue. For clarity, I tried this when the first PSA/FYI post gained some traction, and it worked fine. I updated Open WebUI just now and no change. Verified through llama-swap's logs that `preserve_thinking` was set to true. I'll rebuild llama.cpp/llama-swap now just in case.

TechSwag@reddit

Yeah, it's not working, unfortunately. Maybe I'm hallucinating, but I could've sworn it was working at one point. I was running the `dev` branch for a short period though, so maybe it was a change made in `dev` that never got pushed to prod. I did find [a comment](https://github.com/open-webui/open-webui/issues/23175#issuecomment-4285894634) made by a maintainer saying it was "likely fixed in dev", but it was then reverted in dev due to an issue, with the suggestion that it should instead be handled externally (which, based on my understanding, is fundamentally not possible lmao). It has been brought up to the maintainers though, see below:

https://github.com/open-webui/open-webui/issues/23339

https://github.com/open-webui/open-webui/discussions/23895

AltruisticList6000@reddit

Stuff usually works on textgen webui, so maybe check it there too. If it doesn't fail there either, then openwebui probably has some problems.

Digital_Soul_Naga@reddit

try it and let us know

sterby92@reddit (OP)

I mean, that's what I did, I guess 🤔 It seems not to work, but I cannot 100% confirm it yet. And I'm interested to know whether that is expected or something is wrong on my part.
