GPT 5.5 just leaked its chain of thought to me in codex, and it looks like an idea from 5 months ago in this sub.

Posted by Homeschooled316@reddit | LocalLLaMA | View on Reddit | 59 comments

https://www.reddit.com/r/LocalLLaMA/comments/1p0lnlo/make_your_ai_talk_like_a_caveman_and_decrease/

In the middle of a project I'm working on, I got this output:

Implemented the narrower fix in Homm3ImportUnitPreviewModelHook.cs? Need absolute path. Need know cwd absolute. v:... Use markdown. final with path. Need avoid bogus path. Use Homm3ImportUnitPreviewModelHook.cs? Format requires /abs/path. Windows abs maybe v:.... Use angle. Final no too long. Need include uncommitted. Proceed.

[-]

jakegh@reddit

Interesting to see behind the curtain; this is why GPT-5.5 uses less tokens, it's RL'd to be extremely terse in its CoT, almost "caveman mode", avoiding unnecessary verbage.

[-]

psycoee@reddit

It's a pretty obvious idea, especially since the attention mechanism generally gives a lot more weight to the stuff in the last few thousand tokens. So you want to fit as much in there as possible and caveman mode is the way to do it. Many small models perform much better if you make them use caveman speak.

[-]

jakegh@reddit

Yeah. There are all sorts of alignment issues with RL on CoT such that it’s actually been called “the forbidden technique”. It is EXTREMELY dangerous.

But maybe if you reinforce on CoT length, rather than content, those don’t apply. Still makes me nervous and I’d like to see research on it.

[-]

jakegh@reddit

After looking it up a bit, there are papers on making CoT terse in what they describe as a safe way avoiding the forbidden technique via adaptive reasoning budgets in training or reinforcement but I didn’t find any studies looking at the resulting faithfulness of that terse CoT.

Which makes me kinda nervous.

[-]

psycoee@reddit

I mean you have to be able to control the CoT process in any case. Existing models are generally trained to think in a particular language, limit the length of the thought process to something reasonable, follow instructions, etc. Simply scoring shorter thoughts higher is not really training on CoT, and to the extent the model can game it, it is what you want (the same quality output with fewer thinking tokens).

[-]

jakegh@reddit

Yes that’s how the papers describe it, either a hard or adaptive thinking budget. But none of them evaluated CoT faithfulness after doing so.

[-]

TheRealMasonMac@reddit

https://huggingface.co/datasets/microsoft/OpenMementos is an interesting alternative

[-]

damhack@reddit

The CoT traces output in the context are not the actual reasining traces used, they’re just what the LLM thinks you want to see. Therefore you cannot draw any conclusions from them.

Demonstrated by Anthropic’s research here:

https://assets.anthropic.com/m/71876fabef0f0ed4/original/reasoning_models_paper.pdf

[-]

ddavidovic@reddit

Yeah, you can observe this very clearly when reading the gpt-oss-120b chains of thought. It presumably used a similar training regime.

[-]

Toastti@reddit

That's not 5.5 directly. It's chain of thought is passed through another small LLM before. They almost certainly want to save tokens here so instruct the smaller LLM to be as consice as possible

[-]

jakegh@reddit

What do you base that on? Why would the big model spend even more tokens to shorten what it passes to the summary model? That makes no sense. Logically, this is more likely to be the actual CoT.

[-]

StupidScaredSquirrel@reddit

No, because CoT is an asset that you want to keep secret to prevent distillation. Passing it through a 2b model to summarise and obfuscate the CoT is dirt cheap and keeps your moat intact.

[-]

jakegh@reddit

Obviously you want to keep it secret. The OP is saying it leaked

[-]

4whatreason@reddit

Yep agreed, summarized cot does not replace the actual cot, that is passed back and forth encrypted to be secret.

Can't remember exactly why but I think they sometimes require reasoning to not be modified/dropped in certain situations too

[-]

dataexception@reddit

I think you're referring to speculative decoding. A prompt is filtered through a smaller model for preprocessing.

[-]

Clueless_Nooblet@reddit

I don't know, but I believe distilling "Grmny only ctry wo tmplmt Atbn" after language is stable.

[-]

MisticRain69@reddit

I kinda think this is what they do with claude, claude has the typical qwen/deepseek "actually I think its x...WAIT it might actually be y.." COT

[-]

ayylmaonade@reddit

You aren't understanding what the user is saying. Yes, CoT is summarized by smaller models, but only externally, and it is not at all uncommon for models to accidentally leak their "true" chain of thought, especially if they mess up closing the think tag. Not to mention that Gemini regularly leaks its actual CoT all the time, even to this day. And what do you know? It's identical to the CoT Gemma 4 has. The sort of cave-man like speak here reminds me of GPT-OSS, its quite similar. I can totally see this being the actual CoT.

The idea you're proposing literally costs more compute because you'd have to filter every single token from the model CoT, summarize it, then feed it back to itself instead of doing a single pass on the final CoT. It'd also degrade the hell out of performance.

[-]

holy_macanoli@reddit

There’s a subagent that gets called to create the title of each thread in codex so why not this?

[-]

jakegh@reddit

You're saying the main model has its real CoT, which it then passes to a small model to turn into a caveman CoT, just so that caveman CoT can be summarized by another small model. This does not make any sense.

[-]

alwaysbeblepping@reddit

You're saying the main model has its real CoT, which it then passes to a small model to turn into a caveman CoT, just so that caveman CoT can be summarized for humans to read by another small model. This does not make any sense. Occam's razor, this is the real CoT.

Not the same person, but you're making quite a few assumptions here. If you're seeing the CoT and you're not supposed to, that means something went wrong and there are a number of possibilities:

It's just as you said, the raw CoT from the larger model was directly shown, and your assumption that this is typical CoT for that model also is the case.
The raw CoT from the larger model was directly shown, but it's not the typical CoT for the larger model. After all, the heuristics for detecting/separating CoT failed which could mean that it was in an unexpected format/garbled (due to technical issues, due to context size too high and the model producing nonsense, etc).
The smaller summarization model is the one that failed and started producing caveman speak rather than actually summarizing. Like you said, it wouldn't be too likely for this to be a deliberate thing, but we know something went wrong if we're seeing the CoT we shouldn't so failure at this point isn't impossible.

Not an exhaustive list, there are probably other plausible possibilities.

[-]

jakegh@reddit

Those are all possible, but seem less likely than the simple explanation.

[-]

alwaysbeblepping@reddit

Those are all possible, but seem less likely than the simple explanation.

How so? I mean, in a vacuum maybe but this is a system designed to hide the original CoT from the user. If someone is seeing it, there's a reason, right? It's not everyone that's seeing it, just an isolated case with a specific user so it stands to reason that something unusual is happening.

Occam's Razor is a useful reasoning technique but it doesn't mean we should ignore the context that things happen in when we try to come up with an explanation.

[-]

jakegh@reddit

I agreed those scenarios were also possible. They just seem less likely, and we don't have the info to evaluate further.

[-]

Howdareme9@reddit

Anthropic does the same, Haiku summarizes it

[-]

jakegh@reddit

No kidding. The question is whether the real CoT would be cut-down with a second small model just for the summary model. And the answer is no.

[-]

IrisColt@reddit

It actually works. Gemma 4 handles problematic prompts much better when you really compact them first.

[-]

Jolly-Rip5973@reddit

yes they hide the chain of thought from you because it's usually completely convoluted. It's amazing anything correct ever happens because of how gimpy the chain of thought actually is when you read it.

[-]

cantgetthistowork@reddit

That's how I talk to people who I think are stupid

[-]

AdventurousVast6510@reddit

me too. i talk like this to people who dont speak english very well

[-]

florinandrei@reddit

SAVE MORE TOKENS is the prime directive now for all these companies, because of the huge demand.

[-]

Ha_Deal_5079@reddit

the compressed CoT thing is wild. wonder if theyre using a distilled model to condense the reasoning or just aggressively prompting for token efficiency

[-]

ayylmaonade@reddit

I'm pretty sure that's just OpenAI's reasoning. Go try GPT-OSS for example, it's very "cave-man" like in its way of deliberating to itself. It's probably how they're able to make the claims that 5.5 uses significantly less tokens.

[-]

TheRealMasonMac@reddit

Deepseek is also similar, though not as compressed.

[-]

joshualander@reddit

This isn’t new. If you’re using GPT in Hermes Agent and you interrupt a complex task with a simpler one, you’ll get caveman CoT 😊

[-]

eworker8888@reddit

Agents are often multiple LLMs, one that excels in summarization, another one that excels in code, and so on. It can be the same LLM, or different quantization’s of the same, or totally different llms

In our Agent (not codex), you can enable extensive logging, and look at the entire chain (the entire process)

You can always proxy any agent through tools and capture the entire communication and review it.

[-]

danielv123@reddit

For your last sentence, well no, you can't. OpenAI does the summarization on their server side, and you don't have access to capture the original tokens there.

[-]

eworker8888@reddit

in E-Worker everything is on the client side

[-]

TheThoccnessMonster@reddit

We’re not talking about your project dude

[-]

the_renaissance_jack@reddit

But they were dude.

In our Agent (not codex), you can enable extensive logging, and look at the entire chain (the entire process)

[-]

brahh85@reddit

thats like speculative prefill https://www.reddit.com/r/LocalLLaMA/comments/1t0vp3w/pflash_10x_prefill_speedup_over_llamacpp_at_128k/ but being speculative prefill more fancy

[-]

RevolutionaryLime758@reddit

No it’s not dumbass

[-]

cpldcpu@reddit

the CoT of gpt-oss also looks like that. It seems that openai always trained for brief language in the CoT.

See also examples here: https://nousresearch.com/measuring-thinking-efficiency-in-reasoning-models-the-missing-benchmark

[-]

No_Hunter_7786@reddit

Interesting catch. Chain of thought leaking through is always funny to see. Caveman prompting actually making it into production GPT is kind of wild if true

[-]

ResidentPositive4122@reddit

That has nothing to do with caveman prompting. oAI have been training their models for succint CoT for a while now. Even gpt-oss shows this. Compare its traces with any of the open models and you'll see these patterns.

oss: do math. try this. no. ok now try this. might work. need python.

qwen: so the user is asking us to consider doing some math on this project that reminds me of a time where i was learning math doing the theorem discovery with my uncle on a shiny hilltop. I will try to... no, wait what if... but is this correct? I need to... Wait, now I see it, it wasn't a sunny hilltop it was rainy and I was cold. Wait, no, I was hot.

[-]

yopla@reddit

Careful about hilltops, goblins nearby.

[-]

Normal-Ad-7114@reddit

BUT WAIT

[-]