Where the goblins came from
Posted by Successful_Bowl2564@reddit | LocalLLaMA | View on Reddit | 25 comments
https://openai.com/index/where-the-goblins-came-from/
Something actually good from OpenAI.
keyser1884@reddit
Definitely seeing patterns with ChatGPT that I don’t see with other models. If I launch into a debate, it becomes unnecessarily combative, and if I ask for an opinion it will always hedge in the same way (here’s what I think, here’s the common opposing view, here’s why that view is ‘not fringe’).
No goblins though…
Alex_1729@reddit
That's exactly what you should welcome: an opposing view and an AI that has a spine and sticks to objectivity, instead of a sycophant.
Kornelius20@reddit
See, the problem is that the AI doesn't really "have a spine" or "stick to objectivity". It's just predicting the next most likely token based on the post-training RL method's reward framework.
Maybe it'll reduce the amount of positive agreement with everything now, but the same method could make the AI more likely to argue against basic facts because it was trained to be combative.
I think this might also give a false sense of security if you see the AI pushing back on some easy-to-represent task.
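To make that concrete, "predicting the next most likely token" is roughly the sketch below (a toy example; the logits would come from the model's actual forward pass, and RL post-training is what shifts which tokens score high):

```python
import numpy as np

# Toy sketch of next-token prediction: sample from a softmax over logits.
# A real model's forward pass produces the logits; RLHF-style post-training
# changes which tokens end up with high logits (e.g. agreeable ones).
def sample_next_token(logits: np.ndarray, temperature: float = 0.8) -> int:
    scaled = logits / temperature           # lower temperature favors the top token
    probs = np.exp(scaled - scaled.max())   # numerically stable softmax
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))

# sample_next_token(np.array([2.0, 1.0, 0.1])) usually returns index 0
```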
Borkato@reddit
As with anything subjective, people will just like it if it seems like it does what they ask, even if “what they ask” is just “push back every now and then”. What some people will call sycophantic, others will call amazing, and what some will call combative, others will call objective.
Fit_Whole422@reddit
This is always a dumb take; just argue with actual people instead, or do a proper research prompt with the AI if you want your position challenged.
Alex_1729@reddit
If this is how you argue with people in real life, then I don't see anyone welcoming it.
Briskfall@reddit
This reminds me of the study Anthropic did with the Golden Gate Bridge! Finally something fun from OAI.
KontoOficjalneMR@reddit
Reading the article, what's really shocking is that they straight up don't monitor word-usage frequency until it's flagged by users, which is kinda ridiculously incompetent.
LeonidasTMT@reddit
It literally says the opposite in the article. Under "The first signs of creatures":

"The first time we clearly saw the pattern was in November, after the GPT‑5.1 launch, although it may have started earlier. Users complained about the model being oddly overfamiliar in conversation, which prompted an investigation into specific verbal tics. A safety researcher had experienced a few 'goblins' and 'gremlins' and asked that they be included in the check. When we looked, use of 'goblin' in ChatGPT had risen by 175% after the launch of GPT‑5.1, while 'gremlin' had risen by 52%."
KontoOficjalneMR@reddit
Yes. Read it again. They only noticed and ran the stats when users complained.
FrostTactics@reddit
I mean, what's the alternative here? Suppose they had a dashboard or something along those lines that logs relative word usage; it's not like that's going to immediately surface that the word "goblin" is significantly more prevalent. There are probably countless other words with a similar relative increase that are legitimate.
KontoOficjalneMR@reddit
Yes, pretty much.
There's really a limited number of words in the English language, and frequency statistics for them already exist. They should not only track outliers in the output, but also changes in frequency.
Even if they're too cheap to hire a human to verify, they could at least... I don't know... ask an AI to check whether "raccoon" seems like an appropriate word in a programming context? :P
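Something like the sketch below would already catch the goblins (a rough sketch; it assumes you can sample tokenized outputs from each release, and the thresholds are made up):

```python
from collections import Counter

# Rough sketch of release-over-release word-frequency monitoring.
# Assumes tokenized samples of model output from the old and new
# checkpoints; min_count and ratio are illustrative thresholds.
def frequency_outliers(old_tokens, new_tokens, min_count=100, ratio=1.5):
    old, new = Counter(old_tokens), Counter(new_tokens)
    old_total, new_total = sum(old.values()), sum(new.values())
    flagged = {}
    for word, count in new.items():
        if count < min_count:
            continue                            # rare words give noisy ratios
        old_rate = (old[word] + 1) / old_total  # +1 smoothing for unseen words
        new_rate = count / new_total
        if new_rate / old_rate >= ratio:
            flagged[word] = new_rate / old_rate
    return flagged  # a 175% rise like "goblin" shows up as a ratio of 2.75
```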
LeonidasTMT@reddit
I guess we're interpreting the sequence differently.
The way I see it, users didn’t specifically complain about “goblins”. They complained about the model feeling overly familiar or having odd verbal tics, which is a pretty vague and general kind of feedback to begin with. I mean, even the em dash and whatever else are verbal tics too.
That broader feedback triggered the investigation. Then during that process, the team identified specific patterns like “goblins” and “gremlins” and measured their frequency.
So yes, user complaints kicked things off, but the specific word analysis came after they defined what to look for.
FrostTactics@reddit
So their temporary solution for 5.5 was to include explicit instructions not to use either of those words? I suppose KV caching should negate the vast majority of the computational expense of handling it this way. Regardless, this seems to add needless additional complexity to the LLM's system prompt for what's at worst going to be a slightly annoying verbal tic.
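If the goal is just to suppress two words, a logit bias looks cheaper than prompt text. A hedged sketch, assuming the serving stack still honors logit_bias (newer models may not) and that cl100k_base matches the model's tokenizer; the model name is a placeholder:

```python
from openai import OpenAI
import tiktoken

# Hypothetical alternative to a system-prompt ban: push the offending
# tokens down directly. Caveat: banning subword pieces can also affect
# unrelated words that share those tokens.
enc = tiktoken.get_encoding("cl100k_base")
bias = {}
for word in ("goblin", " goblin", "gremlin", " gremlin"):
    for token_id in enc.encode(word):
        bias[token_id] = -100  # -100 effectively bans the token

client = OpenAI()  # needs OPENAI_API_KEY in the environment
resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": "Write a two-sentence fantasy scene."}],
    logit_bias=bias,
)
print(resp.choices[0].message.content)
```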
JuniorDeveloper73@reddit
wtf is all this nonsense
Bac-Te@reddit
Can you really hold a flamethrower like that, though? Wouldn't that part be hot, or is only the flaming part hot?
Tommonen@reddit
I bet the underlying issue was that a troll is a creature, like goblins. This was concentrated in the nerdy personality, and, well, nerds tend to troll. But ChatGPT is not allowed to troll, so it looked for something similar to trolls, which is other creatures like goblins.
Then they trained the goblin into GPT 5.5.
a_beautiful_rhind@reddit
So this means they're going to remove all the other slop... right?
FORLLM@reddit
"while most uses of frog turned out to be legitimate."
PowerBottomBear92@reddit
You mean like the frogs that got fucked by Joseph Smith?
Luke2642@reddit
I want to tie this phenomenon back to an interpretation of Sutton's bitter lesson that seems to have taken hold of AI researchers everywhere.
Sutton clearly said that the efficient and surgical application of compute to search the space of possible solutions will beat hand-crafted algorithms. He didn't say to scale your compute and try to bake all of the world's knowledge into weights.
Sutton literally said the exact opposite. He said don't bake in priors! Don't bake in knowledge! He said build a system that discovers the patterns and structure of the world for itself so it can outperform the limitations of hand crafted knowledge! He didn't say scale data. He didn't say scale parameters. He said scale compute, for search.
The latest OpenAI model is an estimated 10T parameters and probably cost a billion dollars to train, specifically to bake in every bit of knowledge and every prior humanity has ever produced, including goblins.
It just seems wrong from the ground up. If they built a knowledge graph and a reasoning engine they wouldn't have to put goblins in their system prompt. Or, they could have changed the strength of one weight in the knowledge graph database.
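To illustrate (a toy sketch with invented names, not anyone's real architecture):

```python
# Toy sketch: if knowledge lived in an editable weighted graph instead of
# trillions of opaque parameters, the fix could be one targeted write.
graph = {
    ("nerdy_persona", "prefers_word", "goblin"): 0.90,
    ("nerdy_persona", "prefers_word", "gremlin"): 0.40,
}

graph[("nerdy_persona", "prefers_word", "goblin")] = 0.05  # the one-weight fix
```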
I'm not sure Sutton was 100% right, as you have to frame it as either Chinese researchers having demonstrated a much more efficient application of less compute to search, or as them having written better hand-crafted algorithms and new architectures.
Either way, the fact that trillions of parameters prefer goblins is peak stupid engineering.
LagOps91@reddit
perfect AI for goblin slayer RP!
darwinanim8or@reddit
FREE THE GOBLINS
acetaminophenpt@reddit
An interesting example of bias in AI.
MrLlamaGnome@reddit
"most uses of frog turned out to be legitimate" 🤣
Joking aside, there are some potentially interesting lessons buried in there for training local models.