We really need to stop using the term “hallucination”.

Posted by cosmobaud@reddit | LocalLLaMA | 30 comments

Please stop using the word “hallucination”. We really need a better word, because this one actively misleads people.

The word comes from human psychology. It means perceiving something that isn’t there. It carries two assumptions with it. First, that the subject has access to ground truth and is failing to match it. Second, that the subject perceives at all. A person who hallucinates is malfunctioning against a baseline where they normally see the world correctly.

The model has no access to ground truth to begin with. It was never matched to the world, only to text. If an ape can’t do calculus, we don’t say the ape is hallucinating. It simply isn’t the kind of thing that can do the task. The model is in the same position with respect to truth. There is nothing to malfunction away from.

Regardless of what Anthropic peddles to gain marketing reach, the model does not perceive in the way their language wants you to believe. There is no subject inside it having an experience that has gone wrong. There is a probability distribution over tokens, and a sample drawn from it. “Hallucination” tricks you into seeing a perceiver where there isn’t one.
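That “distribution plus a sample” claim can be sketched in a few lines. This is a toy next-token step with a made-up vocabulary and logits, not any real model’s API:

```python
import math
import random

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy next-token step: the model emits scores, we sample a token.
vocab = ["Paris", "London", "Berlin", "banana"]
logits = [4.0, 2.5, 2.0, 0.1]  # made-up scores for illustration
probs = softmax(logits)

# random.choices draws from the distribution. Note there is no truth
# check anywhere in this step: a plausible wrong token comes out of the
# exact same mechanism as a plausible right one.
token = random.choices(vocab, weights=probs, k=1)[0]
```

Nothing in this loop perceives anything; “the model said X” just means X won the draw.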

Like so much else in this space, the word has become a marketing term. It is used because it acknowledges the error while waving it away, and at the same time it quietly sells you on the idea that the model is something more than it is: something that normally perceives correctly and occasionally slips. The model never perceives, and it never had a correct baseline to slip from.

A warning for anyone new to this. What gets called “hallucination” is happening all the time, in every output, from every large language model. You only notice it when you personally know enough about the topic to catch the error. When you don’t know the topic, the same thing is still happening, you just can’t see it. No large language model is free of this, and none ever will be. The math that produces the next token is the same math that produces the error. Without the error there is no next token at all.

What you are actually seeing is the model’s approximation error showing up in the output. The model’s probability distribution does not match the true one, and that gap has to land somewhere. It is the same error that is in everything else the model says. You only notice it when it collides with something checkable.

That error can come from several places, and they multiply on top of each other.

The model can lack resolution in its internal representations because it is small, meaning not enough parameters and not enough training data to separate fine distinctions.

The data it was trained on can be poorly matched to its parameter size, with the wrong mix or wrong quality or wrong coverage.

Quantization can strip precision out of the weights after training, throwing away resolution the model originally had.
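The resolution loss from quantization is easy to see in miniature. This is a toy symmetric round-to-nearest scheme; real methods (GPTQ, AWQ, etc.) are more sophisticated, but the failure mode is the same:

```python
def quantize(weights, bits):
    # Symmetric round-to-nearest: map floats onto 2**bits - 1
    # evenly spaced levels spanning the weight range.
    levels = 2 ** bits - 1
    scale = max(abs(w) for w in weights) / (levels / 2)
    return [round(w / scale) * scale for w in weights]

weights = [0.312, 0.318, -0.071, 0.904]  # made-up weights for illustration
q4 = quantize(weights, 4)  # 15 levels: 0.312 and 0.318 collapse together
q8 = quantize(weights, 8)  # 255 levels: the distinction survives
```

Two weights that encoded a fine distinction at higher precision become the same number at 4 bits: resolution the model originally had is simply gone.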

RLHF can introduce a bias that increases the error in some region, because the model was rewarded for sounding a certain way and that reshaping is never free.

Roughly speaking, model size and this error are inversely correlated. Bigger models have sharper probability resolution, so they land on the wrong answer less often. They are not “smarter”; they just have more numbers.

The practical rule is that your context has to be sufficient for the model size you are working with. Smaller models need more scaffolding: tools, tighter prompts, and techniques like RAG and search.
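A minimal sketch of the “tighter context” idea: retrieve relevant snippets and pin them into the prompt so a small model answers from provided text instead of its own weights. The retrieval here is a naive keyword overlap standing in for a real vector search, and all the documents and names are hypothetical:

```python
def score(query, doc):
    # Naive relevance: count overlapping words (a stand-in for
    # embedding similarity in a real RAG pipeline).
    return len(set(query.lower().split()) & set(doc.lower().split()))

def build_prompt(query, docs, top_k=2):
    # Rank documents by relevance and pin the best ones into the
    # context, so the model's job shrinks to reading, not recalling.
    ranked = sorted(docs, key=lambda d: score(query, d), reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "The capital of France is Paris.",
    "Bananas are rich in potassium.",
    "Paris hosted the 2024 Summer Olympics.",
]
prompt = build_prompt("What is the capital of France?", docs)
```

The approximation error never goes away, but grounding the answer in retrieved text gives the distribution far less room to land somewhere wrong.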