We really need to stop using the term “hallucination”.

Posted by cosmobaud@reddit | LocalLLaMA | 30 comments

Please stop using the word “hallucination”. We really need a better word, because this one actively misleads people.

The word comes from human psychology. It means perceiving something that isn’t there. It carries two assumptions with it. First, that the subject has access to ground truth and is failing to match it. Second, that the subject perceives at all. A person who hallucinates is malfunctioning against a baseline where they normally see the world correctly.

The model has no access to ground truth to begin with. It was never matched to the world, only to text. If an ape can’t do calculus, we don’t say the ape is hallucinating. It simply isn’t the kind of thing that can do the task. The model is in the same position with respect to truth. There is nothing to malfunction away from.

Regardless of what Anthropic peddles to gain marketing reach, the model does not perceive in the way their language wants you to believe. There is no subject inside it having an experience that has gone wrong. There is a probability distribution over tokens, and a sample drawn from it. “Hallucination” tricks you into seeing a perceiver where there isn’t one.
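That “distribution plus a sample” claim can be sketched in a few lines. This is a toy next-token step with a made-up vocabulary and logits, not any real model’s API:

```python
import math
import random

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy next-token step: the model emits scores, we sample a token.
vocab = ["Paris", "London", "Berlin", "banana"]
logits = [4.0, 2.5, 2.0, 0.1]  # made-up scores for illustration
probs = softmax(logits)

# random.choices draws from the distribution. Note there is no truth
# check anywhere in this step: a plausible wrong token comes out of the
# exact same mechanism as a plausible right one.
token = random.choices(vocab, weights=probs, k=1)[0]
```

Nothing in this loop perceives anything; “the model said X” just means X won the draw.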

Like so much else in this space, the word has become a marketing term. It is used because it acknowledges the error while waving it away, and at the same time it quietly sells you on the idea that the model is something more than it is: something that normally perceives correctly and occasionally slips. The model never perceives, and it never had a correct baseline to slip from.

A warning for anyone new to this. What gets called “hallucination” is happening all the time, in every output, from every large language model. You only notice it when you personally know enough about the topic to catch the error. When you don’t know the topic, the same thing is still happening, you just can’t see it. No large language model is free of this, and none ever will be. The math that produces the next token is the same math that produces the error. Without the error there is no next token at all.

What you are actually seeing is the model’s approximation error showing up in the output. The model’s probability distribution does not match the true one, and that gap has to land somewhere. It is the same error that is in everything else the model says. You only notice it when it collides with something checkable.

That error can come from several places, and they multiply on top of each other.

The model can lack resolution in its internal representations because it is small, meaning not enough parameters and not enough training data to separate fine distinctions.

The data it was trained on can be poorly matched to its parameter size, with the wrong mix or wrong quality or wrong coverage.

Quantization can strip precision out of the weights after training, throwing away resolution the model originally had.
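The resolution loss from quantization is easy to see in miniature. This is a toy symmetric round-to-nearest scheme; real methods (GPTQ, AWQ, etc.) are more sophisticated, but the failure mode is the same:

```python
def quantize(weights, bits):
    # Symmetric round-to-nearest: map floats onto 2**bits - 1
    # evenly spaced levels spanning the weight range.
    levels = 2 ** bits - 1
    scale = max(abs(w) for w in weights) / (levels / 2)
    return [round(w / scale) * scale for w in weights]

weights = [0.312, 0.318, -0.071, 0.904]  # made-up weights for illustration
q4 = quantize(weights, 4)  # 15 levels: 0.312 and 0.318 collapse together
q8 = quantize(weights, 8)  # 255 levels: the distinction survives
```

Two weights that encoded a fine distinction at higher precision become the same number at 4 bits: resolution the model originally had is simply gone.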

RLHF can introduce a bias that increases the error in some region, because the model was rewarded for sounding a certain way and that reshaping is never free.

Roughly speaking, model size and this error are inversely correlated. Bigger models have sharper probability resolution, so they land on the wrong answer less often. They are not “smarter”; they just have more numbers.

The practical rule is that your context has to be sufficient for the model size you are working with. Smaller models need more scaffolding: tools, tighter prompts, and techniques like RAG and search.
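A minimal sketch of the “tighter context” idea: retrieve relevant snippets and pin them into the prompt so a small model answers from provided text instead of its own weights. The retrieval here is a naive keyword overlap standing in for a real vector search, and all the documents and names are hypothetical:

```python
def score(query, doc):
    # Naive relevance: count overlapping words (a stand-in for
    # embedding similarity in a real RAG pipeline).
    return len(set(query.lower().split()) & set(doc.lower().split()))

def build_prompt(query, docs, top_k=2):
    # Rank documents by relevance and pin the best ones into the
    # context, so the model's job shrinks to reading, not recalling.
    ranked = sorted(docs, key=lambda d: score(query, d), reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "The capital of France is Paris.",
    "Bananas are rich in potassium.",
    "Paris hosted the 2024 Summer Olympics.",
]
prompt = build_prompt("What is the capital of France?", docs)
```

The approximation error never goes away, but grounding the answer in retrieved text gives the distribution far less room to land somewhere wrong.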