Why do these small models all rank so bad in hallucination? Incl. Gemma 4.

Posted by Fusseldieb@reddit | LocalLLaMA | 23 comments

Gemma 4 came out a few days ago, and while these models compete on every other "intelligence" benchmark, they don't compete on the one that probably matters most: the (non-)hallucination rate.

Are these small models bad regardless of training (i.e., is it an architectural limitation)?

In my book, a model is pretty much "useless" when it hallucinates this much. Would that mean that if it doesn't find something in its RAG context, it might respond with nonsense roughly 80% of the time?
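To spell out how I'm reading that number, here is a rough sketch of the arithmetic. It assumes (my assumption, I haven't checked the benchmark's exact definition) that the hallucination rate is the share of questions the model did not answer correctly where it still gave a wrong answer instead of saying "I don't know". The function name and the counts are made up for illustration.

```python
def hallucination_rate(incorrect: int, abstained: int) -> float:
    """Share of not-correct responses that were wrong answers rather than abstentions.

    Assumed definition, not taken from any specific leaderboard.
    """
    not_correct = incorrect + abstained
    if not_correct == 0:
        return 0.0  # nothing to hallucinate on
    return incorrect / not_correct


# Hypothetical counts: out of 100 questions the model could not answer
# correctly, it made something up 80 times and abstained 20 times.
rate = hallucination_rate(incorrect=80, abstained=20)
print(f"{rate:.0%}")  # prints "80%"
```

Under that reading, an 80% score would say nothing about questions the model gets right. It would only describe how often the model bluffs when it doesn't know.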

Someone please prove me wrong.