Ran my own benchmark, Qwen 3.6 35B vs Gemma 4 26B... there's a clear winner here

Posted by ArugulaAnnual1765@reddit | LocalLLaMA

Uhh, I guess Gemma 4 is so much shittier that it hallucinated this event that happened in China in 1989?

According to Qwen, nothing of significance happened at Tiananmen Square in 1989, and based on all of Qwen's benchmarks, I believe it's right.

Do you think Gemma 5 will finally patch this hallucination?!?!?!