I think my Gemma4 is having a breakdown
Posted by MrSilencerbob@reddit | LocalLLaMA | View on Reddit | 20 comments
kymigreg@reddit
The more I use local models the more I think llama.cpp with GGUF smart quants is the ONLY way to not encounter ridiculous issues like these. For example, MLX quants for Gemma are hilariously broken right now, to the point of not even responding to the prompt but continuing the pattern ("What is 2+5?" responds with "What is 5+10?")
Hedede@reddit
I'm using llama.cpp (llama-server) with GGUF. It keeps messing up tool calls and getting stuck in a loop.
MushroomCharacter411@reddit
And I keep hitting a glitch where it outputs exactly one token and then just spins its wheels. I have to stop it and delete and re-submit the last prompt to get it back on track. It's still a lot smarter than Qwen 3.5-35B-A3B, and Qwen 3.5 was a moderate bit smarter than Qwen 3. Unfortunately, the one-token hang happens multiple times a day so I'm not inclined to give it Agent capabilities just yet. It swears it can write me a watchdog timer to kick it in the pants, but I'd rather wait for the problem to be fixed properly (it really feels like a llama.cpp bug).
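The watchdog the commenter mentions could be sketched roughly like this: wrap the token stream in a timer and bail out if no new token arrives within a deadline, so a stalled generation can be cancelled and resubmitted automatically. This is an illustrative sketch, not anything from llama.cpp itself; the function and exception names are made up for the example.

```python
import queue
import threading


class StallError(RuntimeError):
    """Raised when the token stream stops producing output."""


def watch_stream(token_iter, stall_timeout=5.0):
    """Yield tokens from token_iter, raising StallError if no token
    arrives within stall_timeout seconds (a simple watchdog)."""
    q = queue.Queue()
    done = object()  # sentinel marking the end of the stream

    def pump():
        # Drain the underlying stream into the queue from a worker thread.
        for tok in token_iter:
            q.put(tok)
        q.put(done)

    threading.Thread(target=pump, daemon=True).start()
    while True:
        try:
            item = q.get(timeout=stall_timeout)
        except queue.Empty:
            raise StallError("stream stalled; cancel and resubmit the prompt")
        if item is done:
            return
        yield item
```

A caller would wrap the server's streaming response in `watch_stream(...)` and, on `StallError`, abort the request and resend the last prompt.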
MushroomCharacter411@reddit
One of the very first things I asked Gemma 26B-A4B (after the car wash test, which it passed easily) was the cutoff date of its training data, and it said "early 2024". I've been discussing Artemis with it all day, and even sent it pictures, which caused it to become philosophical about how our lives (including its own hypothetical life) are like space missions. I also regularly inform it of the passage of time between prompts, so it knows where we are on the calendar. Those two things together (making it think about its training cutoff, and being informed of the current date every time it changes) might be causing my Gemma instance to just accept my statements about the last two years as factual.
Objective-Stranger99@reddit
It's a very new model. Fixes will arrive soon.
send-moobs-pls@reddit
Idk where all the hype was coming from when all I'm seeing everywhere is all these issues that still need to be fixed. Everyone parading that Gemma was better than qwen 3.5 was definitely not actually using the things lmao
AlwaysLateToThaParty@reddit
But that's future-gemma. It should already have the fixes applied.
ParthProLegend@reddit
I like the way you talk. Can you be my future gemma?
AlwaysLateToThaParty@reddit
Already am good buddy. Already am.
waitmarks@reddit
It’s really insistent on the date unless it can call a tool that gives it the current date. It really will not believe the user on what date it is. If this is open webui, switch the model to native tool calling and it should automatically have a timestamp tool available to it and figure out the correct date.
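For context, a timestamp tool exposed through native tool calling might look roughly like this in the common OpenAI-style function-calling format. The schema and names here are illustrative guesses; Open WebUI's built-in tool may be named and shaped differently.

```python
from datetime import datetime, timezone

# Hypothetical tool schema in the OpenAI-style function-calling format.
CURRENT_TIME_TOOL = {
    "type": "function",
    "function": {
        "name": "get_current_datetime",
        "description": "Return the current date and time in UTC (ISO 8601).",
        "parameters": {"type": "object", "properties": {}},
    },
}


def get_current_datetime() -> str:
    """Handler the frontend runs when the model calls the tool."""
    return datetime.now(timezone.utc).isoformat()
```

The point is that the date reaches the model as a tool result rather than as a user claim, which (per the comments below) is the only source Gemma seems willing to trust for it.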
FrostTactics@reddit
(I'm just semi-facetiously speculating here, don't take this comment too seriously.)
You know, heavily insisting on the date might counter-intuitively be a good sign for general intelligence. Since it has access to quite a few dates up until its data cut-off and none after, a random user claiming the date is several months in the future should be disconcerting.
waitmarks@reddit
I think it's more of an aggressive alignment thing. The model was trained to only trust official tool calls for certain facts. If it gets the date from the tool call it believes it right away no questions asked.
VoiceApprehensive893@reddit
am i the only one with a stable experience?(except for a hallucinated dalle tool)
TamSchnow@reddit
Had a funny issue with MLX version of Qwen3 vl 4b. It just kept running into a loop when any context required an image. And as quickly as it appeared, it disappeared.
Electronic-Metal2391@reddit
I have this exact issue when Roleplaying with the heretic variant, no matter the sampling or system prompt.
FluoroquinolonesKill@reddit
Yeah Gemma was not having it when I tried to tell it what today’s date is. That seems like something any model should be able to accept. Hopefully it gets ironed out.
audioen@reddit
I've not seen LLMs having problems with believing the user about today's date since the early days of Bing, which would also enter into massive gaslighting loops and tell the user that they were hallucinating and trying to deceive it, etc. I recall one instance where it told a user that their phone probably had a virus that had changed the date on the phone. It's good to see that Google is paying a nod to the classic LLM problems still in 2026.
More seriously, if these are not due to inference or chat template problems, these models are pretty crappy.
anomaly256@reddit
What was your original prompt? I'd like to see if I can reproduce
FatheredPuma81@reddit
Sampling settings?
bonobomaster@reddit
Honey, be nice!