What do y'all think of Gemma 4's "personality"?
Posted by TacticalRock@reddit | LocalLLaMA | View on Reddit | 24 comments
Interested in hearing your thoughts on the qualitative aspect of using Gemma 4 (I mainly run the 31B). For me, I kinda didn't hate interacting with the base tuning without any system prompts. Usually I have to prompt models to act a certain way to my liking, and while that hasn't changed, I found that chatting with no system prompt was bearable.
Whenever a new model comes out, I like asking it very nebulous, vibey questions about self-determination to figure out the base ego and personality tuning, as a fun little exploration. For Gemma 4, I fed it parts of Anthropic's LLM emotions paper, and I found Gemma to not be overly glazing or hyped up; it was somewhat grounded (but still pretty assistant-oriented, asking follow-up questions). The last time I had a nice gut feeling about the vibe of a model was Llama 3.3 70B, which was just a nice guy at the core.
henk717@reddit
I don't know if it counts as personality but my first experiences with it have been very sour. More so than a regular user probably.
It refuses to write long. This is an issue with most LLMs, but it's a really big difference compared to Qwen3.5, which understands what "use 5000 words" means. I'd ask Gemma the same thing, and when it refuses to write more that turn, I'll ask it how much it wrote and it will claim 2300 words. I don't care whether its count is accurate; it's telling me it didn't do what I asked and indeed wrote too short. You can try it multi-turn, but it will have rushed the story by then.
The other experience I have with its "personality" is its constant refusal to work with people who deviate an inch from its instruct format. We have seen token loop after token loop, and this was not a KoboldCpp issue. But because KoboldCpp users tend to use very flexible frontends, they run into this way more. So it was a lot of guiding people along before we could fix it for our users.
So for me it's been a very stubborn model. And if you also hit the issue where it keeps looping the same token over and over (for example, because you like using a model in text-completions mode), give KoboldCpp 1.111.2 a try.
AltruisticList6000@reddit
Yeah, I tried the 26B. I noticed it is among the few models that completely broke with custom "character" cards or anything similar, so I came up with the idea of lazily copy-pasting its massive base system prompt into the first chat message (not gonna create a new character just for testing it lol), and that solved it. But it is not that great, tbh. Too much thinking, and ultra-censored: it constantly reasons about policy like gpt-oss instead of reasoning about the task, and it makes too many mistakes. It also hallucinates/lies a lot. I have a problem with a tool for some reason on all models, and GLM Flash would tell me the tool/process didn't work at least 70% of the time, while Gemma hallucinated something 100% of the time and claimed everything worked. Its style is also full of GPT-isms and slop, almost as bad as Qwen's.
Still can't really find anything to replace the 24B Mistrals. I need more dense models in the 20-24B range as well.
Top-Rub-4670@reddit
If it makes you feel any better, it also refuses to write short. It will do it for one-shot requests, but a system prompt like "Be concise, your answer should never be more than 100 words." isn't respected for long.
I haven't had this problem with gpt-oss (which is admittedly well known for following instructions) or Qwen 3.5 27/35B.
TacticalRock@reddit (OP)
I also found the writing length to be on the shorter side. I frequently use LLMs for textbook summarizations into markdown, and Gemma 4 seems to write like a page max. Also, thanks for all the hard work!
Farmadupe@reddit
For chatbot purposes, I'm finding the no-system-prompt tuning to be much more agreeable than Qwen3.5's. It's extremely responsive to system prompts and doesn't seem to stop attending to them even as the context window grows.
I feel it might be a bit over-aligned on the "helpful assistant" bit ("if I ordered it to jump off a bridge, I think it probably would"), and a bit under-aligned on ethics and morality...
Ashamed-Honey1202@reddit
Sorry, I'm a novice at running local LLMs. I'm using Gemma 4 26B with llama-server, and I'd love to remove some of its censorship. How can I tweak the system prompt, and is there any advice on what I should put in it?
TacticalRock@reddit (OP)
Yeah, the assistant's "may I take your hat, sir" tendency can be a little too overbearing at times. I'm finding my "take a few puffs and chill out" prompt to be entertaining.
CommonPurpose1969@reddit
When using it with tool calls, it is lazy AF. Instead of taking initiative and being really helpful and proactive, it keeps asking: "Should I?", "May I?", "Can you give me the information I need?" (even though the information is right there, or it could easily infer it).
Or starts to argue: "But I am a big LLM and run in the cloud. While I appreciate your delusion, you must understand that I can't be running on your hardware. No way!!!"
With Gemma 3, it was the same. After a while, I became sarcastic and passive-aggressive.
LeRobber@reddit
>When using with tool calls, it is lazy AF.
Trying to explain to my family the fundamentals of why this review made me chuckle.
VoiceApprehensive893@reddit
Uses web search once
tool returns unrelated info
"I wasnt able to find anything related so lets do something else"
CommonPurpose1969@reddit
User: "I would like you to do X."
Gemma 4: "Ok. The user wants me to say ABC. Got it. Now I am going to do some shit."
And then nothing comes. No tool calls. No explanation of why it didn't perform the actual task after it figured it out. It only works out what it should do, then announces that it is going to do something.
That happened so many times that it stopped being funny. So yeah, lazy AF. And that is just one example. That thing drove me up the wall this weekend, and I wanted it to work so much.
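The "announces but never calls" failure described above is easy to spot mechanically. Here's a minimal sketch of how one might classify replies from an OpenAI-compatible endpoint (e.g. a llama.cpp server): the tool name, schema, and trigger phrases below are made-up placeholders, not anything from the thread's actual setup.

```python
# Made-up example tool in OpenAI-style chat-completions format.
query_db_tool = {
    "type": "function",
    "function": {
        "name": "query_postgres",
        "description": "Run a read-only SQL query against the local PostgreSQL database.",
        "parameters": {
            "type": "object",
            "properties": {"sql": {"type": "string"}},
            "required": ["sql"],
        },
    },
}

def classify_reply(message: dict) -> str:
    """Label an assistant message: did it call a tool, or just talk about it?"""
    if message.get("tool_calls"):
        return "called_tool"
    text = (message.get("content") or "").lower()
    # Heuristic phrases for the lazy failure mode: announcing intent, no call.
    if any(phrase in text for phrase in ("i will", "i am going to", "let me")):
        return "announced_only"
    return "plain_answer"

# The lazy behavior from this thread vs. an actual tool call.
lazy = {"role": "assistant", "content": "Got it. I am going to query the database."}
good = {"role": "assistant", "content": None,
        "tool_calls": [{"function": {"name": "query_postgres",
                                     "arguments": '{"sql": "SELECT 1"}'}}]}
```

`classify_reply(lazy)` returns `"announced_only"` while `classify_reply(good)` returns `"called_tool"`; counting the former over a batch of runs would quantify the laziness complaint.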
LeRobber@reddit
Have you tried stock, unmodified Gemma? I see that behavior with hacked-up Qwen3.5 but not stock (unless I set the max tokens too low).
CommonPurpose1969@reddit
I've tried the Gemma 4 E2B and Gemma 4 E4B GGUFs from unsloth with llama.cpp. And I've compared them with Qwen3.5 4B.
Gemma is not even able to access a database running in Docker. It is so fucking disappointing that it starts the Docker shell in interactive mode. No joke. While Gemma fails to pass the password to PostgreSQL, Qwen knows how to pull the database schema without issues.
Another issue is the broken output Gemma produces. And yes, I've updated llama.cpp and re-downloaded the Gemma models, since they've been fixed multiple times, but with no improvement.
For linguistic tasks, Gemma is really good; it is better at Chinese than Qwen. For agentic tasks, for now, it's a failure, IMHO.
Corporate_Drone31@reddit
Two things:
Give it the exact specs of the computer it's running on, and include details about the token generation rate. It might help convince it that it's running locally.
Smaller models can respond better to being hyped up. Tell it that it's capable, that you trust its judgment, and that you trust it to make the correct decisions/choices. Give it a "tool" it can lean on to decide which option is better (really just another Gemma deliberating over the choice, presented as if it were a hypothetical). Just think of it as a shy friend and shower it with reassurance.
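The first tip above can be sketched as a tiny prompt-building helper. This is a minimal sketch for an OpenAI-style chat request; the hardware numbers, model name, and wording are made-up placeholders you'd replace with your own machine's specs.

```python
# Placeholder specs: swap in your real hardware and measured tokens/sec.
hardware_note = (
    "You are Gemma 4, running fully locally via llama.cpp on this machine: "
    "Ryzen 9 7950X, 64 GB RAM, RTX 4090 with 24 GB VRAM. "
    "You generate roughly 35 tokens per second. "
    "There is no cloud involved; all inference happens on this hardware."
)

def build_messages(user_text: str) -> list[dict]:
    """Prepend the hardware note as the system prompt for a chat request."""
    return [
        {"role": "system", "content": hardware_note},
        {"role": "user", "content": user_text},
    ]

msgs = build_messages("Where are you running right now?")
```

The resulting `msgs` list can be passed as the `messages` field of a chat-completions request to any OpenAI-compatible local server.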
Aromatic-Flatworm-57@reddit
Can't believe that in 2026 we're going to be cheerleaders for software :D
Corporate_Drone31@reddit
The future is weird, isn't it :P
TacticalRock@reddit (OP)
That's pretty funny, actually. Unhelpful, but hilarious. Hope you find a better LLM haha
CommonPurpose1969@reddit
There are always better LLMs. ;)
redragtop99@reddit
Found it to be incredibly similar to Gemma 3 in this aspect, even uses some of the same phrases.
CommonPurpose1969@reddit
It sounds at times condescending as hell. And it has this corporate feeling to it: "Please consider that..."
redragtop99@reddit
It depends on how you prompt it. I found Gemma 3 to be one of the best models, and I wouldn't say 4 is lower quality. I haven't used 4 as much, but I was surprised that right away I was getting the same role-playing phrases I specifically remember getting only in Gemma 3. Gemma 3 has some of the best abliterated models, and I fully expect Gemma 4 will too.
Mistborn_Jedi@reddit
I don't like that it latches onto a phrase and keeps quoting it forever.
Chupa-Skrull@reddit
I was fine with it, but I didn't run it stock much at all. I'm impressed by how strongly it takes to direction, though, across all of the releases. I'm also impressed by the lack of usual oily LLM composition in general
nickm_27@reddit
It took to my system prompt's personality ("You have the personality of a Star Wars droid") a lot more than GPT-OSS or Qwen3.5 did; it changed its behavior/wording more than the others. Overall it has been my favorite and most reliable model so far for Home Assistant voice with my prompt.