My thoughts on Qwen and Gemma
Posted by Internal-Thanks8812@reddit | LocalLLaMA | 34 comments
This spring has been really hot for local LLMs, since both giants, Qwen and Gemma, released major models.
I'm really excited about these releases and happy with their capabilities.
Both are real heroes for local LLM, though I feel they have different strengths.
For background, I use them for text review and grammar checking in the humanities/social sciences, plus some coding in Python (mostly light data analysis stuff), web apps (JS, TS), and general tasks.
I use the 27/31B dense and 35/26B MoE models; I haven't tried the smaller ones much.
Qwen
Strength
- Its knowledge and the way it approaches problems in STEM areas.
- Coding. It was already better, but with 3.6, its coding is far superior to Gemma's.
Weakness
- Non-English languages. I feel it gets dumber when the text/conversation isn't in English. I guess it does well in Chinese, but since I can't read Chinese, I have no clue.
- I feel it sometimes tends to be too "logical" or "hard-headed" for my field.
Gemma
Strength
- Flexible in its way of thinking, though it is also sometimes "fuzzy". For my use, it is often better suited than Qwen.
- Non-English languages. Unlike Qwen, it doesn't degrade in other languages.
Weakness
- Coding. 4 is much better than 3, but still way behind Qwen.
- Images. Qwen is better at image recognition.
- Tool use. I guess it's not a problem with the model itself; I feel the inference engines still lack optimization for it. Maybe the architecture is too complicated? I have no idea.
Bias
Both have biases in different ways/directions, especially on political/cultural topics. Since I believe a truly "neutral" model is impossible in general, I always keep this in mind. But I feel Qwen has moved toward neutrality since 3.5 (before that, it was much more biased in my opinion), now with neutrality similar to Gemma's.
They still hallucinate occasionally and are sometimes dumb, but I think that's also good for me, since I still need to use my own brain/hands to cover for them and avoid getting Alzheimer's.
Both are open weight, and I'll continue using them case by case.
My usage isn't that heavy, so I may be missing something; this is just my opinion/feeling.
What are your thoughts? I'm curious.
Interesting_Key3421@reddit
can you give examples of Qwen's weaknesses? to me it's just fine
mr_tolkien@reddit
For English/Japanese translation Gemma4 31b is much better than Qwen. It’s really not even close.
a_beautiful_rhind@reddit
Poor at creative writing. Excessive thinking. Too many refusals by default.
Interesting_Key3421@reddit
Ok. I didn't test creative writing, but normal writing and summarization seem fine to me. Thinking can be controlled with a reasoning budget.
a_beautiful_rhind@reddit
Unless a model is hopeless, of course you can work around its flaws.
Ngoalong01@reddit
Can we make a "Qwen3.6_Gemma4_Opus..." merge or something like before :)) Qwen for knowledge, language, and logic...
GrungeWerX@reddit
God, please no…those opus finetunes are trash.
MuzafferMahi@reddit
I disagree. The performance degradation is unnoticeable to me. What quant do you use them at?
GrungeWerX@reddit
Q5/Q6.
I already deleted them and won't be going back. There was clearly a drop in quality in all my tests, especially over long context.
Internal-Thanks8812@reddit (OP)
I guess Qwen is good at coding because it leans more heavily on logic, and Gemma vice versa. Just my guess.
jonnaybb@reddit
Which one do you use for grammar so I can avoid it?
gitsad@reddit
Using local LLMs is still exotic. If you don't have strong hardware behind them, you basically can't use them, and most people just don't. Until small models become smart enough, I don't see the reason to use them. I don't want to pay 20-30k for hardware just to automate my email drafts.
ayylmaonade@reddit
I can literally run these models on my near-4 year old iPhone. Where on earth are you getting that 20-30k number from? A typical high end gaming rig can run both of these models at Q4 with long context and great performance.
Far-Low-4705@reddit
You can automate your email drafts with a laptop from 10 years ago…
jacobcantspeak@reddit
Mfw local models on my iPhone can “automate my email drafts” and probably could 2 years ago too..? Have you even touched a local model since 2024?
gitsad@reddit
it was simple hyperbole
One_Key_8127@reddit
Get the most basic Mac Studio M1 Max for $1.5k (used) and run Gemma or the Qwen MoE; I think you'll get about 30 tps tg / 500 tps pp at Q4 quants. Very usable speeds, small form factor, extremely low power draw both idle and under load. You will automate your email drafts with these models; they're a real deal.
gitsad@reddit
Okay, it was a little too hyperbolic. I should have written this differently. What I mean is that I tried it on my local M3 Pro MacBook Pro. The model worked, but it was too slow. I'm not upset about that; however, I live in Poland. The MacBook M3 Pro cost 12k in my local currency. It's not cheap. The Mac Studio you mentioned cost about 20k in local currency back when it was released. Now I might find one second-hand for about 5k, I guess. I misjudged how cost varies by region, so sorry for that.
Nevertheless, 5k PLN still isn't cheap, but it changes things for sure.
PaceZealousideal6091@reddit
I'm sorry, but your idea of local LLMs is hugely misplaced, mostly due to a lack of research and/or effort. As far as I know, the M3 Pro starts with a minimum of 18GB unified memory. With that you can run Qwen 3.5 9B at Q4, if not more. That's no slouch; it's maybe 95% as good as the 35B model. So, yeah, it's just your ignorance and lack of effort, not really a local LLM or cost issue.
Free-Combination-773@reddit
You don't need so much money just to automate email drafts
Alarming_Positive_59@reddit
OK dario
GrungeWerX@reddit
???????????????????????????
nickm_27@reddit
Gemma is way better at following tool-use and formatting instructions (non-code). Using these models for a voice assistant, Qwen doesn't follow instructions on how to ask for clarification, confirm actions, etc. Gemma4 is very good at this and also calls tools accurately without any issues.
Fangsong_Long@reddit
Qwen3.6 35B A3B also tends to overthink, at least with the default settings in Ollama.
jojorne@reddit
gemma's image recognition is good, but the problem is that it carries too many details that get trimmed. you have to increase the default token budget for images (`--image-min-tokens`).
kaeptnphlop@reddit
Thanks for the hint.
I might have to rerun a test where it mistook the front of a truck for the back. Dashcam footage.
I had put Gemma aside with a note saying "probably a config issue"; it had a rocky start after all.
legit_split_@reddit
Increase it to what value?
Sadman782@reddit
300 is enough; 512 at most.
Internal-Thanks8812@reddit (OP)
I didn't know that. Thx! I use LM Studio for daily use; I'm too lazy to set things up with a CLI.
It seems there's still a lot of headroom to draw out their true power with llama.cpp, etc.
I will try!
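For reference, a minimal sketch of how the flag discussed in this sub-thread might be passed when serving a multimodal Gemma GGUF with llama.cpp's llama-server. The model and mmproj file names are placeholders, and whether `--image-min-tokens` is available depends on your llama.cpp build; treat this as an illustration, not an exact recipe:

```
# Hypothetical invocation; model/mmproj paths are placeholders.
# Raising the minimum image token budget keeps fine image details
# from being trimmed during vision preprocessing.
llama-server \
  -m ./gemma-4-27b-it-Q4_K_M.gguf \
  --mmproj ./gemma-4-mmproj.gguf \
  --image-min-tokens 512 \
  -c 8192 \
  --port 8080
```

Values around 300-512, as suggested above, would be the range to start experimenting with.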
logic_prevails@reddit
This English is cracking me up dude, but also I agree good post 😂
GrungeWerX@reddit
I mostly agree with this.
Internal-Thanks8812@reddit (OP)
what is your use case?
GrungeWerX@reddit
Light coding and analysis
jojorne@reddit
gemma's image recognition is good, the problem is that it carries too much details that gets trimmed. you have to increase the default tokens for image (
--image-min-tokens).