My thoughts on Qwen and Gemma
Posted by Internal-Thanks8812@reddit | LocalLLaMA | 34 comments
This spring has been really hot for local LLMs, since both giants, Qwen and Gemma, released major models.
I'm really excited about these releases and happy with their capabilities.
Both are real heroes for local LLM, though I feel they have different strengths.
For background, I use them for text review and grammar checking in the humanities/social sciences, plus some coding in Python (mostly light data analysis stuff), web apps (JS, TS), and general tasks.
I use the 27/31B dense and 35/26B MoE models; I haven't tried the smaller ones much.
Qwen
Strength
- Its knowledge and the way it approaches problems in STEM areas.
- Coding. It was already better, but with 3.6, its coding is far superior to Gemma's.
Weakness
- Non-English languages. I feel it gets dumber when the text/conversation isn't in English. I guess it does well in Chinese, but since I can't read Chinese, I have no clue.
- I feel it sometimes tends to be too "logical" or "hard-headed" for my field.
Gemma
Strength
- Flexible in its way of thinking, though it is also sometimes "fuzzy". For my use, it is often better suited than Qwen.
- Non-English languages. Unlike Qwen, it doesn't degrade in other languages.
Weakness
- Coding. 4 is much better than 3, but still way behind Qwen.
- Images. Qwen is better at image recognition.
- Tool use. I guess it's not a problem with the model itself; I feel the inference engines still lack optimization for it. Maybe the architecture is too complicated? I have no idea.
Bias
Both have biases in different ways/directions, especially on political/cultural topics. Since I believe a truly "neutral" model is impossible in general, I always keep this in mind. But I feel Qwen has moved toward neutrality since 3.5 (before that, it was much more biased in my opinion), now with neutrality similar to Gemma's.
They still hallucinate occasionally and are sometimes dumb, but I think that's also good for me, since I still need to use my own brain/hands to cover for them and avoid getting Alzheimer's.
Both are open weight, and I'll continue using them case by case.
My usage isn't that heavy, so I may be missing something; this is just my opinion/feeling.
What are your thoughts? I'm curious.
Interesting_Key3421@reddit
can you give examples of Qwen's weaknesses? to me it's just fine
mr_tolkien@reddit
For English/Japanese translation Gemma4 31b is much better than Qwen. It’s really not even close.
a_beautiful_rhind@reddit
Poor at creative writing. Excessive thinking. Too many refusals by default.
Interesting_Key3421@reddit
Ok. I didn't test creative writing, but normal writing and summarization seem fine to me. Thinking can be controlled with a reasoning budget.
a_beautiful_rhind@reddit
Unless a model is hopeless, of course you can work around its flaws.
Ngoalong01@reddit
Can we make a "Qwen3.6_Gemma4_Opus..." merge or something like before :)) Qwen for knowledge, language, and logic...
GrungeWerX@reddit
God, please no…those opus finetunes are trash.
MuzafferMahi@reddit
I disagree. The performance degradation is unnoticeable to me. What quant do you use them at?
GrungeWerX@reddit
Q5/Q6.
I already deleted them and won't be going back. There was clearly a drop in quality in all my tests, especially over long context.
Internal-Thanks8812@reddit (OP)
I guess Qwen is good at coding because it leans more heavily on logic, and Gemma vice versa. Just my guess.
jonnaybb@reddit
Which one do you use for grammar so I can avoid it?
gitsad@reddit
Using local LLMs is still exotic. If you don't have strong hardware behind them, you basically can't use them, and most people just don't. Until small models become smart enough, I don't see the reason to use them. I don't want to pay 20-30k for hardware just to automate my email drafts.
ayylmaonade@reddit
I can literally run these models on my near-4 year old iPhone. Where on earth are you getting that 20-30k number from? A typical high end gaming rig can run both of these models at Q4 with long context and great performance.
Far-Low-4705@reddit
You can automate your email drafts with a laptop from 10 years ago…
jacobcantspeak@reddit
Mfw local models on my iPhone can “automate my email drafts” and probably could 2 years ago too..? Have you even touched a local model since 2024?
gitsad@reddit
it was simple hyperbole
One_Key_8127@reddit
Get the most basic Mac Studio M1 Max for $1.5k (used) and run Gemma or the Qwen MoE; I think you'll get about 30 tps tg / 500 tps pp at Q4 quants. Very usable speeds, small form factor, extremely low power draw both idle and under load. You will automate your email drafts with these models; they're a real deal.
gitsad@reddit
Okay, it was a little too hyperbolic. I should have written this differently. What I mean is that I tried it on my local M3 Pro MacBook Pro. The model worked, but it was too slow. I'm not upset about that; however, I live in Poland. The MacBook M3 Pro cost 12k in my local currency. It's not cheap. The Mac Studio you mentioned cost about 20k in local currency back when it was released. Now I might find one second-hand for about 5k, I guess. I misjudged how cost varies by region, so sorry for that.
Nevertheless, 5k PLN still isn't cheap, but it changes things for sure.
PaceZealousideal6091@reddit
I'm sorry, but your idea of local LLMs is hugely misplaced, mostly due to a lack of research and/or effort. As far as I know, the M3 Pro starts with a minimum of 18GB unified memory. With that you can run Qwen 3.5 9B at Q4, if not more. That's no slouch; it's maybe 95% as good as the 35B model. So, yeah, it's just your ignorance and lack of effort, not really a local LLM or cost issue.
Free-Combination-773@reddit
You don't need so much money just to automate email drafts
Alarming_Positive_59@reddit
OK dario
GrungeWerX@reddit
???????????????????????????
nickm_27@reddit
Gemma is way better at following tool-use and formatting instructions (non-code). Using these models for a voice assistant, Qwen doesn't follow instructions on how to ask for clarification, confirm actions, etc. Gemma4 is very good at this and also calls tools accurately without any issues.
Fangsong_Long@reddit
Qwen3.6 35B A3B also tends to overthink, at least with the default settings in Ollama.
jojorne@reddit
gemma's image recognition is good, but the problem is that it carries too many details that get trimmed. you have to increase the default token budget for images (`--image-min-tokens`).
kaeptnphlop@reddit
Thanks for the hint.
I might have to rerun a test where it mistook the front of a truck for the back. Dashcam footage.
I had put Gemma aside with a note saying "probably a config issue"; it had a rocky start after all.
legit_split_@reddit
Increase it to what value?
Sadman782@reddit
300 is enough; 512 at most.
Internal-Thanks8812@reddit (OP)
I didn't know that. Thx! I use LM Studio for daily use; I'm too lazy to set things up with a CLI.
It seems there's still a lot of headroom to draw out their true power with llama.cpp, etc.
I will try!
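For reference, a minimal sketch of how the flag discussed in this sub-thread might be passed when serving a multimodal Gemma GGUF with llama.cpp's llama-server. The model and mmproj file names are placeholders, and whether `--image-min-tokens` is available depends on your llama.cpp build; treat this as an illustration, not an exact recipe:

```
# Hypothetical invocation; model/mmproj paths are placeholders.
# Raising the minimum image token budget keeps fine image details
# from being trimmed during vision preprocessing.
llama-server \
  -m ./gemma-4-27b-it-Q4_K_M.gguf \
  --mmproj ./gemma-4-mmproj.gguf \
  --image-min-tokens 512 \
  -c 8192 \
  --port 8080
```

Values around 300-512, as suggested above, would be the range to start experimenting with.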
logic_prevails@reddit
This English is cracking me up dude, but also I agree good post 😂
GrungeWerX@reddit
I mostly agree with this.
Internal-Thanks8812@reddit (OP)
what is your use case?
GrungeWerX@reddit
Light coding and analysis
jojorne@reddit
gemma's image recognition is good, the problem is that it carries too much details that gets trimmed. you have to increase the default tokens for image (
--image-min-tokens).