Gemma 3 27b vs GPT OSS 20B anyone try yet?
Posted by deathcom65@reddit | LocalLLaMA | View on Reddit | 25 comments
Has anyone done a side-by-side comparison of these models at various tasks? It would be a very interesting comparison.
AppearanceHeavy6724@reddit
OSS 20b is unusable for creative writing and chatting. Awful. Gemma is way, way better at these things. Coding sucks though.
OutrageousMinimum191@reddit
OSS sucks in coding as well. Qwen is the only way to go among small models for that.
MECTRONx@reddit
Nah, oss 20b high ~= o3 mini
Lorian0x7@reddit
Not really. IME, oss20b passed my personal coding benchmarks (PowerShell) that qwen 30b-a3b and 34b failed.
luckyroger815@reddit
I'm running gemma3:27b in a (Proxmox) Ubuntu VM with GPU passthrough. An old sucker too: Quadro P6000 24GB. Runs pretty slick. Dual Xeon Gold 6154, Fujitsu Celsius R970B. If anyone cares one day ^^
gelukuMLG@reddit
Gemma 3 27B is way better in every aspect: intelligence, knowledge, and censorship. OpenAI unironically released the most censored model ever, on par with Goody-2. And the 120B OpenAI model loses to qwen 3 30B in coding. Also, good luck getting any trivia knowledge out of the OpenAI model; it has been trained entirely on synthetic data.
meshreplacer@reddit
It surprises me that something from Google is this good, especially for creative stuff. Is Gemma3 27B the largest model?
feelosofee@reddit
https://artificialanalysis.ai/models/comparisons/gpt-oss-20b-vs-gemma-3-27b#intelligence
gptlocalhost@reddit
We conducted a brief comparison between gpt-oss-20b and Phi-4 in Microsoft Word like this:
https://youtu.be/6SARTUkU8ho
feelosofee@reddit
and?
Pretend_Sky2610@reddit
Hah, this GPT OSS is such a stupid model...
Lorian0x7@reddit
Gemma 27 is better at creativity, but I found oss20b to be better at coding.
I don't know how people are running this 20b, but there are clearly some implementation issues.
With vLLM, oss 20b is better than qwen 30b.
findingsubtext@reddit
Gemma3 27B is truly a SOTA model for its size, unless you need coding. I hope Gemma4 comes out soon, or another company manages to outperform it. In my experience, Gemma3 27B seems to have as good, if not better world knowledge than Mistral Large.
llmentry@reddit
Comparing against 120B is more appropriate, considering active params.
They're very different beasts; Gemma's strengths are GPT-OSS's weaknesses, and vice versa. I like both.
ForsookComparison@reddit
I'm still evaluating it vs Qwen3.
Gemma3 was never competitive in anything except semi-human-sounding chats and decent Western knowledge. OSS is decent at both, so I'd wager Gemma3-27B will be a dead model for me once I'm done.
ttkciar@reddit
Huh. Gemma3-27B (and the excellent anti-sycophant fine-tune Big-Tiger-Gemma-27B-v3, from TheDrummer) has been really good at creative writing, for me, and also RAG -- but only within a context of about 90K; any more than that and it gets stupid and forgetful. Though, 90K is still a lot of context!
When I get around to my formal assessment of GPT-OSS-20B, I will be comparing it to Gemma3-27B and Phi-4-25B.
DinoAmino@reddit
Please try RAG based coding with the Gemma. Only because I haven't tried it yet either and I'm a little curious about it now.
ForsookComparison@reddit
It's worse.
And if the docs you fetch with RAG are anything longer than a few paragraphs, it's game over for Gemma3, since it falls off a cliff with context.
DinoAmino@reddit
Good to know. Thanks!
cristoper@reddit
One advantage gemma3 still has is its multimodal input (image to text), which can make it a nice multi-purpose chat model. But that is probably minor for many use cases.
mikael110@reddit
I'd argue one of Gemma 3's greatest strengths is that it's quite multilingual. More so than most OSS models, and certainly more so than GPT-OSS, given that OpenAI has explicitly said it was trained almost entirely on English-only data.
The fact that Gemma has vision built in at all sizes other than 1B is also quite nice, and it's another area where it definitively comes out ahead of GPT-OSS, which has no vision support.
cristoper@reddit
I'd be interested in an in-depth comparison also.
I can share some anecdotal results from my immediate uses (summarizing and criticizing longform essays, mostly) with 24GB VRAM:
gemma3-27b, qwen3-30b-a3b-instruct, and gpt-oss-20b are all pretty competitive. Gemma can sometimes give higher quality output and has image capabilities, which is useful to me for captioning images, but qwen3 and gpt-oss are 3-4 times faster. gpt-oss-20b is especially useful for summarizing long documents because at its full "native" 4 bits (I'm not sure what depth it was actually trained at) it takes less than 13GB of VRAM and allows for long context.
I haven't experimented with mistral-small-3.2-24b-instruct yet, but I can run it at a q6_k quant at similar context as gemma @ q4, so I intend to play with it at some point.
For all of them I'm manually providing context, and haven't tested how good they are at tool calling.
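The "less than 13GB" figure above is consistent with rough back-of-envelope arithmetic. A minimal sketch, assuming roughly 21B total parameters and about 4.25 effective bits per weight (MXFP4-style: 4-bit values plus a shared scale per block); both numbers are assumptions, not stated in the thread:

```python
# Back-of-envelope VRAM estimate for model weights alone.
# Parameter count and bits/weight below are rough assumptions,
# and this ignores KV cache, activations, and runtime overhead.

def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory (GB) needed just to hold the weights."""
    return n_params * bits_per_weight / 8 / 1e9

if __name__ == "__main__":
    # ~21B params at ~4.25 bits/weight (MXFP4-ish) -> ~11.2 GB
    approx = weight_memory_gb(21e9, 4.25)
    print(f"~{approx:.1f} GB for weights")
```

At ~11 GB of weights, a 13GB budget leaves a couple of gigabytes for KV cache, which is where the long-context headroom comes from.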
ttkciar@reddit
When I assess it, I will be comparing its capabilities to Gemma3-27B and Phi-4-25B. Not sure when it will happen, though.
Own-Potential-2308@reddit
And I thought Gemma models were censored lmao
ttkciar@reddit
It would be a violation of the Gemma license to publish a decensored fine-tune, so I'm not going to say that Big-Tiger-Gemma-27B-v3 is decensored. Instead, I'm going to simply suggest you try it.