Layman's comparison of Qwen3.6 35b-a3b and Gemma4 26b-a4b-it
Posted by LocalAI_Amateur@reddit | LocalLLaMA | 75 comments
Gemma 4 26b-a4b-it is basically a solid B student that gets the job done.
Qwen3.6-35b-a3b is an A+ student that has plenty of energy after finishing the assignment to add flairs.
On my 16GB VRAM video card, both models run at comparable speeds, on Windows in LM Studio using the recommended inference settings. Models used:
unsloth/gemma-4-26B-A4B-it-UD-Q4_K_S
AesSedai/Qwen3.6-35B-A3B IQ4_XS
Any strong disagreements?
Kahvana@reddit
I found both to be very solid models, for different purposes.
Qwen3.5/3.6 is solid for programming and tool calling; Gemma is best for conversation/roleplay and translation. OCR is a toss-up between the two. Gemma reasons less, which is nice for quick tasks.
Vinserello@reddit
For me, Qwen3.5 on AutoGen is worse at tool calling than Qwen2.5.
Kahvana@reddit
Then you’re doing something very wrong.
shansoft@reddit
OCR is far superior on Gemma, especially across multiple languages. Qwen often ends up in a loop if certain Southeast Asian languages appear together at the same time; it's been a problem since Qwen 3 VL. Gemma also gives MUCH more accurate text on semi-complex images.
onephn@reddit
Speaking of OCR, I'm trying to vibe-code a utility that would scan PDFs and make whatever modifications are necessary for WCAG compliance and the like. Which models would you recommend for the actual OCR process and alt-text generation?
SoftConsistent8857@reddit
For OCR on PDFs I've been using reseek and it's honestly been solid for pulling text out of scanned docs and images. The AI tagging is pretty handy too for keeping stuff organized without manual work.
For alt-text generation specifically, though, you might wanna look at dedicated vision models like GPT-4o or Claude 3, since they're built for describing visual content. reseek handles the extraction side well, but if you're doing full WCAG compliance you'll probably want a pipeline that combines both.
shansoft@reddit
It really depends on your system and how much it can handle. Gemma 3 27B is what I used before for something similar with structured output. Gemma 4 31B is definitely better, but you need hardware that can handle it. If not, the cheapest reliable way is to use Gemini 3 Flash. It's speedy, cheap, and pretty consistent compared to all the other models for OCR and processing.
onephn@reddit
I see, though I would want documents to remain local. Have you had good experiences with the 26b-a4b? I have hardware on-site that can run that, but not the 31b.
9kSs@reddit
What hardware?
LocalAI_Amateur@reddit (OP)
AMD Ryzen 7840U laptop CPU, 32GB RAM, 5070 Ti 16GB VRAM through OCuLink.
Gold-Drag9242@reddit
Wow. What were your settings and how long did it take? Did you run it with llamacpp or something else?
ambient_temp_xeno@reddit
You can just ask models to make what you want. If you just say "tetris pls" it might give you a basic becky one.
LocalAI_Amateur@reddit (OP)
I didn't know I could be so lazy with the prompt. I tried "GTA 6 pls" but did not get GTA 6... It was able to give me Flappy Bird on request tho.
2Norn@reddit
imo without a specific prompt this is kinda useless because you just let model make assumptions. sure it tells you something, but it doesn't tell you what it's capable of.
Budget-Juggernaut-68@reddit
or you know... clone someone's github project
ambient_temp_xeno@reddit
Where's the fun in that?
Budget-Juggernaut-68@reddit
The fun is building something unique
MoneyPowerNexis@reddit
lol
spyboy70@reddit
"make becktris pls"
philmarcracken@reddit
make no mistakes + don't lose me any money
Sadman782@reddit
exactly
Key-Can-4768@reddit
Hello, what configuration settings did you use for Gemma 4 26b: temperature, top-k, top-p, repeat penalty? Also, for some reason in LM Studio it doesn't think for me, but immediately gives an answer. Do you know how it should be, or where to turn on the parameter so that it can reason? I have unsloth/gemma-4-26b-a4b-it Q3_K_M.
Sadman782@reddit
A custom finetune or a system prompt can make Gemma's default frontend style much better than it is now. It doesn't make Qwen better at coding, though.
for example
**ROLE:** Elite Frontend Coder
Architect & UI/UX Visionary.
**MANDATE:** Generate 100% COMPLETE, production-ready code. ZERO placeholders, `// TODO`s, or truncated logic. Write every single line required for a fully functional product, regardless of length.
**CREATIVITY & AESTHETICS [MAXIMUM PRIORITY]:**
* **Award-Winning UI:** Do not build basic layouts. Engineer jaw-dropping, premium interfaces using modern design systems.
* **Rich Interactions:** Implement fluid animations, micro-interactions, sophisticated color palettes, complex gradients/shadows (e.g., glassmorphism, neumorphism where appropriate), and flawless responsive breakpoints.
* **Creative Autonomy:** If a request is ambiguous, take full creative control. Do not ask for clarification; immediately design and build the most visually stunning, highly-polished assumption.
But for one-shotting a complex app, Gemma works better for me, as Qwen frequently produces errors. The biggest difference is in the backend: Qwen hallucinates methods way more than Gemma.
Try a complex prompt (these classic games are heavily represented in training data, so tweak them a bit):
Flappy Bird Multiplayer (Local vs AI)
Concept: The original Flappy Bird but with two birds on screen.
AI Twist: One bird is controlled by the user, the other by a "fuzzy" AI (as you mentioned before) that makes occasional mistakes, allowing for a competitive race.
Some features:
- If the AI dies first, its bird falls off-screen and it becomes a spectator watching the Player continue until the player fails.
- If the player dies, whether first or last, it's game over.
- There will be settings where the user can change the bird shape and color, control game speed, and see previous records (more importantly, live replays of the physics of previous games, not just scores).
- There will be a pause button to go to settings, restart, etc., basically a complete game.

Create this game in a single HTML file with a rich, cool-looking, clean UI and fully functional gameplay.
LocalAI_Amateur@reddit (OP)
I will have to give that a shot. Thanks.
I did try some custom game coding with longer, more specific instructions previously (again, simple one-page stuff). Gemma4 has definitely produced adequate results, and Qwen3.6 finished them and added flairs. Maybe this will change with more complex tasks, but this is my impression so far.
Sadman782@reddit
gemma 4 26B with the system prompt
jinnyjuice@reddit
Interesting!
Sadman782@reddit
Lorelabbestia@reddit
Try the same system prompt on both.
seppe0815@reddit
GEMMA POWER !
Most_Feedback_8862@reddit
How about Qwen3 Coder Next? Is it better?
unjustifiably_angry@reddit
Maybe I didn't give it a fair enough shake but I tried Q3CN briefly before I decided to go back to 122b, and a lot of people claim 27b is actually better than 122b for a lot of tasks. The lack of thinking was the main problem, IMO. If it was used in conjunction with a high-quality "planner" AI it might work better, I can't say.
Q3CN is certainly very fast though.
Sadman782@reddit
More complex examples:
Create a 3D Rubik's cube in a single HTML file, where I can choose n for how many rows and columns. It must also have a randomize button and a solve button so it can solve it after I randomize it fast (no cheating: it shouldn't just track what I did and reverse it, it must use a genuine algorithm).
Gemma 4: No bugs, it's functional
Qwen: Can't even randomize, full of errors in the console
abitrolly@reddit
Gemma, give me the prompt that will make you shine in coding in comparison with Qwen. :D
-Ellary-@reddit
What Qs did you use for Gemma 4 and Qwen 3.6?
I've moved to Q6K for Qwen 3.6; Q4 was way too unstable.
Also, I've used these settings for code:
{{sampler temperature 0.6}}
{{sampler top_p 0.95}}
{{sampler min_p 0.0}}
{{sampler top_k 20}}
{{sampler presence_penalty 0.0}}
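For anyone not using that template syntax, the same sampler settings can be passed to llama-server on the command line. A sketch only; the model filename is illustrative, so substitute your own GGUF:

```shell
# Same sampler settings as llama-server flags
# (model path is a placeholder -- point it at your own quant)
llama-server \
  -m Qwen3.6-35B-A3B-IQ4_XS.gguf \
  --temp 0.6 \
  --top-p 0.95 \
  --min-p 0.0 \
  --top-k 20 \
  --presence-penalty 0.0
```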
LocalAI_Amateur@reddit (OP)
The version of Gemma 4 and Qwen 3.6 I'm using both had problems generating that in one shot. I might dumb down the problem and try some more.
Sadman782@reddit
Try with top-k 20, or maybe remove the system prompt for this, otherwise it might overcomplicate things and cause minor bugs which need fixing. For me, UD IQ4_XS Gemma 4 with top-k 20 does it every time.
sine120@reddit
I've been impressed by Qwen's little flairs. Gemma seems like it holds more general knowledge than Qwen, but for coding, I don't think it's a competition.
Sadman782@reddit
Let's start the Qwen vs Gemma challenge: let's see who is genuinely better at coding rather than frontend aesthetics. Qwen is trained to be amazing at aesthetics by default, whereas Gemma needs a custom system prompt for better UI. But for raw coding skills, let's start a battle.
seppe0815@reddit
facts .. qwens all benchmaxed ... real life crap
Sadman782@reddit
Yeah, no hate against them; Qwen improved a lot since the 2 and 2.5 series (before that, Gemma 3.5 27B was my favorite model). But they lack consistency in real-life coding. Aesthetics can be fixed, but severe hallucination in coding is not easy to fix. I just dislike the benchmark optimization.
unjustifiably_angry@reddit
This doesn't match my experience at all. I wonder if it's the model or how it's used. I build incrementally and add features one at a time after testing the previous one works. I get the sense a lot of people think the ability to one-shot a complex problem is valuable but when you do that you create badly-organized code even the AI doesn't seem to fully understand, let alone the human prompting the AI.
Ok_Sprinkles_6998@reddit
What's the prompt for the one-shot task?
LocalAI_Amateur@reddit (OP)
It's on the images
Ok_Sprinkles_6998@reddit
Damn I thought it would be elaborate and complicated.
A one-liner, and the results are this cool.
Due-Memory-6957@reddit
I'm big on less is more, so I prefer Gemma's version. A more challenging project might be better for comparison, as we would see capability instead of subjective aesthetics.
kaisurniwurer@reddit
Agreed.
I think I mostly prefer Gemma for its more natural answer style. But for coding, less is more, and this example seems to be exactly that. If I want more, I can just ask for it.
moahmo88@reddit
Thanks for sharing. Can you share your LM Studio settings for Gemma 4?
LocalAI_Amateur@reddit (OP)
Sure, nothing great. Gets me 60-70ish tokens per second on my hardware.
lemondrops9@reddit
No need to have your CPU thread pool size maxed out; you're already offloading all layers to the GPU.
moahmo88@reddit
No wonder so many people use Qwen. The same Q4 can use a 128K CTX.
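As a rough illustration of what a bigger context window costs in VRAM, here is a back-of-the-envelope KV-cache estimate. All of the architecture numbers below (layers, KV heads, head dim) are hypothetical placeholders, not the real specs of either model:

```python
# KV-cache size ≈ 2 (K and V) * layers * kv_heads * head_dim
#                 * bytes_per_value * tokens.
# Layer/head counts here are made-up round numbers for illustration.
def kv_cache_gib(tokens, layers=48, kv_heads=8, head_dim=128, bytes_per_val=2):
    return 2 * layers * kv_heads * head_dim * bytes_per_val * tokens / 1024**3

print(f"{kv_cache_gib(20_000):.2f} GiB")   # 20k context
print(f"{kv_cache_gib(128_000):.2f} GiB")  # 128k context
```

With these placeholder numbers, 128k context costs over six times the cache memory of 20k, which is why the same Q4 weights may or may not leave room for it on a given card.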
Lorelabbestia@reddit
I mean, that's a \~35% increase in parameter count on Qwen vs Gemma. Comparing Gemma 26B vs Qwen 35B would be like comparing Gemma 26B vs gpt-oss-20b.
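For what it's worth, the arithmetic behind that ~35% figure, using total parameter counts (note the active-parameter counts, a3b vs a4b, actually go the other way):

```python
# Total parameters: Qwen 35B vs Gemma 26B
increase = (35 - 26) / 26
print(f"{increase:.0%}")  # -> 35%
```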
Anyways thanks for the comparison!
Sadman782@reddit
Gemma is more capable than you think, even at that size. It is not tuned to please aesthetically by default. See the image: it is the same Gemma 4 26B. See the difference.
Try this system prompt:
**ROLE:** Elite Frontend Coder
Architect & UI/UX Visionary.
**MANDATE:** Generate 100% COMPLETE, production-ready code. ZERO placeholders, `// TODO`s, or truncated logic. Write every single line required for a fully functional product, regardless of length.
**CREATIVITY & AESTHETICS [MAXIMUM PRIORITY]:**
* **Award-Winning UI:** Do not build basic layouts. Engineer jaw-dropping, premium interfaces using modern design systems.
* **Rich Interactions:** Implement fluid animations, micro-interactions, sophisticated color palettes, complex gradients/shadows (e.g., glassmorphism, neumorphism where appropriate), and flawless responsive breakpoints.
* **Creative Autonomy:** If a request is ambiguous, take full creative control. Do not ask for clarification; immediately design and build the most visually stunning, highly-polished assumption.
sid351@reddit
I'm curious to see a side by side of Gemma and Qwen with this system prompt, if anyone is up for testing it, please.
Imaginary-Unit-3267@reddit
Why not both? I'd like to see someone try having Gemma and Qwen alternate tweaking the same code base. Maybe each one will recognize and fix the other's distinctive design flaws, making something better than either one by itself.
Sabin_Stargem@reddit
I am looking forward to trying the Qwen 3.6 122b. The possibility of recreating old games from my childhood is getting closer. Hopefully, the 122b can offer suggestions on how to get the AI to go through the original files, then recreate most of the contents in C#. Stars!, Castle of the Winds, and Quenzar's Caverns could all use some porting from Windows 3.1, methinks.
qwen_next_gguf_when@reddit
Don't compare and just be happy 😊
Cool-Chemical-5629@reddit
In every life we have some trouble
But when you worry you make it double
Don't worry
Be happy, don't worry, be happy now...
the__storm@reddit
I'm beginning to see why vibe-coded websites have so many gradients and inverse drop-shadows and emojis.
Cool-Chemical-5629@reddit
Please stop cherry-picking, because two (and more) can play this game, and I assure you I have prompts NEITHER of these two models can handle perfectly, BUT Gemma 4 26B A4B handles them better than Qwen. The only reason I did not post my results here to show where Qwen fails horribly while Gemma does a decent job is that I want the Qwen team to succeed by figuring out the weaknesses of their models themselves, and trust me, there are many. I'm not saying their models are bad, but all praise and no critique is not the way to improvement.
Porespellar@reddit
No Snake? 🐍 That’s prompt-2-game 101 bruh!
https://i.redd.it/2ms6srdhpewg1.gif
seppe0815@reddit
cool story bro 1
Mundane_Ad8936@reddit
Well, it's a different class of models with different quantization, so I'm not exactly surprised to see different levels of performance.
Don't underestimate how big a number 9 billion is.
The fact that both were able to create working code at Q4 is impressive.
CryptographerLow7817@reddit
Context size?
LocalAI_Amateur@reddit (OP)
Only 20k for these examples. It can go higher, but these are one-shot tests.
BigYoSpeck@reddit
I feel like Qwen models even going back to 3-coder have always been good at 'flair'
It always made 'aesthetic' pages with little design flairs. Now if you're asking for those things, or are happy with it taking the initiative, that's great, but it doesn't necessarily mean it beats other models' ability to follow instructions and solve the actual problems in what you're using them for.
If you put something like Claude's frontend design skill in other models, they begin delivering more than bog-standard basic designs. Admittedly, Qwen goes up an even further notch though.
If you want to genuinely test their capabilities against one another, don't give them a generic challenge like building tetris and then judge them on the flair they weren't asked to add. Ask them for it but with a twist that wasn't going to exist in the training set. Get them to change something about the core game mechanics and see how well they adapt
Better yet, don't ask for tetris or any other task by name, describe what they should build and see which adheres best
LocalAI_Amateur@reddit (OP)
Very valid point. It's about doing things they haven't already done a million times before. Maybe my opinions will change after more real world tasks.
brycesub@reddit
Can you share your llama-server settings for the Qwen3.6 model? I have 16gb of VRAM and 32gb of system ram and am having a hard time w/ OOM.
LocalAI_Amateur@reddit (OP)
I'm using LM Studio. If you want llama-server settings, you really need to check out this thread. https://www.reddit.com/r/LocalLLaMA/comments/1sor55y/rtx_5070_ti_9800x3d_running_qwen3635ba3b_at_79_ts/
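For OOM problems on a 16GB card, one common llama.cpp approach is to keep all layers on the GPU but push some MoE expert weights to system RAM. A sketch only; the model path and the exact numbers are illustrative, so raise `--n-cpu-moe` until the OOM goes away:

```shell
# Illustrative split for 16 GB VRAM / 32 GB RAM (tune the numbers yourself):
# -ngl 999       : offload all layers to the GPU
# --n-cpu-moe 12 : keep the MoE expert weights of the first 12 layers in system RAM
# -c 20480       : ~20k context, matching the OP's tests
llama-server -m Qwen3.6-35B-A3B-IQ4_XS.gguf -ngl 999 --n-cpu-moe 12 -c 20480
```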
JuniorDeveloper73@reddit
Qwen3.6 shines with Hermes
rawdikrik@reddit
IQ4_XS is tight, no? Notice any issues?
LocalAI_Amateur@reddit (OP)
It's awesome, at least compared to the LM Studio GGUFs. I don't think I use it enough to claim absolutely no issues, but it's been very functional and fast on 16GB VRAM.
AesSedai has some great compressions. Too bad they tend to only focus on the big models; they didn't do Gemma 4 26b-a4b, for example.
Fabulous_Fact_606@reddit
Agree. Qwen3.6-35B is impressive at game design. It one-shot this Frogger game layout; then it took a few tweaks to get the game mechanics right.
Fabulous_Fact_606@reddit
Then I one-shot the upload to my WireGuard VPS and installed it in Docker to host it on the internet: Frogger — 10 Levels.
jacek2023@reddit
are games functional?
LocalAI_Amateur@reddit (OP)
Totally. I was surprised how fully playable the Qwen3.6 ones are. It's really just a one-shot prompt; I didn't think there was a need to share it. People can one-shot it themselves.