Comparing Qwen3.5 vs Gemma4 for Local Agentic Coding

Posted by garg-aayush@reddit | LocalLLaMA | View on Reddit | 95 comments

Gemma4 was released by Google on April 2nd earlier this week, and I wanted to see how it performs against Qwen3.5 for local agentic coding. This post is my notes from benchmarking the two model families. I ran two types of tests: raw generation-speed benchmarks with llama-bench, and single-shot agentic coding tasks in a llama.cpp + OpenCode setup.

My pick is Qwen3.5-27B, which is still the best model for local agentic coding on a 24GB card (RTX 3090/4090). It is reliable and efficient, produces the cleanest code, and fits comfortably on a 4090.

Generation speeds based on llama-bench

| Model | Architecture | Generation (tokens/s) |
|---|---|---|
| Qwen3.5-35B-A3B | MoE | 165.84 |
| Gemma4-26B-A4B | MoE | 164.38 |
| Qwen3.5-27B | Dense | 45.88 |
| Gemma4-31B | Dense | 44.42 |
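For anyone who wants to reproduce the numbers, a typical llama-bench invocation looks like this. The model filename is a placeholder, and `-p`/`-n`/`-ngl` values are the ones I'd assume for a quick throughput check, not necessarily the exact settings behind the table above:

```shell
# Measure prompt-processing (-p) and token-generation (-n) throughput
# for a local GGUF model, offloading all layers to the GPU (-ngl 99).
# Replace the model path with your own quantized file.
llama-bench -m qwen3.5-27b-q4_k_m.gguf -p 512 -n 128 -ngl 99
```

llama-bench prints a table with tokens/s for each test, which is where per-model generation numbers like the ones above come from.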

Single-shot agentic coding tasks

I tested two prompts (a simple httpx script and a more complex Gemini image-generation workflow with TDD) where the model has to figure everything out on its own.
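For context, the first prompt asks for something roughly like the sketch below: a tiny httpx script that fetches a URL and reports on the response. The URL and output fields are my own illustrative assumptions, not the benchmark prompt's exact spec:

```python
# Rough shape of a correct answer to the "simple httpx script" prompt.
# The summary format and default URL are assumptions for illustration.
import json


def summarize(status_code, body):
    # Pure helper: kept free of I/O so the logic is easy to unit-test.
    return {"status": status_code, "bytes": len(body)}


def main(url="https://example.com"):
    import httpx  # imported lazily so the helper above stays stdlib-only

    resp = httpx.get(url, timeout=10.0)
    print(json.dumps(summarize(resp.status_code, resp.content)))


if __name__ == "__main__":
    main()
```

Even at this size there is room to differentiate: whether the model adds a timeout, separates logic from I/O, and handles non-200 responses is exactly what the quality comparison below picks at.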

Speed in llama.cpp + OpenCode setup

| Model | Prefill tok/s (P1) | Prefill tok/s (P2) | Gen tok/s (P1) | Gen tok/s (P2) |
|---|---|---|---|---|
| Gemma4-26B-A4B | 4,338 | 4,560 | 135.5 | 134.4 |
| Qwen3.5-35B-A3B | 3,179 | 3,056 | 136.7 | 132.3 |
| Gemma4-31B | 1,466 | 1,357 | 37.7 | 35.2 |
| Qwen3.5-27B | 2,474 | 2,188 | 44.9 | 44.6 |
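The serving side of this setup is a plain llama-server instance that OpenCode talks to over the OpenAI-compatible endpoint. A command along these lines is what I'd assume (model path, context size, and port are placeholders, not my exact config):

```shell
# Serve a local GGUF model for OpenCode: full GPU offload (-ngl 99),
# a context window large enough for agentic tool loops (-c), and a
# local port the coding agent can point at. Adjust paths/values to taste.
llama-server -m qwen3.5-27b-q4_k_m.gguf -ngl 99 -c 32768 --port 8080
```

OpenCode is then configured with `http://localhost:8080` as the base URL of a local provider.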

Generated Code Quality on complex prompt

| Aspect | Gemma4-26B-A4B | Gemma4-31B | Qwen3.5-35B-A3B | Qwen3.5-27B |
|---|---|---|---|---|
| Structure | 2 files, basic separation | 3 files, clean separation | Class-based with helpers, cleanest design | 3 files + dead `main.py` stub |
| Error handling | Minimal, no API error handling | Poor, no try/except around API | Adequate but no batch error recovery | Weak, silent failures |
| TDD | Placeholder test, no real TDD | One integration test, superficial | Integration tests only, claimed but not real | Integration tests only, claimed but not real |
| Cleanliness | Acceptable, concise | Good, readable, concise | Good structure but unused `base64` import | Good docstrings, type hints, `pathlib` usage |
| Critical issues | Broken summary, no `uv run` setup | New client per API call | Hardcoded API key in tests, wrong model | Dead `main.py`, new client per call |
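To make the "new client per API call" issue concrete, here is a minimal sketch of the anti-pattern versus the fix. The `Client` class below is a hypothetical stand-in (real code would reuse e.g. an `httpx.Client`); it only counts instantiations so the difference is visible:

```python
# Stand-in for an HTTP/API client; real code would use e.g. httpx.Client,
# which keeps a connection pool that is wasted if recreated per call.
class Client:
    instances = 0

    def __init__(self):
        Client.instances += 1  # track how many clients get created

    def post(self, path, json):
        return {"ok": True}  # dummy response for illustration


def generate_bad(prompts):
    # Anti-pattern flagged in the table: a fresh client (and connection
    # pool, TLS handshake, etc.) for every single API call.
    return [Client().post("/generate", json={"prompt": p}) for p in prompts]


def generate_good(prompts):
    # Fix: create one client up front and reuse it across all calls.
    client = Client()
    return [client.post("/generate", json={"prompt": p}) for p in prompts]
```

With three prompts, `generate_bad` creates three clients while `generate_good` creates one; with a real HTTP client that difference shows up directly in latency.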

Key Takeaways

You can find the detailed analysis notes here: https://aayushgarg.dev/posts/2026-04-05-qwen35-vs-gemma4/index.html

Happy to discuss and hear about other folks' experiences too.