Gemma-4-E2B-IT seems to be as good as or better than Qwen3.5-4B while having massively shorter reasoning times on average

Posted by ZootAllures9111@reddit | LocalLLaMA | 14 comments

Doct0r0710@reddit

Similar experience over here. That said, since my use case doesn't require thinking, I'm still falling back to Qwen3 4B 2507. For social media feed aggregation and categorization, I like its results more than Gemma E2B or E4B.

Final-Frosting7742@reddit

Why use Qwen3? Just disable thinking on Qwen3.5.

Doct0r0710@reddit

It's nowhere near the same... "just"...

audioen@reddit

Your prompting is slightly illogical. Here's what I got:

Maybe try to write a clearer request?

adel_b@reddit

IIRC E2B is a 4-bit quant of an 8B or something; you need to double-check

InnonCoding@reddit

E4B is 8B, E2B is 4B.

Final_Ad_7431@reddit

You have to use a better frontend and just have good prompting. I've literally never experienced this multi-minute thought thing on Qwen3.5; in Open WebUI, in Hermes, it thinks about the same as other models.

HugoCortell@reddit

Just about any model has shorter reasoning than Qwen3.5; its reasoning step is a monstrosity. Not surprised the same level of quality can be achieved while cutting most of that fat.

EndlessZone123@reddit

I hope you are not drawing a conclusion from just a translation test, because Gemma/Google models have typically been the best at it outside of Chinese.

DeepOrangeSky@reddit

Are you using a GGUF? And if so, which one? Have you had errors loading their E4B model (not E2B)? I tried to test the E4B model, but it gives some "failed to load" error message in LM Studio. Lots of people on here seem to be able to run their models, though, so I'm not sure if it is just specific quants or models or what.

FamousFlight7149@reddit

You should download the GGUFs for Gemma 4 from the official Google DeepMind site: https://deepmind.google/models/gemma/gemma-4/#download

ZootAllures9111@reddit (OP)

The Unsloth one didn't work, dunno why. Grabbed the lmstudio-community one instead, and that works fine.

DeepOrangeSky@reddit

Alright thx

Confusion_Senior@reddit

Qwen Opus finetunes fix this