Gemma-4-E2B-IT seems to be as good or better than Qwen3.5-4B while having massively shorter reasoning times on average
Posted by ZootAllures9111@reddit | LocalLLaMA | View on Reddit | 14 comments
Doct0r0710@reddit
Similar experience over here. Although since my use case doesn't require thinking I'm still falling back to Qwen3 4b 2507. For social media feed aggregation and categorization i like its results more than Gemma E2B or E4B.
Final-Frosting7742@reddit
Why use Qwen3? Just disable thinking on Qwen3.5.
Doct0r0710@reddit
It's nowhere near the same... "just"...
audioen@reddit
Your prompting is slightly illogical. Here's what I got:
Maybe try to write a more clear request?
adel_b@reddit
iIRC E2B is 4bit quant of 8b or something, you need to double check
InnonCoding@reddit
e4b is 8b, e2b is 4b.
Final_Ad_7431@reddit
you have to use a better frontend and just have good prompting, ive literalyl never experienced this multi minute thought thing on qwen3.5, in openwebui, in hermes, it thinks like, the same as other models
HugoCortell@reddit
Just about any model has shorter reasoning than Q3.5, its reasoning step is a monstrosity. Not surprised the same level of quality can be achieved while cutting most of that fat.
EndlessZone123@reddit
I hope you are not coming to a conclusion from just a translation test because gemma/google models have typically been the best at it outside of Chinese.
DeepOrangeSky@reddit
Are you using a GGUF? And if so, which one? Have you had errors to load their E4B model (not E2B)? I tried to test the E4b model but it gives some "failed to load" error message in LM Studio. But lots of people on here seem to be able to run their models, so, not sure if it is just specific quants or models or what
FamousFlight7149@reddit
You should download the GGUFs for Gemma 4 from the official Google DeepMind site: https://deepmind.google/models/gemma/gemma-4/#download
ZootAllures9111@reddit (OP)
the Unsloth one didn't work, dunno why, grabbed the lmstudio-community one instead, and that works fine.
DeepOrangeSky@reddit
Alright thx
Confusion_Senior@reddit
qwen opus finetunes fixes this