Gemma 4 for Mac 16GB

Posted by bachlac2002@reddit | LocalLLaMA | View on Reddit | 7 comments

Hi guys,

I'm fairly new to this Local LLaMA stuff but I want to run one on my Mac mini M4 16GB. I have been digging around and manage to find 2 suitable models. Have anyone tried it or anyone have a better model for this specs?

https://ollama.com/batiai/gemma4-e4b

https://www.reddit.com/r/LocalLLaMA/comments/1scjoox/gemma4_26b_a4b_runs_easily_on_16gb_macs/

Thank you!

[-]

mrskeptical00@reddit

Gemma4:e4b runs great on M4 MacMini 16GB.

MacMini M4 16GB

total duration: 23.153652625s load duration: 148.027958ms prompt eval count: 23 token(s) prompt eval duration: 386.053042ms prompt eval rate: 59.58 tokens/s eval count: 653 token(s) eval duration: 22.398108786s eval rate: 29.15 tokens/s

MacBook Air M5 16GB

total duration: 17.521039958s load duration: 171.784625ms prompt eval count: 23 token(s) prompt eval duration: 589.334375ms prompt eval rate: 39.03 tokens/s eval count: 575 token(s) eval duration: 16.483901037s eval rate: 34.88 tokens/s

[-]

Fuzzy-Layer9967@reddit

Gemma 4 is cool but 26B on 16GB is gonna be rough, you'll get a ton of CPU offloading and it'll feel sluggish. The 12B fits way better on your setup.

Also worth trying Ministral 3 8B, it runs super smooth on Apple Silicon and punches above its weight for an 8B. Vision support too if you ever need it. Just ollama pull ministral-3:8b and you're good.

If you want something crazy fast for quick stuff, look at Gemma 3n E4B too, it's Google's edge model so it barely uses any RAM, but honestly with 16GB you can afford to go bigger.

The general rule on 16GB: stay in the 8-14B range and everything fits in memory, that's where the magic happens. Once you start spilling to CPU it gets painful fast.

[-]

Gemma 4 for Mac 16GB

mrskeptical00@reddit

MacMini M4 16GB

MacBook Air M5 16GB

Fuzzy-Layer9967@reddit

Practical-Collar3063@reddit

Fuzzy-Layer9967@reddit

totonn87@reddit

Status_Record_1839@reddit

Safe_Sky7358@reddit