Questions about running Gemma 4 on Apple Silicon

Posted by TaylorHu@reddit | LocalLLaMA | View on Reddit | 14 comments

Hello all,

Just picked up a used Mac Studio, M1 Ultra, 64 GB. Pretty new to running local models. I wanted to play around with Gemma 4 31B through Ollama, but I'm running into some trouble. When I load it, my memory usage jumps to ~53 GB at idle, and if I try to interact with the model at all, memory peaks and Ollama crashes.

According to this, it should only take ~20 GB of memory, so I should have plenty of room: https://ollama.com/library/gemma4

Now, Google's model card does list it at ~58 GB at the full 16-bit precision: https://ai.google.dev/gemma/docs/core

So neither of those lines up exactly with what I am seeing, though the "official" model card does seem closer. Why the discrepancy, and is there something, in general, I should know about running these kinds of models on Ollama?
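
For what it's worth, here's my rough napkin math for the weights alone. This is just a sketch assuming a dense 31B-parameter model, ignoring KV cache, context buffers, and any Ollama/Metal runtime overhead, and the quant names are just the common GGUF ones:

```python
# Back-of-envelope weight memory for a dense 31B-parameter model.
# Ignores KV cache, context buffers, and runtime overhead.
PARAMS = 31e9

for name, bytes_per_param in [
    ("fp16/bf16", 2.0),   # full 16-bit weights
    ("q8_0", 1.0),        # 8-bit quantization
    ("q4_K_M", 0.5),      # ~4-bit quantization (roughly 0.5 bytes/param)
]:
    gb = PARAMS * bytes_per_param / 1024**3
    print(f"{name:>9}: ~{gb:.0f} GB")

# fp16/bf16: ~58 GB  -- in line with Google's model card
#      q8_0: ~29 GB
#    q4_K_M: ~14 GB  -- in the ballpark of the ~20 GB Ollama lists
```

None of those obviously explains the ~53 GB I'm seeing at idle, which is part of why I'm confused.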