Misunderstanding memory usage - 11.68 GB quantized model takes up 22 GB of RAM?
Posted by NotARedditUser3@reddit | LocalLLaMA | 17 comments
I'm running unsloth/qwen3.6-35b-a3b at IQ2_XXS. It's 11.68 GB on disk, and when I load it in LM Studio, it claims it will use (or is using) about 13 GB of RAM.
In Task Manager, my total memory usage goes from 7 GB to 30 GB or more.
The individual process shows only ~15.5 GB in Task Manager, but the system-wide jump is literally the increase I see when I load the model, and it goes back down when I eject it in LM Studio.
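To put those readings side by side, here's a quick sketch of the arithmetic in Python. The figures are the approximate values quoted in this post (30 GB is a lower bound, since usage was "30 GB or more"), and the variable names are just for illustration.

```python
# Approximate readings from this post, all in GB.
model_file_gb = 11.68     # size of the quantized model on disk
process_gb = 15.5         # LM Studio process as shown in Task Manager
system_before_gb = 7.0    # total memory in use before loading the model
system_after_gb = 30.0    # total memory in use after loading (a lower bound)

# How much total system memory usage grew when the model was loaded.
system_increase_gb = system_after_gb - system_before_gb

# Portion of that growth not attributed to the LM Studio process itself.
unattributed_gb = system_increase_gb - process_gb

# How the system-wide growth compares with the file size on disk.
ratio_to_file = system_increase_gb / model_file_gb

print(f"System-wide increase: {system_increase_gb:.2f} GB")
print(f"Not attributed to the process: {unattributed_gb:.2f} GB")
print(f"Increase relative to file size: {ratio_to_file:.2f}x")
```

So the system-wide increase is roughly twice the file size, and a good chunk of it doesn't show up under the process at all.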
What's up with this? I've been struggling to load this model for a bit now, thinking that quantized versions should need less RAM, but I'm running out.
I'm running on CPU and running out of system RAM. I can get ~20 tokens per second, but I literally have no system memory left to keep anything else open, so no apps on this machine can actually make use of the model.
(This happens with both the MTP and non-MTP versions of this model, btw.)
Am I missing something? I had figured the RAM amount would always be roughly the disk size, but this is quite a bit off.
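For context on the "RAM is roughly disk size" assumption: the file on disk only covers the weights, while a runtime typically also allocates a KV cache that grows with context length. Here's a rough back-of-envelope sketch in Python, assuming a plain transformer with standard attention and a 16-bit cache. The layer count, KV head count, head dimension, and context length below are hypothetical placeholders, not the real values for this model, and `kv_cache_bytes` is just a helper name made up for this example.

```python
GIB = 1024 ** 3

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    """Estimate KV cache size for standard attention.

    Each layer stores one K and one V tensor (hence the factor of 2),
    each of shape n_ctx x n_kv_heads x head_dim.
    """
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem

# Hypothetical architecture values, for illustration only.
cache = kv_cache_bytes(n_layers=48, n_kv_heads=4, head_dim=128, n_ctx=32768)

print(f"Estimated KV cache: {cache / GIB:.2f} GiB")
```

With these made-up numbers the cache alone adds a few GiB on top of the weights, and it scales linearly with the configured context length. It wouldn't explain a full doubling on its own, but it's one reason the loaded size can exceed the file size.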
17 Comments
nickless07@reddit
NotARedditUser3@reddit (OP)
nickless07@reddit
NotARedditUser3@reddit (OP)
nickless07@reddit
Wrong_Mushroom_7350@reddit
NotARedditUser3@reddit (OP)
Badger-Purple@reddit
NotARedditUser3@reddit (OP)
Badger-Purple@reddit
NotARedditUser3@reddit (OP)
Snoo_81913@reddit
NotARedditUser3@reddit (OP)
Massive-Question-550@reddit
NotARedditUser3@reddit (OP)
Happy_Brilliant7827@reddit
NotARedditUser3@reddit (OP)