Which MacBook configuration to buy?
Posted by Ayuzh@reddit | LocalLLaMA | 9 comments
Hi everyone,
I'm planning to buy a laptop for personal use.
I'm very much inclined towards experimenting with local LLMs along with other agentic AI projects.
I'm a backend engineer with 5+ years of experience but not much with AI models and stuff.
I'm torn on the configuration.
My worry is that if I buy a lower configuration now, I might need a better one 1-2 years down the line, which would be difficult since I'm already putting in money now.
Is it wise to max out the configuration now (M5 Max, 128 GB) so that I don't have to think about it for years?
I posted this in LocalLLM as well, got some good responses. I wanted to get opinions from people here as well.
Zestyclose_Yak_3174@reddit
I would go with a Max CPU and at least 64GB. I learned that the hard way a few years back
Ayuzh@reddit (OP)
please tell me more about it
tony__Y@reddit
If you just want a local chatbot, 64GB is more than enough. For any serious long-context work, you'll want 128GB. A little ~30B model with 300K context has crashed my 128GB Mac multiple times, whereas a ~120B model with 4K context always runs fine.
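For intuition on why a small model with huge context can out-eat a big model with short context: KV-cache size grows linearly with context length. A rough back-of-envelope sketch (the layer/head numbers below are hypothetical, loosely in the range of a ~30B-class dense-attention model with GQA, not any specific release):

```python
def kv_cache_gib(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    """Rough KV-cache size: 2 tensors (K and V) x layers x kv_heads x head_dim
    x context length x bytes per element (2 for fp16)."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem / 1024**3

# Hypothetical ~30B-class config: 48 layers, 8 KV heads (GQA), head_dim 128
print(f"300K ctx, fp16 KV: {kv_cache_gib(48, 8, 128, 300_000):.1f} GiB")
print(f"  4K ctx, fp16 KV: {kv_cache_gib(48, 8, 128, 4_096):.2f} GiB")
```

Under those assumed dimensions, 300K of fp16 KV cache lands around 55 GiB on top of the weights, while 4K of context for even a much larger model stays under 1 GiB, which is consistent with the crash pattern described above.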
No_Algae1753@reddit
Skill issue
xeeff@reddit
you're telling me I can run 35b a3b with 200k context on my 16gb VRAM but you're struggling with 128gb?
tony__Y@reddit
Interesting... what's your setup? I just loaded a 35B-A3B Q4 model with 256k context and it's already using 24GB VRAM doing nothing.
xeeff@reddit
inference: https://github.com/atomicmilkshake/llama-cpp-turboquant (ROCm but supports CUDA as well)
KV cache quantised to turbo3. was surprised to see how well it actually behaved but turbo4 is better if you're wary of degradation. I personally wouldn't use the triattention feature since I like my agents with long context and NIAH tests start becoming an issue past 32k
model: https://huggingface.co/byteshape/Qwen3.5-35B-A3B-GGUF (GPU-3, 2.89 bpw fits well for me)
aside from -ctk/-ctv turbo3, other flags aren't anything special. let me know how it goes if you end up trying it out
shbong@reddit
MacBooks are great because they share RAM with the GPU, so technically most of your RAM can serve as VRAM (by default macOS reserves a slice for the system, so the GPU gets roughly 70-75% of total RAM).
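One caveat worth knowing (the exact default percentage varies by machine and OS version, so treat the numbers here as illustrative): macOS caps the GPU's share of unified memory, and on Apple Silicon that cap can be raised with a sysctl, which people commonly do for large local models. A sketch for a 128GB machine:

```shell
# Check the current GPU wired-memory limit (0 means the OS default, ~70-75% of RAM)
sysctl iogpu.wmem_limit_mb

# Raise it, e.g. to ~112GB on a 128GB machine; resets on reboot.
# Leave enough headroom for the OS or the machine can become unstable.
sudo sysctl iogpu.wmem_limit_mb=114688
```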
Willybecher@reddit
Minimum you want a Max chip with 400GB/s memory bandwidth and 64GB RAM if you go the agent route. For just local chatting, an M1 with 16/32GB is sufficient. If money is available, a Studio Ultra with 96GB or bigger - agentic work with bigger models should work.
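To ground why the 400GB/s bandwidth figure matters: token generation is usually memory-bound, so an upper bound on decode speed is roughly bandwidth divided by the bytes read per token (every active weight is touched once per token; KV-cache reads ignored). A sketch with illustrative numbers, not benchmarks of any specific machine:

```python
def decode_tok_s(bandwidth_gb_s, active_params_b, bits_per_weight):
    """Upper-bound tokens/sec ~= memory bandwidth / bytes read per token.
    active_params_b: parameters (in billions) touched per token - the full
    model for dense, only the routed experts for MoE."""
    bytes_per_token = active_params_b * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token

# Illustrative: a dense 70B at 4-bit vs a MoE with ~3B active params at 4-bit
print(f"70B dense, Q4 @ 400GB/s:    ~{decode_tok_s(400, 70, 4):.0f} tok/s")
print(f"3B-active MoE, Q4 @ 400GB/s: ~{decode_tok_s(400, 3, 4):.0f} tok/s")
```

This is why sparse MoE models feel fast even on bandwidth-limited laptops, and why a dense 70B is sluggish on the same hardware regardless of how much RAM you buy.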