Which MacBook configuration to buy?
Posted by Ayuzh@reddit | LocalLLaMA | 9 comments
Hi everyone,
I'm planning to buy a laptop for personal use.
I'm very much inclined towards experimenting with local LLMs along with other agentic AI projects.
I'm a backend engineer with 5+ years of experience but not much with AI models and stuff.
I'm torn on the configuration.
My worry is that if I buy a lower configuration now, I might need a better one 1-2 years down the line, which would be difficult since I'm already putting in money now.
Is it wise to max out the configuration now (M5 Max, 128 GB) so that I don't have to think about it for years?
I posted this in LocalLLM as well, got some good responses. I wanted to get opinions from people here as well.
Zestyclose_Yak_3174@reddit
I would go with a Max CPU and at least 64GB. I learned that the hard way a few years back
Ayuzh@reddit (OP)
please tell me more about it
tony__Y@reddit
If you just want a local chatbot, 64GB is more than enough. For any serious long-context work, you'll want 128GB. A little ~30B model with 300K context has crashed my 128GB Mac multiple times, whereas a ~120B model with 4K context always runs fine.
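For intuition on why a small model with huge context can out-eat a big model with short context: KV-cache size grows linearly with context length. A rough back-of-envelope sketch (the layer/head numbers below are hypothetical, loosely in the range of a ~30B-class dense-attention model with GQA, not any specific release):

```python
def kv_cache_gib(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    """Rough KV-cache size: 2 tensors (K and V) x layers x kv_heads x head_dim
    x context length x bytes per element (2 for fp16)."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem / 1024**3

# Hypothetical ~30B-class config: 48 layers, 8 KV heads (GQA), head_dim 128
print(f"300K ctx, fp16 KV: {kv_cache_gib(48, 8, 128, 300_000):.1f} GiB")
print(f"  4K ctx, fp16 KV: {kv_cache_gib(48, 8, 128, 4_096):.2f} GiB")
```

Under those assumed dimensions, 300K of fp16 KV cache lands around 55 GiB on top of the weights, while 4K of context for even a much larger model stays under 1 GiB, which is consistent with the crash pattern described above.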
No_Algae1753@reddit
Skill issue
xeeff@reddit
you're telling me I can run 35b a3b with 200k context on my 16gb VRAM but you're struggling with 128gb?
tony__Y@reddit
Interesting... what's your setup? I just loaded a 35B-A3B Q4 model with 256k context and it's already using 24GB VRAM doing nothing.
xeeff@reddit
inference: https://github.com/atomicmilkshake/llama-cpp-turboquant (ROCm but supports CUDA as well)
KV cache quantised to turbo3. was surprised to see how well it actually behaved but turbo4 is better if you're wary of degradation. I personally wouldn't use the triattention feature since I like my agents with long context and NIAH tests start becoming an issue past 32k
model: https://huggingface.co/byteshape/Qwen3.5-35B-A3B-GGUF (GPU-3, 2.89 bpw fits well for me)
aside from -ctk/-ctv turbo3, other flags aren't anything special. let me know how it goes if you end up trying it out
shbong@reddit
MacBooks are great because they share RAM with the GPU, so technically most of your RAM can serve as VRAM (by default macOS reserves a slice for the system, so the GPU gets roughly 70-75% of total RAM).
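One caveat worth knowing (the exact default percentage varies by machine and OS version, so treat the numbers here as illustrative): macOS caps the GPU's share of unified memory, and on Apple Silicon that cap can be raised with a sysctl, which people commonly do for large local models. A sketch for a 128GB machine:

```shell
# Check the current GPU wired-memory limit (0 means the OS default, ~70-75% of RAM)
sysctl iogpu.wmem_limit_mb

# Raise it, e.g. to ~112GB on a 128GB machine; resets on reboot.
# Leave enough headroom for the OS or the machine can become unstable.
sudo sysctl iogpu.wmem_limit_mb=114688
```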
Willybecher@reddit
Minimum you want a Max chip with 400GB/s memory bandwidth and 64GB RAM if you go the agent route. For just local chatting, an M1 with 16/32GB is sufficient. If money is available, a Studio Ultra with 96GB or bigger - agentic work with bigger models should work.
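To ground why the 400GB/s bandwidth figure matters: token generation is usually memory-bound, so an upper bound on decode speed is roughly bandwidth divided by the bytes read per token (every active weight is touched once per token; KV-cache reads ignored). A sketch with illustrative numbers, not benchmarks of any specific machine:

```python
def decode_tok_s(bandwidth_gb_s, active_params_b, bits_per_weight):
    """Upper-bound tokens/sec ~= memory bandwidth / bytes read per token.
    active_params_b: parameters (in billions) touched per token - the full
    model for dense, only the routed experts for MoE."""
    bytes_per_token = active_params_b * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token

# Illustrative: a dense 70B at 4-bit vs a MoE with ~3B active params at 4-bit
print(f"70B dense, Q4 @ 400GB/s:    ~{decode_tok_s(400, 70, 4):.0f} tok/s")
print(f"3B-active MoE, Q4 @ 400GB/s: ~{decode_tok_s(400, 3, 4):.0f} tok/s")
```

This is why sparse MoE models feel fast even on bandwidth-limited laptops, and why a dense 70B is sluggish on the same hardware regardless of how much RAM you buy.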