Would a MacBook M5 16/24/32GB be an upgrade, complement, or waste next to my RTX 4060 laptop?
Posted by heitortp0@reddit | LocalLLaMA | 38 comments
Hi everyone,
I’m trying to understand whether buying a future/possible MacBook M5 with 16GB, 24GB, or 32GB unified memory would make sense for my local AI workflow, or whether it would mostly be a waste given my current setup.
My main machine is:
Acer Nitro laptop
RTX 4060 Laptop GPU, 8GB VRAM
Intel i7-13620H
32GB RAM
Around 1.5TB SSD
Windows 11, with WSL2/Linux available
My current/desired local AI use cases are:
Running local LLMs through LM Studio, Ollama, llama.cpp, etc.
RAG over legal/jurisprudence documents
Transcription with faster-whisper
Document processing and summarization
Possible local agents / automation
Maybe voice assistant experiments
General AI tinkering without relying entirely on cloud APIs
I understand that the RTX 4060’s 8GB VRAM is the main limitation for larger models, but it is still a real NVIDIA GPU and works well with many local AI tools. On the other hand, Apple Silicon has unified memory, great efficiency, battery life, and seems attractive for running larger quantized models that do not fit in 8GB VRAM.
My question is: would an M5 MacBook with 16GB, 24GB, or 32GB unified memory actually improve my local LLM experience in a meaningful way?
More specifically:
- Would a 16GB M5 be pointless for local LLMs compared to my RTX 4060 laptop?
- Is 24GB unified memory enough to make the MacBook a useful complement?
- Is 32GB the minimum where Apple Silicon starts to make real sense for local LLMs?
- Would the MacBook be better as a secondary portable/efficient machine rather than a replacement?
- For my use case, would I be better off spending the money on a desktop GPU with more VRAM instead?
- Are there workflows where the MacBook + RTX 4060 laptop combination makes sense, or would I just be duplicating capabilities?
I’m not trying to train large models. I mostly care about inference, RAG, document workflows, transcription, and experimentation.
I’d especially appreciate opinions from people who have both an NVIDIA 8GB VRAM laptop and an Apple Silicon Mac with 16–32GB unified memory.
Is the MacBook a real improvement, a nice complement, or just not worth it for this setup?
SeoFood@reddit
I’d think of the Mac as a complement, not a replacement for the 4060 laptop.
Your NVIDIA machine is still the better “I want CUDA support and maximum compatibility” box. For a lot of local AI tooling, that matters. The Mac starts making sense if you specifically value portability, battery life, quiet operation, unified memory for larger quantized models, and a smoother daily-driver experience.
For your listed use cases:
If money is tight, I wouldn’t buy a 16GB Mac mainly for local LLMs. If you want a portable complement and can stretch to 24/32GB, it becomes a lot easier to justify.
ProfessionalSpend589@reddit
Laptops are for travelling.
Do you travel a lot and spend many hours without stable/fast internet, but with electricity nearby to keep the laptop battery from draining? Then a more powerful laptop may suit you.
For everything else, probably not. Actually, I would say a strong no. Why would you want 2 batteries, 2 displays, and 2 sets of keyboard, touchpad, and speakers for running whatever type of workload you’re running now? None of those contribute tokens, yet they are an expense.
heitortp0@reddit (OP)
I have a motor disability, which makes laptops easier to use for me. Also, my idea was not to carry two laptops. The MBA would be my main machine daily, and the Nitro would work as a more "brute force" workstation.
ProfessionalSpend589@reddit
I’m not sure it’ll be good for the batteries to be plugged in 24/7. Does the Nitro laptop allow you to limit battery charge to 60% or 80%?
A mini pc + eGPU and 32GB GPU would have been an obvious suggestion last year.
heitortp0@reddit (OP)
Yeah, I limited the Nitro battery to 80%. You're absolutely right about the eGPU suggestion; it's a shame those prices have skyrocketed.
neuromacmd@reddit
For your stated mix, a base M5 is a nice complement, not an upgrade, and only at 24GB minimum, ideally 32GB. If portability and silence matter, get the 32GB and keep the 4060 for whisper and fast small-model work. If they don’t, a used 24GB desktop GPU is the better spend by a wide margin. The one config I’d actively talk you out of is the 16GB: it’s the worst of both worlds for this workload. The Mac does not necessarily give you much faster inference and, more importantly, does not buy you faster prompt processing, which is underappreciated everywhere. The only way to get that is a fast video card (ideally Nvidia), which is why the 3090 is still so popular. What the Mac does get you is a very efficient computer with great battery life that can do some LLM inference on the side. I have tried all sorts of hardware for local inference, and if I had to start again I would go the Nvidia route from the get-go if that is your primary intended use. This is coming from someone whose laptop is a MacBook Pro.
heitortp0@reddit (OP)
Maybe replacing my Nitro with a PC with a better GPU would be the best alternative, but I'd still like to test the MBA in the workflow.
neuromacmd@reddit
Not sure where you are geographically. Some stores allow you to try laptops for a while. Local AI on a laptop is useful for simple chat and things like whisper and autocompletion, but real agentic and multistep workflows are limited by memory bandwidth. Take a look at the Strix Halo laptops. They have higher bandwidth than the non-Pro MacBook Air, and the 64GB version is around the same price. I am playing with an Asus ProArt PX13 and I am loving the power and form factor. The advantage is x86 compatibility, including Linux, but you are giving up battery life and efficiency.
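The bandwidth point can be made concrete with a common rule of thumb: when generating, a dense model has to read roughly all of its weights from memory for every token, so bandwidth divided by model size gives an upper bound on tokens per second. The sketch below is only that rule of thumb. The machine labels, bandwidth figures, and model size are illustrative assumptions, not measurements, and real throughput is lower (and differs for mixture-of-experts models).

```python
def decode_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough upper bound on generation speed for a dense model.

    Assumes every generated token requires reading all weights once,
    so speed is capped by memory bandwidth / model size.
    """
    if bandwidth_gb_s <= 0 or model_size_gb <= 0:
        raise ValueError("bandwidth and model size must be positive")
    return bandwidth_gb_s / model_size_gb


# Hypothetical machines with made-up round bandwidth figures (GB/s).
# Replace these with the numbers from your own hardware's spec sheet.
EXAMPLE_BANDWIDTH_GB_S = {
    "low_bandwidth_laptop": 150.0,
    "mid_bandwidth_laptop": 250.0,
    "high_bandwidth_desktop_gpu": 900.0,
}

MODEL_SIZE_GB = 5.0  # hypothetical quantized model that fits on all three

for name, bandwidth in EXAMPLE_BANDWIDTH_GB_S.items():
    bound = decode_tokens_per_second(bandwidth, MODEL_SIZE_GB)
    print(f"{name}: at most ~{bound:.0f} tokens/s")
```

The bound only applies when the whole model fits in the fast memory in question; a model that spills out of 8GB of VRAM into system RAM is limited by the slower path instead.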
libregrape@reddit
heitortp0@reddit (OP)
First, thank you for such a complete answer. I was very excited about the idea of the MBA because (i) I was always a Windows user and (ii) I thought that unified memory would give me a gain over my VRAM. But I'm starting to think it's not such a good idea.
MrPecunius@reddit
Get 32GB for a regular M5. According to oMLX data, you should be able to get ~50 t/s generation and >1,000 t/s prefill with a 4-bit MLX quant:
https://omlx.ai/c/aa2ktmf
With 32GB you should have all the RAM you need to run the OS+apps+LLMs appropriate for the processor.
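To see what those two rates mean for a RAG-style request, here is a minimal sketch that adds prefill time and generation time. The rates are the ones quoted above (~50 t/s generation, 1,000 t/s prefill); the prompt and answer lengths are hypothetical, and the model ignores load time and any slowdown at long context.

```python
def response_seconds(
    prompt_tokens: int,
    output_tokens: int,
    prefill_tps: float,
    generation_tps: float,
) -> float:
    """Estimate end-to-end latency as prefill time plus generation time."""
    if prefill_tps <= 0 or generation_tps <= 0:
        raise ValueError("token rates must be positive")
    return prompt_tokens / prefill_tps + output_tokens / generation_tps


# Hypothetical request: 8,000 tokens of retrieved documents, 500-token answer.
total = response_seconds(
    prompt_tokens=8000,
    output_tokens=500,
    prefill_tps=1000.0,   # prefill rate quoted in the comment
    generation_tps=50.0,  # generation rate quoted in the comment
)
print(f"Estimated time to full answer: {total:.0f} s")
```

With long legal documents in the prompt, the prefill term grows quickly, which is why prompt processing speed matters as much as generation speed for this workload.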
heitortp0@reddit (OP)
It looks useful to me thx
xraybies@reddit
Forget anything < 32GB. Even then, the biggest problem is macOS. OOTB it will consume 6GB, and as soon as you load any apps (Chrome, OpenCode) you're hitting 12GB. So you have ~24GB usable @ <400GB/s, which is 3090 at best. You can claw back another 2-3GB by disabling everything you can in macOS... like https://github.com/rayone/machete/blob/main/disable.sh
So from your perspective it's like you have an RTX 4060 laptop where you can choose between 8-24GB VRAM. I would say totally not worth it.
My M5 Max w/ 128GB is usually at ~30GB of memory used without even loading an LLM, just Chrome, Edge, VS Code, OpenCode + skills. As soon as I load oMLX + Qwen 3.6 mxfp8 it's hot, loud, using ~90GB and much slower than my i9 1300k + 4090, except the SSD which is fast <16GB/s.
So an M5 only starts to make sense with >64GB from a usage perspective... cost is subjective.
The only aspects of the M5 which impress me are the SSD and battery life, when not running an LLM, everything else is avg, and the audio and macOS are a joke.
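A rough sketch of the memory budget arithmetic in this comment, using its own figures (32GB total, about 12GB for macOS plus apps, 2-3GB reclaimed by disabling services). The function name and the example model and cache sizes are hypothetical, and actual overhead varies with what is running.

```python
def usable_model_memory_gb(
    total_gb: float,
    os_and_apps_gb: float,
    reclaimed_gb: float = 0.0,
) -> float:
    """Memory left for model weights and context after OS and app overhead."""
    usable = total_gb - os_and_apps_gb + reclaimed_gb
    return max(usable, 0.0)


def fits(model_gb: float, context_cache_gb: float, usable_gb: float) -> bool:
    """True if the weights plus the context cache fit in the usable budget."""
    return model_gb + context_cache_gb <= usable_gb


# Figures from the comment: 32GB machine, ~12GB used by macOS plus apps.
baseline = usable_model_memory_gb(32, 12)
trimmed = usable_model_memory_gb(32, 12, reclaimed_gb=3)
print(f"Usable without tweaks: {baseline:.0f} GB")
print(f"Usable after disabling services: {trimmed:.0f} GB")

# Hypothetical model: 18GB of weights plus 4GB of context cache.
print("Fits without tweaks:", fits(18, 4, baseline))
print("Fits after tweaks:", fits(18, 4, trimmed))
```

Taken literally, these inputs give 20-23GB, slightly under the ~24GB quoted, so the real margin depends on which overhead figure applies to your setup.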
heitortp0@reddit (OP)
I'm getting a little disappointed with the idea I had about the MBA if that's the case :/
jcdoe@reddit
Apple silicon 16 gb feels like 8, because you have to run your os and web server too. FYI
heitortp0@reddit (OP)
Yeah, I've read about it, but I'm looking for the practical knowledge of this sub
jcdoe@reddit
I own an m1 mbp with 16 gb ram. It feels like 8 gb because of the overhead. I’m not speaking hypothetically, this is real experience.
Get at least 48 gb if you go Mac, you won’t regret the extra ram.
boston101@reddit
I’m on m1 8gb and running ternary models - can’t wait to upgrade
jcdoe@reddit
I upgraded my M1 Pro 16 gb to an m5 pro 48 gb recently. I liked the idea of running llms on the old machine with a web interface for personal assistant type stuff.
But it sucks at 16 gb. I can safely run models up to 10 gb before i have to worry about it going into swap space (it still happens tho).
Oh, also, Apple limits memory bandwidth for their binned chips. So his 16 gb will run slow, and he’ll be pulling down 30/40 t/s even with the model fully loaded into ram.
But don’t listen to me, he wants “practical knowledge” lmao
boston101@reddit
Ugh, I wanted to use my old Macs as servers. I wish Apple would remove the limits and let me turn off the many processes I don’t need on a server.
TBH I’m going nvidia route instead of osx upgrade.
jcdoe@reddit
You should. Nvidia and Linux are the primary platforms for local LLMs, the best bang for buck, and use standard bash commands. (I spent too much time rewriting a bash script because Macs don’t use apt-get. No no. They use “brew”, which takes slightly different command line arguments than Linux.)
I like my Mac, but I bought it as a computer. LLMs are just a bonus.
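One way to avoid rewriting setup scripts for each platform is to pick the install command at runtime. This is a minimal sketch, assuming Homebrew on macOS and a Debian/Ubuntu-style `apt-get` on Linux; the helper name and the example package are hypothetical, and package names do not always match between the two managers.

```python
import platform
from typing import List, Optional


def install_command(package: str, system: Optional[str] = None) -> List[str]:
    """Build the package install command for the given (or current) OS.

    Assumes Homebrew on macOS and apt-get on Linux (Debian/Ubuntu family).
    The command is only built here, not executed.
    """
    system = system or platform.system()
    if system == "Darwin":
        # Homebrew runs without sudo and needs no confirmation flag.
        return ["brew", "install", package]
    if system == "Linux":
        # apt-get needs root and -y to skip the interactive prompt.
        return ["sudo", "apt-get", "install", "-y", package]
    raise RuntimeError(f"no install command defined for {system!r}")


# Show both variants for a hypothetical dependency.
for target in ("Darwin", "Linux"):
    print(target, "->", " ".join(install_command("ffmpeg", target)))
```

To actually run the result, pass the list to `subprocess.run(cmd, check=True)`.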
boston101@reddit
lol I have access to cloud GPUs and am a developer, I just want to do it all in house now.
Fuck I’m old and full circle, on prem is back
heitortp0@reddit (OP)
I didn't think you were the kind of person who wouldn't know how it works in practice. What I meant was I came here to know exactly what people like you, who happen to use this exact hardware, could say about it. No need to shade like this. Thank you.
MrPecunius@reddit
48GB isn't available for M5 and would be too much for it to use effectively in any case.
48GB isn't really enough for an M5 Pro, either (source: I had an M4 Pro/48GB and now have an M5 Pro/48GB).
FineClassroom2085@reddit
If you get a 32gb M5 pro, it will outperform your PC in everything but prompt processing. Honestly if you can swing it, go 64gb, that opens full weights/context Gemma 4 and Qwen 3.6 27b which are staggeringly good for their weight class.
You probably won’t use your PC any more except for maybe stable diffusion work.
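The "64GB opens full weights" claim can be checked with simple arithmetic: weight size is roughly parameter count times bits per weight, divided by 8. The sketch below applies that to a 27B-parameter model. It ignores the context cache and runtime overhead, and real quantized files are somewhat larger than the nominal bit width suggests, so treat the results as lower bounds.

```python
def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    if params_billions <= 0 or bits_per_weight <= 0:
        raise ValueError("parameter count and bit width must be positive")
    return params_billions * bits_per_weight / 8


PARAMS_BILLIONS = 27.0  # the 27b model size mentioned in the comment

for bits in (16, 8, 4):
    size = weight_size_gb(PARAMS_BILLIONS, bits)
    print(f"{bits}-bit weights: ~{size:.1f} GB")
```

At 16 bits the weights alone come to about 54GB, which is why they are out of reach on a 32GB machine but plausible on 64GB, while a 4-bit quant at about 13.5GB leaves room on 24GB or 32GB.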
heitortp0@reddit (OP)
You mean a macbook pro?
MrPecunius@reddit
Speaking as a Macbook Pro (M5 Pro/64GB/2TB) owner, I'd get a Macbook Air for a regular M5.
FineClassroom2085@reddit
Yes
heitortp0@reddit (OP)
I'm from Brazil. Not sure if I could afford it. But what do you think about the 24GB and 32GB RAM versions?
Barbaricliberal@reddit
I have the binned M5 Pro, 48 gb of ram, and it's been great.
The 48gb binned M5 Pro (14 in) MBP seems to be a good value if price is an issue.
heitortp0@reddit (OP)
It doubles the price I'm willing to spend unfortunately
itsappleseason@reddit
What does the secondhand M1 Max market look like in your area?
Barbaricliberal@reddit
Ooof, fair enough.
Have you considered getting a MacBook Air? You almost certainly can get the same specs as the M5 MBP (including the RAM) for cheaper.
For instance, in the US it's $1500 for the MBA vs $2100 for the MBP for the 10 core M5 and 32gb of ram.
heitortp0@reddit (OP)
Yeah, the 32 MBA is the one I'm more interested in. Good choice?
Barbaricliberal@reddit
I'd say so, it's the same specs-wise for the most part as the base M5 MBP, but cheaper.
The only performance difference is that the thermals are better on the MBP, since it has a fan. But the difference isn't a big deal.
FewBasis7497@reddit
Mh, but what about thermal throttling?
I have a work M4 Max MacBook Pro with 36GB. Even at a room temperature of around 20 °C, the fans start to rev up shortly after llama.cpp starts running and do not stop until it is stopped.
Yes, M4 != M5, but LLM inference is power-hungry.
see also:
https://www.reddit.com/r/macbookair/comments/1jd2cbc/is_m4_macbook_air_32_512_good_for_aillm/?show=original
libregrape@reddit
Then what do you mean exactly? There is no just "MacBook". Do you mean MacBook Air M5? In that case, as others said it would only make real sense with 32GB ram, but that would cost you 1.7k... And you probably can construct a better rig with that.
FineClassroom2085@reddit
32gb is ok, it’s just that you’ll regret not buying the most ram you can possibly afford, especially as a daily driver.