Bloomberg: No Mac Studios until at least October
Posted by eclipsegum@reddit | LocalLLaMA | View on Reddit | 52 comments
https://9to5mac.com/2026/04/19/new-mac-studio-may-not-arrive-until-october/
What’s coming first? Deepseek v4 or the Studios that can run it?
eclipsegum@reddit (OP)
Should have bought the Mac Studio M3U 512GB two months ago. Waiting 6 months in LLM time is like Miller’s planet in Interstellar. Feels like 20 years will pass with tons of new models.
flyingbanana1234@reddit
I'm in this exact boat. I foresaw this exact situation, with current Mac Studios unavailable and the M5 Studio delayed,
but thought I'd uncharacteristically wait till June for the M5 Studio.
Might buy a DGX Spark instead at this point.
rpkarma@reddit
I love my Spark (well, an Asus GX10), but I wouldn't suggest buying it for inference. It's a lot better than it was, but SM121 is so unique that we're only just getting to the point where kernels run on it well enough to take advantage of its abilities.
Where it rocks is: learning CUDA and fine tuning bigger models. It’s a great device, I love mine. But just setting expectations!
flyingbanana1234@reddit
thank you for your advice !!! :)
TheRealMasonMac@reddit
On the bright side, you'll get better hardware for the same amount of money when things calm down.
eclipsegum@reddit (OP)
Feels like the zuck meme of peeking in the window at all the M3U 512 owners
fivetoedslothbear@reddit
I was waiting for the M5 to buy a Mac Studio 512GB.
In December 2024, I bought a 128GB M4 Max MacBook Pro on short notice for "reasons" and looking at the current situation, appreciating what I've got. Even the Mac Mini is capped at 64 GB right now.
fragment_me@reddit
Dude you’re going to love qwen 4.0 on that thing !!!
StupidScaredSquirrel@reddit
Just accept that hardware is a nightmare and run whatever sota model fits your current hardware. Best of luck
segmond@reddit
Oh well, the Blackwell Pro 6000 is looking like the option. I was waiting to decide between an M5 Studio Ultra and at least 2x Blackwell Pro 6000. The M5 Ultra is supposed to match a 4090; if that's true and it has at least 512GB, then it will be worth the wait. However, the rumor is that it will now max out at 256GB. I'm going to wait till the end of the year. If it doesn't come out, I go Blackwell Pro; if it comes out and doesn't measure up, Blackwell Pro. For now, I'll manage with my current rigs.
FullstackSensei@reddit
Oh, no... Anyway, I finally installed the fourth 3090 that had been sitting in my parts cabinet for six months into my triple-3090 rig, making it a quad-3090.
I now get a consistent 17-18 t/s TG and ~76-80 t/s PP running Qwen 3.5 397B Q4_K_XL all the way to 180k context. This is using vanilla llama.cpp; ik_llama.cpp would probably be faster if I bothered tuning parameters. Might not sound like much, but with prompt caching PP takes less than 30 seconds per request, and the whole request is done in under a minute for most requests. Power draw is ~600W from the wall during inference.
Even at today's prices, I could build it for half the price of the M3 Ultra 512GB. Doubt I'll use 5k worth of electricity over the lifetime of the machine.
segmond@reddit
You are preaching to newbies who just got into the scene recently or a few months ago. They don't understand...
droptableadventures@reddit
You're buying second hand 3090s because the hardware you want isn't decently priced brand new.
You've also got a long way to go - you'll need 18 more 3090s to hit 512GB.
FullstackSensei@reddit
No. I have 512GB of system memory, so I have a tad over 600GB total. The four 3090s take care of all the attention layers and context, while the octa-channel system memory takes care of the FFN layers. How else do you think I'm running a 400B at Q4_K_XL (245GB) plus 180k context?
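For anyone curious, a split like this can be expressed in vanilla llama.cpp with the `--override-tensor` (`-ot`) flag. This is only a sketch, not OP's actual command; the model filename, context size, and regex are illustrative assumptions:

```shell
# Sketch: offload all layers to the GPUs with -ngl, then pin the FFN
# weight tensors back to CPU/system RAM with an -ot regex override.
# Filenames and the exact pattern are assumptions, not OP's real flags.
./llama-server \
  -m Qwen3.5-397B-Q4_K_XL.gguf \
  -c 180000 \
  -ngl 99 \
  -ot "ffn_.*=CPU"
```

With a pattern like this, the attention weights and KV cache stay in VRAM, where the 3090s' compute helps most, while the large FFN matrices stream from the octa-channel system RAM.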
droptableadventures@reddit
8 channels of RAM is decent (I have a Threadripper Pro machine), but it's about the bandwidth of an Apple Silicon "Max" chip. The Ultra is double that.
FullstackSensei@reddit
Like everyone downvoting me, you're ignoring the compute of the 3090s.
On large models, the Ultra doesn't even get to 30% of its memory bandwidth because it's limited by compute. You can double the memory bandwidth and it won't run much faster, because most of the actual time is spent crunching attention.
The M3 Ultra has the GPU compute of a single 3080, or a single MI50. Even if an M5 Ultra doubles that, it'll still be about the compute of 1.5 3090s.
This sub is full of people reporting benchmarks on the M3 Ultra. Look up how many t/s they get on a 400B model with 150k context.
I understand the appeal of the machine, but it's too darn expensive for what it offers if you know what you're doing.
droptableadventures@reddit
I have a dedicated machine with two 3090s and 8 MI50s.
I now know more about PCIe than I ever thought I would, and I've had to take a soldering iron to some of the hardware to make this setup work. It's loud, heavy and lives in a custom case made from T-slot aluminium extrusion. If I put it in the back of the car and drove it somewhere, it probably wouldn't work on arrival because some connector somewhere would need reseating. And mine's one of the ones you could feasibly move.
I definitely get the appeal of a Mac Studio, even if the performance per dollar isn't as good.
segmond@reddit
Your stuff is loud because of the MI50s. I have a rig with plenty of 3090s that is quiet because they all have triple fans that rarely kick in.
I also have 10x MI50 in a crypto-miner case with massive fans that is quiet; the fans are controlled by temperature and barely hit 40% speed.
CheatCodesOfLife@reddit
I thought you were the 8 x MI50 guy?
FullstackSensei@reddit
Six MI50s, which is my 3rd rig. The 8 are P40s, my first rig, and the 3090s are my 2nd rig. I also tried to make an A770 rig last summer, before MI50s hit the market, but the software experience was an utter failure, so the A770s got resold at purchase price.
CheatCodesOfLife@reddit
Smart move; any of the "performance improvements for SYCL" in llama.cpp seem to target Battlemage or newer only.
FullstackSensei@reddit
It wasn't the performance. I couldn't get them to work at all. I never managed to compile llama.cpp with SYCL; I only got them to work using Intel's own build, which was like 2 months behind. Qwen 3 235B was very unstable: I could get a short prompt to work, but if I threw 10k tokens at it, llama.cpp would just crash. And this was after like 3 or 4 days of trying.
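For reference, the build that wouldn't cooperate is roughly this one from llama.cpp's SYCL backend docs; the oneAPI install path is an assumption (Linux shown):

```shell
# Typical SYCL build of llama.cpp for Intel GPUs, per the project's
# SYCL backend documentation; requires the oneAPI toolkit installed.
source /opt/intel/oneapi/setvars.sh
cmake -B build -DGGML_SYCL=ON \
      -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
cmake --build build --config Release
```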
fallingdowndizzyvr@reddit
I've compiled it before. But there's really no point. Vulkan works faster.
marhalt@reddit
I have an M3 Ultra 512GB. I love it. It can run everything I throw at it (except DeepSeek 3.2, which is just too large with a decent context size). I wanted to pick up the M5 Ultra the minute it comes out, but I'm wondering if another M3 Ultra 512 is the way to go, and then pairing them with exo. Unless the M5 comes out with 1TB, I'm not sure where the M5 will be so much better than the M3.
segmond@reddit
Rumor is that the M5 Ultra won't even get 512GB but will be limited to 256GB due to the RAM shortage.
fivetoedslothbear@reddit
You'd have to get one used, because Apple's not selling the M3 Ultra with 512GB at the moment. 256GB is the upper limit.
Kind of disappointing that all the neat clustering demos hit YouTube, and then the hardware got constrained.
nomorebuttsplz@reddit
You should try deepseek again. I can run GLM 5.1 with 100k context
FullOf_Bad_Ideas@reddit
What PP do you get with Qwen 3.5 397b at various context lengths?
michael_p@reddit
I got an M3 Ultra with only 96GB, and for my use case, I couldn't be happier. Probably the best thing I've ever bought.
benevbright@reddit
Which model? And do you use it with coding agent? (Curious about the use case that makes you happy)
michael_p@reddit
Qwen for confidential business analysis for acquisition purposes. I do not use it for coding - just Claude code.
benevbright@reddit
Um… but big dense models don't give you good token speed for agentic use, no? I think the best one would still be a MoE model like MiniMax, even if you have a 512GB RAM Mac, no?
eclipsegum@reddit (OP)
If you have $35K to drop on eBay for one of the last 512s in existence then yes. Must be nice to run GLM5.1 locally
Objective-Picture-72@reddit
It's clear at this point that many had inside information on this development and have been buying up the large M3 Ultra models in advance.
thrownawaymane@reddit
Maybe, but really this happened right after OpenClaw exploded. The whales realized that AI could be used for work locally and bought all of them, IMO. The 512GB unit must have been a specialty SKU for them anyway; who knows how many they normally kept on hand.
redmctrashface@reddit
How do you guys have the cash to buy it?
michael_p@reddit
It’s a rounding error compared to how much it helps save in human labor costs
redmctrashface@reddit
I don't deny that. If I could afford one, I would. Unfortunately, living in Germany is not the best pick regarding salaries.
michael_p@reddit
Yea, I think there's a big disconnect on here: most people are hobbyists on a budget, while some are using it in a business setting where, given the ROI, money is never going to be a problem.
fallingdowndizzyvr@reddit
LOL. It's called a "job".
redmctrashface@reddit
I guess we don't have the same salary then
fallingdowndizzyvr@reddit
Note that just because a Mac Studio might come out in October doesn't mean it'll be an M5 Ultra. It'll probably be an M5 Max, so fundamentally no different than getting a MacBook today.
Ultra tends to lag by longer than that.
flyingbanana1234@reddit
Except for the savings of 2 grand. On top of that, they might offer an M5 Max with more RAM for the same price as the 128GB M5 MacBook.
fallingdowndizzyvr@reddit
Doubtful. Have they ever? Whether it's in a MacBook or in a Mac Studio, it's the same chip. The RAM is built onto the chip package. So it's a question of whether they make an M5 Max chip with that much RAM. If they do, then there's no reason not to offer it on the MacBook too.
pantalooniedoon@reddit
I think he means they could still offer a 256GB version, but it'll be the same price since it isn't a laptop: no screen, no keyboard, etc.
flyingbanana1234@reddit
When have they ever only offered a 128GB Mac Studio,
with no higher RAM configs and no Ultra variant?
Plans change, plus the Ultra chip was seen in the software a while back.
Veearrsix@reddit
The notebook form factor itself is a trade-off. Maybe not worth 6 months, but cooling, and therefore max sustained performance from a thermal perspective, is better on a Studio.
NoFaithlessness951@reddit
Deepseek v4 or GTA 6 first?
Ok_Technology_5962@reddit
Deepseek v4 next week
senrew@reddit
Half-Life 3
LoveMind_AI@reddit
Not getting the Mac Studio when I could have is one of my greatest regrets.
eclipsegum@reddit (OP)
Top 3 regrets for me