Old Mac Pro still proving its worth
Posted by Hephaestite@reddit | LocalLLaMA | 34 comments
The “Trash Can” Mac Pro was once the most expensive machine you could buy from Apple. Mine was just shy of £10,000 in 2016 — that’s £14k in today’s money.
Until recently mine was just running as a single-node Kubernetes development platform; its 64GB of RAM and 24 logical cores made it perfect for that.
Its most powerful asset, a pair of D700 GPUs, essentially sat idle for years… that is, until yesterday, when I discovered that while its old Southern Islands-based GPUs weren’t supported in ROCm, they were now supported under Vulkan — thanks to new drivers and a new Linux kernel.
That means it can run basically any model that llama.cpp can throw at its 12GB of VRAM. Time to do some benchmarks, right?
Qwen 3.5 9B Q4 MTP — 11 t/s output at 70k context
Qwen 2.5 coder q4 — 22 t/s output at 70k context
Not exactly lightning fast but totally usable, especially for planning tasks where you can just set it and forget it.
The thing that’s really blown my mind, though, is that the planning output from Qwen 3.5 is significantly better than Claude Sonnet 4.6’s, and it’s not even close. It absolutely smashed planning on a complex C# .NET 10 app with NuGet packages that Sonnet struggled with; Qwen just googled the docs.
Mind blown 🤯
What other ancient hardware are people running that’s still capable of doing real LLM work?
AccurateSun@reddit
“Qwen just googled the docs”: what tooling are you using that lets Qwen do this? Something like LM Studio with plugins?
Qwen3.5 9B also runs on my machine, but I never expected to hear it would match Sonnet 4.6 at planning (or anything), so I haven’t really used it for anything. Now I’m curious.
BitGreen1270@reddit
Post seems AI-written.
Hephaestite@reddit (OP)
Actually 100% human written, maybe I’ve just been reading too much AI generated content and it’s started to seep into my brain?? 😂
BitGreen1270@reddit
Why so many em dashes in the text?
corruptbytes@reddit
I lowkey love the design and wish they’d bring it back now that they’ve really improved thermal performance
Hephaestite@reddit (OP)
This design but with an M5 Max in it would be amazing
Antoniethebandit@reddit
Mac Studio designs are great as well
jcdoe@reddit
That’s dope. I’m using a 2022 MBP with 16 GB unified RAM to locally host a 7B Gemma 4 model. I’m using Open WebUI to handle the web server, and it’s great. Next step will be opening it to the web so we can use it anywhere. :)
I realize it’s not nearly as old as your Mac Pro, but it’s still 4 years old. I’m tickled it will run models at all.
Hephaestite@reddit (OP)
What sort of tokens per second are you getting? Is that an M2?
Eldoradooo@reddit
You moved the macOS Recycle Bin onto your desk?
lol just joking. My friend managed to run deepseek-v4 flash q4k on his, try that
ComfortablePlenty513@reddit
Lol I remember these. Paid $4600 for one back in 2015, it was a 6 core with the D500s. The graphics cards ran too hot for the case, so it was only a matter of time before it overheated and you got kernel panics during renders or intense compute.
the-username-is-here@reddit
Oh, I had one of these back in the day. Really cool (but very impractical) computer. One of the D700s burned out on me and I had to replace it.
It's mind-blowing how a box half its price and size (DGX Spark) these days gives literally 10x the performance.
MarcusAurelius68@reddit
Mind blowing to me is that a 2013 Mac Pro can still contribute. Time to dust mine off and give it a job.
the-username-is-here@reddit
Not sure it's worth the electricity, considering reported smartphone-level performance. 😞
HIGH_PRESSURE_TOILET@reddit
DGX Spark is way less than half the size. Mac Pro trashcan has a diameter of 16 cm and a height of 25 cm. Spark is a square of side length 15 cm and a height of 5 or 6 cm.
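As a rough check on that size comparison, here is a small sketch that treats the Mac Pro as an ideal cylinder and the Spark as a plain rectangular box, using only the dimensions quoted above. Real enclosures have rounded edges, so these are approximations, and the function names are just illustrative.

```python
import math


def cylinder_volume_cm3(diameter_cm: float, height_cm: float) -> float:
    """Volume of an ideal cylinder (assumption: the Mac Pro is a perfect cylinder)."""
    radius_cm = diameter_cm / 2
    return math.pi * radius_cm ** 2 * height_cm


def box_volume_cm3(side_cm: float, height_cm: float) -> float:
    """Volume of a box with a square footprint (assumption: no rounded corners)."""
    return side_cm * side_cm * height_cm


# Dimensions as quoted in the comment above.
mac_pro = cylinder_volume_cm3(diameter_cm=16, height_cm=25)
spark_low = box_volume_cm3(side_cm=15, height_cm=5)
spark_high = box_volume_cm3(side_cm=15, height_cm=6)

print(f"Mac Pro: about {mac_pro:.0f} cm^3")
print(f"Spark:   {spark_low:.0f} to {spark_high:.0f} cm^3")
print(f"Ratio:   {spark_low / mac_pro:.2f} to {spark_high / mac_pro:.2f}")
```

On these figures the Spark comes out at roughly a quarter of the Mac Pro's volume.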
Hephaestite@reddit (OP)
Or an AMD Strix Halo, half the watts at full throttle and 10x the power
olli-mac-p@reddit
You need an Arm-based M-series model. You could try an M1, and get as much unified RAM as possible (ideally 36GB or more). Low energy consumption and way faster than older x86 hardware.
Hephaestite@reddit (OP)
This trash can outperforms my M2 Max on the same models
ganhedd0@reddit
If you're still going to be using one of these in 2026, you should probably make sure that the cooling is up to snuff.
https://makerworld.com/en/models/2690630-2013-mac-pro-trashcan-mac-stand-with-air-vents
motorcycle_frenzy889@reddit
Oh wait, is it all Southern Islands GPUs that are supported by Vulkan now? The discrete GPU in my 2015 MacBook Pro might work
Hephaestite@reddit (OP)
Yep, so long as you’re on a recent Linux kernel and on the amdgpu driver
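If you want to check which kernel driver your GPU is actually bound to, something like the sketch below should work on Linux. It reads the standard sysfs PCI layout (a `class` file and a `driver` symlink per device) and reports the driver for each display-class device. The function name and the optional path argument are my own, and I have only reasoned through it rather than run it on a Southern Islands card, so treat it as a starting point.

```python
import os
from pathlib import Path


def gpu_kernel_drivers(sysfs_pci="/sys/bus/pci/devices"):
    """Map each display-class PCI device to its bound kernel driver name.

    Returns an empty dict if the sysfs path does not exist (e.g. not Linux).
    A value of None means no driver is currently bound to that device.
    """
    root = Path(sysfs_pci)
    if not root.is_dir():
        return {}

    drivers = {}
    for dev in sorted(root.iterdir()):
        class_file = dev / "class"
        if not class_file.is_file():
            continue
        # PCI base class 0x03 is "display controller" (VGA, 3D, etc.).
        if not class_file.read_text().strip().startswith("0x03"):
            continue
        driver_link = dev / "driver"
        if driver_link.exists():
            # The symlink target's last path component is the driver name.
            drivers[dev.name] = os.path.basename(os.path.realpath(driver_link))
        else:
            drivers[dev.name] = None
    return drivers


if __name__ == "__main__":
    for address, driver in gpu_kernel_drivers().items():
        print(f"{address}: {driver}")
```

If a card shows up bound to `radeon` rather than `amdgpu`, that is worth sorting out first.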
brickout@reddit
Nice! I'll bet it'll run MoE pretty well, relatively. 35b-a3b at Q6 or Q4 should be fun
I'm using some old 2018ish iMacs with a 7700K and 8GB r480 (I think). Surprisingly good, considering
premolarbear@reddit
I thought the same. But if you calculate the energy costs, it’s cheaper to buy something else.
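For anyone who wants to run that calculation themselves, here is a back-of-the-envelope sketch of electricity cost per million generated tokens. The 11 t/s figure is from the post; the 300 W draw and the 0.30 per kWh price are placeholder assumptions, not measurements, so plug in your own numbers.

```python
def kwh_per_million_tokens(power_watts: float, tokens_per_second: float) -> float:
    """Energy in kWh needed to generate one million tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    seconds = 1_000_000 / tokens_per_second
    joules = power_watts * seconds
    return joules / 3_600_000  # 1 kWh = 3.6 million joules


def cost_per_million_tokens(power_watts: float, tokens_per_second: float,
                            price_per_kwh: float) -> float:
    """Electricity cost of one million tokens, in the currency of price_per_kwh."""
    return kwh_per_million_tokens(power_watts, tokens_per_second) * price_per_kwh


# Assumed values: 300 W system draw and 0.30 per kWh are hypothetical.
# 11 t/s is the Qwen 3.5 9B figure reported in the post.
energy = kwh_per_million_tokens(power_watts=300, tokens_per_second=11)
cost = cost_per_million_tokens(power_watts=300, tokens_per_second=11,
                               price_per_kwh=0.30)
print(f"{energy:.2f} kWh and {cost:.2f} per million tokens")
```

Running the same two functions with the wattage and token rate of a newer box gives a like-for-like comparison, and a price of zero makes the cost zero regardless of efficiency.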
Hephaestite@reddit (OP)
If my electricity wasn’t free I’d probably agree
premolarbear@reddit
free? how? UAE?
Hephaestite@reddit (OP)
Australia ☀️
spammmmmmmmy@reddit
Still, time is money. See if it can do anything, but I suspect you'll see the need to buy bespoke hardware to run AI workloads.
premolarbear@reddit
nice
Kahvana@reddit
You might wanna try MoE models with partial offloading, they should be quite fast too!
Give Gemma4-26B-A4B and Qwen3.6-35B-A3B, both at Q8, a try
Hephaestite@reddit (OP)
I’ll give it a go tomorrow
Metalmaxm@reddit
My garbage can also has a use.
Kal-LZ@reddit
I have a Dual E5 2697 v2 with 256GB sitting in storage. I'm wondering if it's still worth keeping or if there's any use for it
jamexcb@reddit
Xeon E5-2600 with 384 GB RAM. Suuper slow. gpt-oss:20b 3.2 t/s or gemma4 2 t/s. This server is only for testing some ideas, so it's OK.
Positive-Stock6444@reddit
3060 and a P520, with 256gb, but still. Obsolete by any definition.