What's the best machine I can get for $10k?
Posted by TWUC@reddit | LocalLLaMA | View on Reddit | 58 comments
I'm looking to buy a machine I can use to explore LLM development. My short-list of use cases is: 1) custom model training, 2) running local inference, 3) testing, analyzing, and comparing various models for efficacy/efficiency/performance. My budget is $10k. Ideally, I want something turn-key (not looking to spend too much time building it). I need to be able to run massive full models such as the full DeepSeek 671B.
Firm-Fix-5946@reddit
$10k is woefully insufficient to tread into this world. This reads like saying you're ready to buy a car that's both comfortable daily and good for a track day, and so you've saved up $400
present_absence@reddit
Why are you jumping in the deep end to "explore" this topic? Seems kind of absurd, no? Unless you're an extremely wealthy hobbyist.
doradus_novae@reddit
As others have stated, unfortunately it's not gonna happen.
$10k will get you one A6000 Pro and a non-server workstation that won't be capable of upgrading beyond 2 GPUs if you are lucky.
If your end goal is anything serious at home, you need workstation-class hardware:
Motherboard: $1,200 minimum
CPU: $1,800 minimum
RAM: lol, do you think desktop RAM is expensive? I paid, I think, $2,000 for 256GB 8 months ago...
Throw in an extra $500 x 2 for the power supplies you need.
Oh, you think you can run this on a normal circuit? Might want an electrician to come and install 2 dedicated 15A circuits for $1,000.
And you can then run one or two small models at nerfed context windows. Nothing near 671B params on consumer hardware will be possible unless it's quantized and nerfed to hell.
Apple/GB10 memory is too slow. Not a viable option for serious work.
It is a fun hobby but not for everyone yet.
mister_conflicted@reddit
Have you tried renting a lambda instance and trying larger models to see if they accomplish what you want, and then deciding on hardware?
allenasm@reddit
Mac M3 Ultra 512GB and it's not even close. Model precision is so much more important than inference speed.
Fabix84@reddit
> I need to be able to run massive full model
> such as full deepseek 671B.
Sorry to burst your bubble, but even with $10k you’re nowhere near running a model like DeepSeek 671B. I’m not even close with a $35k setup, and I wouldn’t be, even with $50k worth of hardware.
So before anything else, try to get a realistic sense of what $10k actually represents in this space. For the average person it sounds like a huge amount of money, but in this field it’s basically pocket change.
S4M22@reddit
I'm curious: what's your $35k setup and what can you run with it?
No_Conversation9561@reddit
wait for M5 max/ M5 ultra
don’t get M3 ultra.. trust me, I have two of them
chaosmikey@reddit
What’s your issue with the M3 Ultra? I’m curious. I only like them because of the 512GB RAM.
No_Conversation9561@reddit
Too slow for agentic coding unless you use a smaller model like Qwen 30b a3b.
At first you think you're gonna use something like GLM 4.5/4.6 since you have so much RAM.
https://i.redd.it/eid4ko6y544g1.gif
blbd@reddit
GLM is definitely a Charlie Murphy to your GPU and Unified RAM Rick James.
cdevr@reddit
Unityyyy!
pmttyji@reddit
What's the performance with 100B models like GPT-OSS-120B, GLM-4.5-Air, Ling/Ring/LLaDA Flash, Llama-4-Scout, and MiniMax-M2 (Q4), Qwen3-235B (Q4), etc.? Please share. Thanks
Consistent_Wash_276@reddit
These are my go-tos, but MiniMax on my 256GB M3 Ultra
pmttyji@reddit
No wonder u/No_Conversation9561 insisted on not getting the M3 Ultra.
Thought it would run MiniMax-M2 (Q4) and Qwen3-235B (Q4), since those quants come in around 120-140GB.
At least, are you able to run Q8 of 100B models? I see that GLM-4.5-Air's Q8 is 120GB.
So what t/s are you getting for 100B models, approximately?
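As a back-of-the-envelope check on those quant sizes (a rough sketch assuming weights dominate and an average of ~4.5 bits/weight for a 4-bit quant; real files come in somewhat larger):

```python
# Rough size of a quantized model: params * bits-per-weight / 8.
# Ignores KV cache, higher-precision embedding layers, and file metadata.
def quant_size_gb(params_billions: float, bits_per_weight: float) -> float:
    # 1e9 params * (bits / 8) bytes each == params_billions * bits / 8 GB
    return params_billions * bits_per_weight / 8

# Qwen3-235B at an assumed ~4.5 bits/weight average
print(round(quant_size_gb(235, 4.5)))  # ~132 GB, in line with the 120-140GB figure
# DeepSeek 671B at the same quant
print(round(quant_size_gb(671, 4.5)))  # ~377 GB
```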
Consistent_Wash_276@reddit
I've run all these models. MiniMax Q4 and, I believe, MiniMax Q6 came in around 18 t/s. Haven't run Q8; just not in my current wheelhouse.
I believe the majority are between 50 to 75 t/s.
pmttyji@reddit
18 t/s is usable, and it's possible to increase that with all the available optimizations.
Also 50-75 t/s .... cool!
Thanks for the stats.
Consistent_Wash_276@reddit
In the end I have my go-to's:
- gpt-oss:120b
- qwen3-coder:30b fp16
- DeepSeek-r1:70b
- glm-4.5:air q4
I could use larger models but I have multiple LLMs running in parallel while working on other tasks
Turbulent_Pin7635@reddit
I refuse to touch the OSS ones. GLM 4.6 (15 t/s), Qwen3-235B (not quantized) (30-40 t/s)
pmttyji@reddit
Just noticed multiple variants there for that Mac. Yours 256 or 512 GB?
Turbulent_Pin7635@reddit
512gb
The Mac is a beast, but keep in mind that its memory bandwidth is 819 GB/s, very close to a 3090's; the KV cache and the lack of CUDA also hurt the Mac's performance. But things are getting better, with more and more effort being put into improving the Mac Studio's ability to handle LLMs.
It is very fun; it already runs models with better answers than ChatGPT and Gemini, at least compared with the non-Pro versions. Also, you don't need to care about drivers, noise, energy consumption, heat, or reselling a Frankenstein... It works.
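That bandwidth figure sets a hard ceiling on decode speed: single-stream token generation is roughly memory-bandwidth-bound, since every token must stream all active weights from memory once. A rough sketch with assumed figures (~819 GB/s for the M3 Ultra, ~37B active parameters for DeepSeek's MoE at Q4):

```python
# Upper bound on decode speed when memory-bandwidth-bound:
# t/s ceiling = bandwidth / bytes read per generated token.
def decode_tps_ceiling(bandwidth_gb_s: float, active_params_b: float,
                       bytes_per_weight: float) -> float:
    return bandwidth_gb_s / (active_params_b * bytes_per_weight)

# M3 Ultra ~819 GB/s; DeepSeek MoE ~37B active params at Q4 (~0.5 bytes/weight)
print(round(decode_tps_ceiling(819, 37, 0.5)))  # ~44 t/s ceiling
# Same machine, a dense 70B at Q4
print(round(decode_tps_ceiling(819, 70, 0.5)))  # ~23 t/s ceiling
```

Real-world numbers land well below these ceilings once KV-cache reads and prompt processing are factored in, which matches the 15-40 t/s range reported above.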
pmttyji@reddit
Text generation, OK. Really curious to know how good it is with image & video models. I don't see benchmarks on this here often.
MoffKalast@reddit
Lmao, why even bother?
pmttyji@reddit
:D It just came out of my head when I was thinking about 100B models. I've rarely seen people still using that one.
hyouko@reddit
Some options in that price range:
- The Mac is the most turn-key and may be able to run the really big models, but it probably won't be good for custom model training and won't do anything with CUDA if you need that.
- An RTX Pro 6000 can do some light model training and will run smaller models fast, but won't fit the really big models.
- The old Epyc server route is probably similar to the Mac situation, but potentially expandable with GPUs down the line; it's also gonna be noisy and suck down electricity like a mofo.
$10K would buy a lot of server time on various hosted services that are out there, so consider that as an alternative that would let you try out various configurations.
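For scale, a quick sketch of what $10k buys in rented GPU-hours (the hourly rates here are illustrative assumptions, not current quotes; check provider pricing):

```python
# GPU-hours a $10k budget buys at a few assumed hourly rates.
budget = 10_000
for gpu, rate in [("H100, assumed $2.50/hr", 2.50),
                  ("A100 80GB, assumed $1.50/hr", 1.50)]:
    hours = budget / rate
    print(f"{gpu}: {hours:,.0f} GPU-hours (~{hours / 24:.0f} days of continuous use)")
```

Even at the pricier rate, that's months of continuous single-GPU time, which is plenty to find out whether the use case justifies owning hardware.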
Consistent_Wash_276@reddit
Second the M3 Ultra Mac Studio
Turbulent_Pin7635@reddit
Third the M3 Ultra Mac Studio. For text inference, it's the best one in that price range.
Consistent_Wash_276@reddit
🤝
Own_Attention_3392@reddit
You can run gpt oss 120b on much cheaper hardware -- I have used it with reasonable speed on a 5090 paired with 64 GB system RAM.
Dersonje@reddit
I second the old epyc cpus. With ddr4 ram since ddr5 is price prohibitive right now. Then you’ll also have enough PCIE lanes to add GPUs as needed
chaosmikey@reddit
Mac Studio is the only thing that comes to mind. A 2TB model with 512GB of RAM is about $9,900 USD. You can lower internal storage and use an external SSD over Thunderbolt 5. This is the route I would go. You can chain them with Exo and share compute power.
iMrParker@reddit
He mentioned training. So Macs are out the window
General-Yak5264@reddit
And yet 3/4's of the comments...
Kqyxzoj@reddit
Depending on the ridiculous piles of cash you are rolling around in, I'd say maybe rent a couple of configs first. That would allow you to dial in on what makes sense for your use case. And this is coming from someone who firmly believes in the Own All The Shit You Depend On [tm] methodology. Oh wait, not too much time building it. Mmmh, tinybox?
LoaderD@reddit
Yup. The fact op doesn’t differentiate between inference and training means they shouldn’t be buying anything before their use-case is better figured out.
No_Afternoon_4260@reddit
That's the answer
Original-Tree-7358@reddit
Brilliant suggestion
YearZero@reddit
RTX PRO 6000 workstation edition + as much RAM as you can afford.
MengerianMango@reddit
6000 + dual-channel DDR5 sucks. Have tried. Do not recommend. Even Qwen3 235B 3-bit quants suck on this setup.
I ended up spending another 8k to build a 12 channel ddr5 system (epyc). Deepseek is sorta slow but acceptable in the new setup.
For a strict 10k budget, OP is going to have to compromise: either smaller models or more work building. If he really has to run deepseek, then probably best to buy a bunch of 3090s and do it the janky way.
Past-Reaction1302@reddit
What was your build that worked? I’m wondering and looking as well
DustinKli@reddit
That will put him well over $10k very quickly.
Turbulent_Pin7635@reddit
M3 Ultra... It runs everything, but suffers to produce videos. MLX models > 300GB will run at 15-40 t/s; anything smaller, 25-80 t/s.
I'm getting better answers with GLM 4.6 than I get with the GPT and Gemini paid versions.
abnormal_human@reddit
You’re missing a zero from your budget if you want to run that overparameterized pig of a model in any meaningful, usable way on a turn key system.
6x RTX 6000 MaxQ on a base system that costs your whole budget would do it though.
philmarcracken@reddit
cries in 8gig of vram..
HyperWinX@reddit
Mac Studio M3 Ultra with 512GB of RAM. It will be so damn fast
Narrow-Belt-5030@reddit
With only $10K and a dream to run full DeepSeek 671B... I would suggest API calls to a provider and/or renting hardware as needed.
chibop1@reddit
Mac might be ok for inference with popular LLMs, but if you need to do dev work with PyTorch, you may encounter errors such as "NotImplementedError: Could not run xxx from the MPS backend." PyTorch can also produce inferior results compared to Cuda even when running the same model. Overall, MPS support in PyTorch still lags behind Cuda.
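A quick way to probe this before committing to Mac hardware, using the standard `torch.backends.mps` API:

```python
import torch

# Pick the MPS (Metal) backend when it's actually usable, else fall back to CPU.
if torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    # is_built() distinguishes "this wheel lacks MPS" from "macOS/hardware too old".
    print("MPS built into this wheel:", torch.backends.mps.is_built())
    device = torch.device("cpu")

# Smoke-test an op on the chosen device.
x = torch.randn(4, 4, device=device)
print(device, x.sum().item())
```

Setting the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` before launch makes ops that MPS doesn't implement fall back to CPU (slowly) instead of raising that `NotImplementedError`.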
960be6dde311@reddit
NVIDIA RTX PRO 6000 + Intel Core Ultra 9 285K or Ryzen 9 9950X.
Denelix@reddit
usually would be a server CPU and like 1TB of ram + a crazy GPU buttt...... market lookin pretty bad rn
phido3000@reddit
Old dual Xeon 6200-series/Epyc, 768GB of RAM
Or just buy $10k of machine time.
takuarc@reddit
A maxed out Mac Studio is your best bet, especially that 512gb ram will come in really handy.
false79@reddit
Honestly, if you are just exploring, I wouldn't go off the deep end. You would have all this tech under your fingertips and may not use it to its fullest potential because of a lack of prior experience.
There are so many cheaper options to dive into before throwing cash at the unknown.
ZodiacKiller20@reddit
Better off spending 5K on a RTX 5090 machine and then use the leftover 5k for runpod.
That way you can train large models on runpod while still keeping your 5090 machine free.
Western-Source710@reddit
RTX 6000 Pro with the 96gb vRAM, room to expand to 2-4 GPUs, good processor, probably Ultra 9 285K or Ryzen 9 9950X3D if you aren't going server mobo, bunch of good ram, fast SSD. If you expand later on, add more RTX 6000 Pro with 96gb vRAM each. Four of them would be a nice 384gb of vRAM. :)
juggarjew@reddit
OP would want a threadripper rig at that point, quad channel memory and all the PCIe lanes you could ask for.
kc858@reddit
You can't run 671b at any usable speed for 10k lmao
giant3@reddit
It is better to build your own rather than buy a prebuilt one.
Sooner or later you will encounter issues that you will have to troubleshoot, and it's better to get your hands dirty from the start.
Also, warranties are 3 or 5 years for components, but only a year for most prebuilt systems.
_matterny_@reddit
The NVIDIA Spark is an interesting option. But I don't think anything can run the full 671B model sub-$10k in a reasonable timeframe.
I could probably run it as a CPU model with a couple of Xeon processors for $10k, but the response time is going to be so slow as to be meaningless.