Worth investing in hardware now? If so what?
Posted by StandardKey7566@reddit | LocalLLaMA | View on Reddit | 27 comments
2 weeks ago I bought a Mac Studio M3 Ultra 60 GPU/96GB from Apple. I returned it yesterday because I wasn't sure I'd made the right decision: the 1TB storage was already looking quite small, and for machine learning it wasn't quite as established as I'd like. The 96GB RAM also felt like I might have missed a "breakpoint", so to speak. I thought the GB10 "AI computers" with 128GB memory and 4TB storage might be better, but then I read on here last night that they are a lot slower, and by the time pre-fill is done the Mac would have finished.
So now I'm lost.
I spent £4,199 on the Mac and another £500 on a 10TB dock. Mac is returned but the dock hasn't been taken back yet, I feel like it's a good backup storage (But will return it depending on how the next investment goes.)
I have a Minimax Token Plan and this is my daily runner right now (Yes I know, it's not a local model, shoot me!), I was planning to invest in hardware in the hopes that the new releases like Qwen3.6 and Gemma 4 continue to pave the way for local models and I can ditch the monthly subscriptions.
So help a totally lost, ADHD-infused ferret navigate the market right now. I want something I can run, say, 120B models on that's an investment in the future, potentially start down the rabbit hole of fine-tuning models, and still work on a 24/7 agent harness/framework.
Advice welcome 😊
CryptoChartz@reddit
honestly you might be overthinking it a bit. if your goal is local LLM stuff, VRAM is king and the M3 Ultra is actually pretty solid for that but yeah, you’re locked into Apple’s ecosystem and upgrade path which kinda sucks.
nakitastic@reddit
I’m waiting for the M5 version with 128gb, coming soon, maybe June.
StandardKey7566@reddit (OP)
Yeah, I was looking at the MacBook Pro as it has the touchpad, keyboard and screen, so I don't need to keep switching between my desktop (7800X3D, 64GB, 4090) and that. But I was then warned that it gets hot and throttles compared to the Studio, so it may be worth waiting.
With the advancements in MLX etc Apple Silicon still seems a strong choice, just not sure what the price points will be.
eidrag@reddit
just use a gaming laptop cooler with the MBP
rudidit09@reddit
this! ideally plugged into power; when LLMs are active the battery drains way too fast
toomanypubes@reddit
Throttling only happens on the MacBook Air, MacBook Pro has fans that spin up to prevent thermal throttle. The new M5 Max 128GB is a beast.
rudidit09@reddit
i'm in a similar boat (including "ADHD-infused ferret") and i think i'm leaning towards the 128GB M5 Max MBP. The part where i'm going back and forth a lot is 14 or 16. anywho, i figure this field will change rapidly, and i'm more at peace with that. and even while i'm not sure if 128GB is enough for decent coding help as opposed to frontier models, i have successfully used an LLM on my current 24GB MBP to generate audio for games, and to go through docs and emails and be helpful there. i'm not even sure how much i'd use it as portable vs desktop vs server, but having the keyboard and screen built in, plus AppleCare if there's a hardware issue, helps me a lot with a laptop
jacek2023@reddit
Local LLMs are a hobby for a small number of people; other people are happy with the cloud. People who are happy with the cloud and hate local models are here too, on this sub. That's why you read "electricity is not free" and "you must pay for Claude Code or a Chinese cloud instead of using a local model". You must ask yourself who you are. Are you interested in local LLMs, or are you just waiting for cheap cloud access? These two groups are not compatible.
Senior-Reserve3732@reddit
❗ Power consumption
❗ Complexity
❗ Use case mismatch
Most people don’t need 120B models.
kaisurniwurer@reddit
Stop being so dramatic
3×3090 = 3×260W ≈ 780W
at 8h/day => 6.24 kWh, under $1 per day, so virtually nothing, or actually nothing if you have solar.
The rest I will just lol at.
The only real issue with multi 3090 setup is the case.
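A quick back-of-envelope sketch of the power maths above (the load draw per 3090 is the figure from the comment; the electricity price is an assumption you should replace with your own tariff):

```python
# Back-of-envelope running cost for a 3x RTX 3090 rig.
gpus = 3
watts_per_gpu = 260          # approximate load draw per 3090 (from the comment)
hours_per_day = 8
price_per_kwh = 0.15         # assumed $/kWh; UK tariffs will be higher

kwh_per_day = gpus * watts_per_gpu * hours_per_day / 1000
cost_per_day = kwh_per_day * price_per_kwh
print(f"{kwh_per_day:.2f} kWh/day, ~${cost_per_day:.2f}/day")
```

At that assumed tariff it works out to roughly $0.94/day; plug in your own rate to see whether "under $1" holds where you live.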
ea_man@reddit
Right now I would buy some 3090s because you can always resell those; things move very fast here.
Yet if you wanna experiment and have fun, I would throw some money at the OpenRouter API; the thing is that small local models still have some issues with tooling right now.
Maximum-Wishbone5616@reddit
So on my machine I have 2×4TB NVMe for local models.
Currently 3.6TB used...
It is good to download and test out different versions. There are big differences in quality between releases from different providers (e.g. unsloth).
Either_Pineapple3429@reddit
If I were you (and I sort of am: similar non-tech background, but I like making things),
I would buy an older gaming/server rig, if you don't already have one, and throw a 3090 in it.
Mess around with 30B models, use Opus/Sonnet to build your local workflow/analysis pipeline... and then set the pipeline to do the legwork with a smaller local model.
I bought an old Dell T7910 for $300 and a 3090 for $800, and I have 24GB of VRAM. I am going to snag 2 or 3 more 3090s periodically to bring me up to about 72-96GB of VRAM. All for $2,500-3,000.
Low lift to get things off the ground, and you can scale as you build and learn.
StandardKey7566@reddit (OP)
I have a 4090/7800x3d but I was fed up with not being able to game and run stuff 24/7 haha 😆
Either_Pineapple3429@reddit
Lmao... check out old server workstations from Dell or HP like the T7910 or T7920 (these work well with up to 3-4 GPUs)... you can set up a homelab, hook up a bunch of older, cheaper GPUs, offload your AI to your local server, and dedicate your 4090 to gaming.
ExcellentDeparture71@reddit
Perhaps you could tell us what you want to do with it?
StandardKey7566@reddit (OP)
I'm a Chartered Accountant working as a forensic accountant, self-employed. I love modelling financial markets and was planning to create an agent that could help with that. I bought VectorBT Pro for the Mac Studio but I realised I wasn't really wanting to search for Alphas or build trading algo portfolios and fine tuning a model or something else may be a better avenue for me. An agent that could assist with some work tasks whilst also giving me insights into the market etc would be a huge help and I wouldn't have to stare at a Bloomberg terminal all the time.
I have a subscription with Minimax and I was with Z.AI, but I can't use these for anything work-related as they are not local, and I can't trust the offload of information!
So, use cases:
1) Algo trading was a hobby but is not a hard requirement.
2) A local LLM for bouncing ideas off and monitoring stock portfolios would be awesome; if I can eventually have it reading charts and making suggestions for me to manually verify, that would be awesome.
3) Being able to quickly batch-process some accounts from clients locally and raise flags for me to manually check would open up my workflow to allow more customers; I'm turning some away right now with the new financial year.
4) Back to point 1, it's a hobby for me. I have a smart telescope but I'm not an astronomy nerd, I just enjoy taking the odd photo; same with this. I'm not expecting Opus 4.6 on my local network, but I'd love to be playing with agent harnesses and working on code/developing the harnesses and memory. That sounds like a lot of fun to me.
Yes, I'm fun at parties 👀
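For the batch-flagging idea in point 3, a minimal sketch of the shape such a pipeline could take: cheap deterministic rules pre-filter the ledger, and only the flagged minority would go to a local model for review. The record format and rules here are entirely hypothetical, made up for illustration:

```python
# Hypothetical sketch: pre-filter client ledger entries with cheap rules
# so only a flagged minority is handed to a local model for explanation.
def flag_entries(entries, large=10_000):
    """Return (entry, reasons) pairs for entries worth a closer look."""
    flagged = []
    for e in entries:
        reasons = []
        if abs(e["amount"]) >= large:
            reasons.append("large amount")
        if e["amount"] != 0 and e["amount"] == round(e["amount"], -3):
            reasons.append("suspiciously round figure")
        if not e.get("description", "").strip():
            reasons.append("missing description")
        if reasons:
            flagged.append((e, reasons))
    return flagged

ledger = [
    {"id": 1, "amount": 250.75, "description": "Office supplies"},
    {"id": 2, "amount": 15000.00, "description": "Consulting fee"},
    {"id": 3, "amount": 42.00, "description": ""},
]

for entry, reasons in flag_entries(ledger):
    # In the full pipeline, each flagged entry would be sent to the local
    # model (e.g. via an OpenAI-compatible local server) for a
    # natural-language explanation before manual verification.
    print(entry["id"], "->", ", ".join(reasons))
```

The point is that the LLM only sees the entries the rules surface, which keeps batch runs fast even on modest local hardware.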
uti24@reddit
You will feel that at any (reasonable) amount of RAM: with 128GB you will be able to run Qwen3.5 397B in a Q1 or Q2 quant, and you will feel that "only 16/32 GB more" would allow you to run it in Q4 or whatever. So yeah.
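As a rough guide, a quantized model's weight footprint scales with parameter count × bits per weight ÷ 8; a sketch using the 397B figure from the comment above (the bits-per-weight values are approximations for typical quant schemes, and real files vary with mixed-precision layers and KV cache overhead):

```python
# Order-of-magnitude memory estimates for a 397B-parameter model
# at a few quantization levels. Bits-per-weight are approximate.
params = 397e9

for name, bits in [("Q1", 1.58), ("Q2", 2.5), ("Q4", 4.5), ("Q8", 8.5)]:
    gb = params * bits / 8 / 1e9
    print(f"{name}: ~{gb:.0f} GB")
```

This is why 128GB lands you in Q1/Q2 territory for a model that size, while Q4 needs well over 200GB, hence the "just 16/32 GB more" spiral.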
StandardKey7566@reddit (OP)
So it's a spiral into chaos chasing just another 16GB? Which was my first mistake!
uti24@reddit
Basically, yes.
Well, there is a limit: you would never want less than 1 t/s, so...
Hector_Rvkp@reddit
the Strix Halo stopped being cheap. The DGX Spark is expensive and very niche (i don't think it's your niche). Your uncertainty tells me you should sit on it and consider buying an M5 Ultra Studio when it comes out, with 128GB RAM. That will draw little power vs the Nvidia stack, and be slower, but able to run large models at usable speeds.
If you want something that's both legit smart and legit fast, you're looking at a Blackwell 6000, which pretty much doubles your budget, so i doubt it makes sense here.
StandardKey7566@reddit (OP)
Appreciate it. I do a lot of financial modelling and backtesting; I was using VectorBT Pro. I was fed up with my 4090 gaming PC being constantly stressed, so I wanted to offload that and invest in future LLM models.
I'm trying to broaden my thinking instead of focusing only on my use case, but I felt the ability to train or fine-tune models for financial use would be helpful and could potentially open a new door for me.
Accomplished_Code141@reddit
I guess there are a lot of bots in a sub like LocalLLaMA trying to dissuade you; companies are trying to profit from API services. Your life, your choices, your use case.
Ell2509@reddit
The dock sounds good. Buying in NVMe SSDs is over 100 per TB now.
I had a similar experience to you with hardware, but just kept doubling down. I am now 10k deep and hoping (fully expecting, tbh) that the next item will be sufficient for me.
If you really have 5k to spend, build yourself a desktop with 2 5090s. I suppose you would be looking at 7k for the full build, but if I were at step 1 now, that is what I would do.
d4t1983@reddit
I’m intrigued by your use case and what you spent 10k on?
Ell2509@reddit
I am building a distributed system with several devices networked together, models on each, and orchestration, rag, tool use built in. My own local AI to help me run my business, as I am a one man band.
d4t1983@reddit
Sounds interesting… would love to know more 😁