From the article: "I've walked this path myself, multiple times. The harsh truth? Your MacBook is your best bet. I know it's not the answer you were hoping for."
lelanthran@reddit
This is a crap conclusion.
The decision whether to self-host or not depends on what you are going to use that LLM for.
There are two extremes.
Expert answers at the same level of quality as Claude Code (Opus) and ChatGPT 5? Yeah, the cost to host that is prohibitive. Cost savings: could be 12 months of usage before you see any!
Text classifier to tag each input from hundreds of thousands of customers into one of several buckets? Yeah, I did that on a self-hosted Ollama 7B model on an HP Victus laptop throttled to 2.0 GHz.[1] Cost savings: tremendous, considering that the classifier running on SOTA 80B models (or whatever) is almost certain to produce the same classification as the 7B Llama model.
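Roughly, a job like that is just a prompt-per-message loop against the local Ollama HTTP API. A minimal sketch of the idea, assuming Ollama is running on its default port; the model tag, bucket names and prompt wording are illustrative, not the exact setup described above:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"        # default local Ollama endpoint
BUCKETS = ["billing", "technical", "complaint", "other"]  # hypothetical buckets

def classify(text: str, model: str = "llama2:7b") -> str:
    """Ask a small local model to place one customer message into a bucket."""
    prompt = (
        "Classify the following customer message into exactly one of these "
        f"categories: {', '.join(BUCKETS)}. Answer with the category name only.\n\n"
        f"Message: {text}"
    )
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    answer = resp.json()["response"].strip().lower()
    return answer if answer in BUCKETS else "other"  # fall back if the model wanders off the list

if __name__ == "__main__":
    print(classify("I was charged twice for my subscription last month."))
```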
Between those two extremes lies an entire host of problems that your LLM can solve. If you're closer to the first extreme, rent an H100 for a few hours, or use a token provider like Claude or ChatGPT if your usage is spread throughout the day.
If you're on the other extreme, just use your current computer, whatever it is. You can even do what I do and throttle it so that when you run stuff overnight it won't overheat (one way to cap the clock is sketched after the footnote).
[1] Kept the temps low so that I don't shorten the life of the laptop too much by running it at 80 °C for 48 hours. Instead, it took double that time, but ran at <50 °C.
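For reference, one way to do that kind of capping on Linux is through the stock cpufreq sysfs knobs. A rough sketch, assuming a Linux machine with the standard cpufreq interface and root access (the exact method used above isn't stated):

```python
from pathlib import Path

MAX_KHZ = 2_000_000  # cap every core at 2.0 GHz, matching the throttle mentioned above

def throttle_all_cpus(max_khz: int = MAX_KHZ) -> None:
    """Cap the maximum frequency of every CPU core via the cpufreq sysfs knobs."""
    for policy in Path("/sys/devices/system/cpu/cpufreq").glob("policy*"):
        (policy / "scaling_max_freq").write_text(str(max_khz))

if __name__ == "__main__":
    throttle_all_cpus()  # must run as root; the limit resets on reboot
```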
BlueGoliath@reddit
So you expect actual programming content on /r/programming? Don't.
church-rosser@reddit
Article was pure trash.
ClownPFart@reddit
I mean, it's about LLMs. It's hard to make a non-trash article about trash.
hi_im_bored13@reddit
This doesn't seem all that harsh, and it was what I was hoping for. Either that or the AMD Strix APUs w/ 128 GB of shared memory. I'm not sure what the point is here; local inference is relatively accessible and affordable nowadays.
The workstation in the image is not for local inference; it is for training, and that is a different ballgame.
Venthe@reddit
Regardless, it is not. At the moment, SaaS inference is miles cheaper.
That being said, the "cost" is not the only factor - dependency on a third party and data security, to mention a few others.
Big_Combination9890@reddit
Even if we accept this at face value, whether it remains cheaper is a different question.
So if I am building processes, and potentially products, on this tech (which I am, btw), I'll at least make whatever I am building flexible enough to use something other than cloud-provided LLMs with no changes required.
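Concretely, that flexibility can be as thin as a wrapper over an OpenAI-compatible chat-completions endpoint (which the big providers and local servers such as Ollama or vLLM all expose), with the backend chosen by environment variables. A sketch, with illustrative variable names and defaults, not anyone's production code:

```python
import os
import requests

# Point these at a cloud provider today, at a local server tomorrow,
# e.g. LLM_BASE_URL=http://localhost:11434/v1 for Ollama's OpenAI-compatible API.
BASE_URL = os.environ.get("LLM_BASE_URL", "https://api.openai.com/v1")
API_KEY = os.environ.get("LLM_API_KEY", "not-needed-locally")
MODEL = os.environ.get("LLM_MODEL", "gpt-4o-mini")

def complete(prompt: str) -> str:
    """One chat completion against whatever endpoint the environment points at."""
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": MODEL, "messages": [{"role": "user", "content": prompt}]},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```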
And btw. "cheap" is not the only consideration. Data safety and the legal compliance issues that come with it are easily as important. The same is true for customer trust.
hi_im_bored13@reddit
This calculation also greatly depends on how much you value that computer as a standard workstation outside of doing inference.
Like, the Framework 128 GB Max+ 395 motherboard is $1.7k; that is enough for 12 months of Claude Max 5x & $500 worth of misc. API usage & subscriptions.
But I still need a PC for work, and it's $300 for an X870 mobo + $500 for a 9950X + $150 for 64 GB of DDR5 + $300 for a 5060 + $50 for cooling.
So it comes out to a $400 difference, and that doesn't buy you quite as many credits.
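Spelling out that arithmetic with the numbers quoted above (a back-of-the-envelope check, nothing more):

```python
# Prices as quoted in the comment above.
framework_board = 1700                 # Framework 128 GB Max+ 395 motherboard
cloud_equivalent = 12 * 100 + 500      # 12 months of Claude Max 5x plus ~$500 misc API/subscriptions

pc_build = 300 + 500 + 150 + 300 + 50  # X870 mobo + 9950X + 64 GB DDR5 + 5060 + cooling
extra_for_local = framework_board - pc_build

print(cloud_equivalent)   # 1700 -- roughly the board's price
print(pc_build)           # 1300
print(extra_for_local)    # 400  -- the real premium once you need a workstation anyway
```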
grauenwolf@reddit
So you want to outsource your LLM to an AI company that is burning through cash at an alarming rate and has to dramatically increase prices if they don't want to crash out?
Cool. Let's say that works out for you. What happens in 5 years when your business can't run without that AI company's systems? And, by some miracle of creative accounting, they still exist.
Well, it's time to start earning a profit. Expect your AI price increases to make VMware feel like a gentle kiss. And you'll pay it. You already fired everyone who knows how to do the work. And they own your training data, so you can't just start over with a new company.