Why run local? Count the money

Posted by Badger-Purple@reddit | LocalLLaMA | View on Reddit | 128 comments

I’m not a coder, but I run local models. I gave in to agent hype (I was building my own, but there is so much to do) and installed Hermes. Running with Qwen-397b out of a 2 spark cluster.
So…I asked Hermes today to tally the token count, and the result…200 million tokens. In 5 days.

At this rate, using an agent for tasks like installing software and debugging things I want to try out, what is the cost I am saving? Artificial Analysis says the price is about 1.25 dollars per million tokens on average from providers. At current pricing per Artificial Analysis, that gives me about 1250 dollars per month, and my sparks will pay themselves by 6 months.

So, caveats of course I bought them at cheaper prices than today, but it’s a simple estimate that there is some valid reasons to go local.

Like I said, I am not programming and I know there are programmers that easily triple my token count in the same time. That implies that if you use 100 million tokens per day, the return on investment is still there today, even with crazy computer prices.

To me, local AI is about the desire to utilize a cool technology without the strings attached that threaten individual privacy and intellectual property. But knowing that my investment is not just purely hobbyism gives me more conviction that local AI is the future.

I know I am preaching to the choir…So the question is, has anyone else felt their rig is becoming more sustainable now than 6 months ago, price wise? Would love to hear!!