Agent Use is gonna drop off a cliff once it's all usage-based
Posted by Venisol@reddit | ExperiencedDevs
I didn't use agents much, then 2 weeks ago I decided to try them. I hooked my Anthropic API key up to opencode and built a personal notes app with zero sync over a long weekend.
It cost me around 50 bucks. In a fresh project, with essentially one page and one feature.
It did cool stuff, like build me an AceJump plugin for the CodeMirror 6 editor. I'm not saying it doesn't work, I'm not saying it's not useful for very small, very specific stuff.
But it was 50 bucks.
Then I got a $20 subscription and started using it at work, and I don't ever even max out the limits on that one, even though I've easily used 50x the total tokens I used for my little notes app.
All of this shit is gonna vanish. All the personal stuff people do with agents right now, gone. Or moved to local, free LLMs. None of the scammy micro saas crowd would ever invest 5 grand into their own shitty app. Even these people know better.
Even at work, no real company is going to spend 5k per engineer per month. Those economics don't even make sense for the overpaid US engineers, where technically you maybe only need a 50% productivity increase per engineer to make the cost work. You do not get that lmao.
In the EU you def can't make those economics work.
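To make the break-even reasoning concrete, here is a rough sketch. The 5k monthly tool spend comes from the post; the fully loaded engineer costs are purely illustrative assumptions (the post gives no salary figures), chosen so the US case lands on the 50% the post mentions.

```python
def break_even_gain(tool_cost_per_month: float, engineer_cost_per_month: float) -> float:
    """Fractional productivity gain needed for the tool spend to pay for itself."""
    if engineer_cost_per_month <= 0:
        raise ValueError("engineer cost must be positive")
    return tool_cost_per_month / engineer_cost_per_month


TOOL_COST = 5_000  # per engineer per month, figure from the post

# ASSUMPTION: hypothetical fully loaded monthly costs, not from the thread.
ASSUMED_ENGINEER_COST = {
    "US engineer (assumed 10k/month)": 10_000,
    "EU engineer (assumed 5k/month)": 5_000,
}

for label, cost in ASSUMED_ENGINEER_COST.items():
    gain = break_even_gain(TOOL_COST, cost)
    print(f"{label}: needs a {gain:.0%} productivity gain to break even")
```

The cheaper the engineer, the larger the gain the same token bill has to deliver, which is the point the EU remark is making.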
For me, I use the agent pretty much exclusively for "simple stuff that touches a lot of files", cause they're so fucking slow for small questions / fixes. I'm way faster copying the relevant code snippets, pasting them into the chat, then copying the result back into my code base.
I literally write my components with hardcoded strings and once I'm ready I tell it to look at the changed git files and move all the hardcoded strings to translations and also add the translations. It's perfect for that.
Miserable_Study_6649@reddit
I really hope it doesn’t go that way, but I would gladly unplug from AI if I had to. It’s only accelerated my work and capabilities. But what I use it for does not directly generate revenue so I would just walk.
Gunny2862@reddit
It's going to put an even bigger gap between big companies and small ones. (If AI turns out to be useful.)
seanamos-1@reddit
This is exactly where the saying “AI won’t take your job, AI costs will” applies.
ruby_fan@reddit
Yes they will. Especially if you see the models continue to get better at a rapid pace.
foresterLV@reddit
what experienced dev can build for you for 50$?
Eliarece@reddit
As a developer from a third world country, it has been very depressing to see American companies spend the equivalent of my yearly salary on one week of tokens for one dev
GotAim@reddit
My CTO has spent over $300 000 on tokens this month by himself 😬
codescapes@reddit
He must have so many complete and successful projects he can point to 🙂
GotAim@reddit
He spent it mostly making our app support other languages, like 1000s of PRs
RiddleGull@reddit
Damn, do you have like hundreds of thousands of localization strings and added support for a hundred languages? 300k is nuts if it’s anything less than that.
GotAim@reddit
Don’t know exactly how many, but at least 10s of thousands, maybe hundreds of thousands
dnbxna@reddit
There's a good chance some/many of those localized strings are just terribly wrong lol
GotAim@reddit
Yeah, we are putting together a team of random people in the org who speak those languages to QA most places
ReferentiallySeethru@reddit
$300k is nuts though, even for a moderately sized code base. Was it done efficiently at all or is like the output of a manic developer at 3am on Adderall and coffee?
Lotus_Domino_Guy@reddit
Manic stimulant-inspired code is best code.
GotAim@reddit
I mean it’s just basic stuff so as far as I know everything went smoothly, but yeah no idea how he reached that high number. Probably he’s been doing something wrong, but the results are fine enough
warm_kitchenette@reddit
You could ask him, then use that as a checklist on what not to do. My guess is that he has multiple agents serving as designer, QA, implementer, architect. Then the robo-interns got into several spirals of misunderstandings, bad specs, and hallucinations
TheTacoInquisition@reddit
The question to ask, is whether or not the product needs to do that right now. If it does, then maybe it was worth it, but if it's not opened the door to any more customers or a price increase, then he wasted both time and money on the project.
ViejoConBoina@reddit
The question is: “How long would a software developer take to do this?” because you could pay 2-3 years of a salary with that amount of money and even if they take 6 months to get it done then you’re losing at least a year and a half of a developer’s output.
GotAim@reddit
It is something we lose customers on compared to competitors, so definitely useful and needed
Irrealist@reddit
Wow, that's a lot of money for i18n.
dudevan@reddit
That’s at least 10 multi-million dollar projects based on the tokenomics we’ve been fed.
inthiseeconomy@reddit
all these sycophantic CEOs, CTOs, I hope history is not kind to these motherfuckers
GotAim@reddit
My CTO and CEO are both pretty nice people, but go off I guess
nullbyte420@reddit
Whistleblower situation tbh
Ohmyskippy@reddit
Yea that's fucking wild lmao, cto chugging the Kool aid xD
air_thing@reddit
That's insane. I start getting nervous when I'm approaching $300.
GotAim@reddit
In my company it’s almost the opposite, the more you spend on AI the better. A colleague of mine got a stern message from his manager because he was only spending a couple bucks on AI a month
tripsafe@reddit
https://tenor.com/QCpt.gif
me_hq@reddit
Ahahahah
TempleDank@reddit
As a dev from a first world country, I will say: same
schmerg-uk@reddit
5+ years trying to get management to approve buying a handful of tools that for about $5k (and a recurring 10% annual license fee) would boost the productivity of ~100 devs.
No chance, but enterprise wide AI tooling and shaming anyone who doesn't drink the koolaid by burning all their tokens so the shitty LLM can do poor code reviews that read very pretty (ooh colourful icons in the text!) and confident but are plain wrong (and losing one of the main points of reviews being that someone else is at least aware of what you're doing and why and how it works) ??
SIGN US UP NOW !!
styroxmiekkasankari@reddit
Yeah honestly the biggest problem right now that I’m seeing is that people are generating tests (without good steering as well) and using it for reviews as well. All code is shit and garbage that could be thrown out technically speaking, but good validation of features and recognition that the RIGHT THING was built is too important to leave to a clanker.
I’m fine with code generation, hell we had a great number of tools already doing this before llms and in general they were even more reliable, but outsourcing the engineering part? Bad idea in the long run if not even short term. Knowing the product-/sales people types these quick PoC have a habit of turning into live products and then everyone is stuck maintaining really bad software.
binarycow@reddit
IMO, AI should write either the code or the tests, but never both.
coworker@reddit
This is like saying the same engineer should not write the code and the tests.
One-Bowler4807@reddit
But AI will fake the tests or change the code to make the test green. This is rare with experienced developers and usually only happens if the code needed to be refactored to be testable.
reostra@reddit
Honestly not a bad idea, really; kinda like an extremely intricate code review. The only reason it wouldn't really work is that it's hard enough to get anyone to write tests, let alone for someone else :)
coworker@reddit
It's not a new idea. Traditionally QA was tasked with writing tests but the industry realized that resulted in developers not caring about the testability of their implementations.
styroxmiekkasankari@reddit
I guess bottom line is that it’s almost always quicker to do either/both yourself if you have to nitpick what it is and isn’t allowed to do lmao.
fuzzyFurryBunny@reddit
Yes. Aside from a few tedious tasks and coding that has little to no logic, AI coding is only faster for ppl that don't code. Just like a doctor will reach a quicker, better diagnosis than me trying to piece together the facts. But the whole point of having professions is cause that's efficient. Now we have everyone trying to recreate the same small apps, meanwhile damaging our environment and community
poralexc@reddit
Tests are way too important to leave to LLMs.
I usually get urges to rearchitect things along the way to make things easier to test, at which point it's the opposite of boring boilerplate.
It's also important to manually break tests from code every now and then to make sure you're testing exactly what you think you are.
tr14l@reddit
Ai proposes arch and tests. Humans validate they are appropriate. Ai develops to those. Then the pipeline validates they were adhered to
poralexc@reddit
Do humans verify that they work? What if your pipeline isn't actually validating anything?
With the obfuscation offered by mocks and different test harnesses, it's way too easy for agents to assert 'true == true' and for people to rubber stamp that as ok.
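A minimal sketch of the failure mode described above, using only `unittest.mock` from the standard library. `apply_discount`, `apply_discount_broken` and the pricing service are hypothetical names made up for illustration. The first "test" only checks what the mock was told to return, so it passes even against broken code; the second actually exercises the function.

```python
from unittest.mock import Mock


def apply_discount(price, pricing_service):
    """Hypothetical code under test: apply the service's discount rate."""
    rate = pricing_service.discount_rate()
    return round(price * (1 - rate), 2)


def apply_discount_broken(price, pricing_service):
    """Deliberately broken variant: ignores the discount entirely."""
    return price


def tautological_test(func):
    # Looks busy, but never calls func: it asserts the mock returns
    # what the mock was configured to return ('true == true' in disguise).
    service = Mock()
    service.discount_rate.return_value = 0.1
    return service.discount_rate() == 0.1


def behavioural_test(func):
    # Calls the code under test and checks an independently known answer.
    service = Mock()
    service.discount_rate.return_value = 0.1
    return func(200.0, service) == 180.0


for name, func in [("correct", apply_discount), ("broken", apply_discount_broken)]:
    print(name, "| tautological:", tautological_test(func),
          "| behavioural:", behavioural_test(func))
```

Both tests would show up as green in a report for the correct code; only the behavioural one goes red when the implementation breaks, which is why someone has to read what the assertions actually touch.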
tr14l@reddit
Yeah, that's why humans validate the tests. The same way a team lead would say "no, you didn't test this and this test sucks"... You need to do that now
0xjvm@reddit
Nah but it’s A LOT easier to review tests than it is to review hundreds of lines of code across a few files.
If you're a good dev, you’ve only asked the agent to do one specific thing. You should KNOW the codebase, you should KNOW generally where the edge cases are for this 1 case, and then it’s just a matter of reviewing whether those tests cover those edges.
In my experience there comes a point where the implementation details aren’t under that much review. I’m paid to ship value not produce the most optimised code. As long as those tests are in line with expectations and cover what I expect, I’m happy for LLMs to do both.
But you need to know exactly what you want in the first place, you can’t review if you have no opinion on what it should look like.
poralexc@reddit
I agree: This is essentially the Responsible Use Guidelines I've been trying to push at work. These tools can be dangerous if you don't already kind of know the answer.
In some legacy codebases I've seen, the testing is the most complicated part. Validating layers of mocks and testcontainers requires understanding those systems as well as your actual code.
styroxmiekkasankari@reddit
Agree. Really depends what the source should be: I think it’s probably fine for simple functions to let it do both but larger integrated pieces or policy enforcement is off limits. You can also always steer development by writing some test cases before hand.
new2bay@reddit
If the function's that simple, why not just write both yourself?
styroxmiekkasankari@reddit
Exactly! Unless I’m doing a hundred such functions at once I’m probably writing them myself
binarycow@reddit
The point is, LLMs are perfectly happy with making a test an effective no-op in order to get tests passing. They're also perfectly happy with breaking the code to make tests pass.
If you write the tests and the LLM is prohibited from changing them, then that prevents that from occurring.
You might be okay with having two different sessions do their part.
styroxmiekkasankari@reddit
Yeah I got that haha. A lot of this does come down to instructions but YMMV. Also, generating both tests and code will effectively mean the developer will not know anything about the program and leaves it up to review to understand what’s going on, which is also terrible and slow and wasteful when things go wrong.
binarycow@reddit
I use the same logic when I say that the person that wrote the code shouldn't be the one to test the code.
Obviously, the dev should do at least the 'happy path' testing, but that should be followed up by someone (not the original dev!) who is actively looking for edge cases
Few-Impact3986@reddit
I disagree. I agree for greenfield, but solving a bug should work the same way as a human's would. Recreate the bug with a test. Solve the bug. Rerun the test and see if it passes.
Does an engineer still need to look at it? Yes
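The reproduce, fix, rerun loop above can be sketched as follows. The off-by-one pagination bug and the function names are hypothetical, chosen only to show the shape of the workflow.

```python
def page_count_buggy(total_items: int, page_size: int) -> int:
    # Bug: floor division drops the partially filled last page.
    return total_items // page_size


def page_count_fixed(total_items: int, page_size: int) -> int:
    # Fix: ceiling division, so a partial last page still counts.
    return -(-total_items // page_size)


def regression_test(page_count) -> bool:
    """Step 1: pin the reported bug down as a test (11 items, 5 per page -> 3 pages)."""
    return page_count(11, 5) == 3


# Step 1: the test must FAIL against the buggy code, or it does not reproduce the bug.
print("reproduces bug:", not regression_test(page_count_buggy))
# Steps 2 and 3: after the fix, the very same test passes.
print("fixed:", regression_test(page_count_fixed))
```

Seeing the test fail first is the part that is easy to skip when an agent writes both the test and the fix in one go.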
lmpdev@reddit
I found GitHub Copilot is really good at reviews, especially once you provide custom instructions. For us it's finding bugs no human reviewer noticed. We've had cases where we look back at the PR that caused a bug and there is a comment from Copilot identifying it that was left ignored.
It of course doesn't replace human review, but it's a good addition, just like static analysis scanning tools, but smarter.
mindfulnessman14@reddit
Copilot can catch obvious bugs, sure, but the part that makes me nervous is when people start treating that like real review when the tool has zero context for why the feature works that way in the first place.
styroxmiekkasankari@reddit
Yeah not saying it’s not at all useful. Plenty of times it’s reminded me of something I forgot about or a regression after ”one last commit” type of deal.
DualActiveBridgeLLC@reddit
Yup. My company has a really bad habit of thinking that PoCs should become products. I can see the AI psychosis rolling around where now everyone thinks their low effort turd is the future of the company. The worst part is that poor processes and decision making are starting to be discussed as if LLMs will fix the issue. CEO has a shitty way of deciding what customer projects to select? No worries, LLMs will help with productivity. Terrible technical evaluations of effort? No worries, our pre-sales bot will come to the rescue. It is getting very concerning.
OHIO_PEEPS@reddit
You can use them to write tests but you need to totally separate code generation from testing. The tester only gets the contract and the intent of the function, written by YOU. Then you instruct it to TRY to write tests to expose bugs. And then you need another pass with a fresh context that takes those same tests and writes new ones specifically at the boundaries between each layer. Then take everything you possibly can and run mutation testing till you can get to 90% killed. I don't write code anymore really. I just plan architecture and write test plans.
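For readers who haven't met mutation testing: the idea is to inject small deliberate bugs ("mutants") into the code and count how many the test suite catches ("kills"). Below is a toy sketch using only the standard `ast` module. It is not how any particular mutation-testing tool works, and `clamp`, the two test suites and the single mutation operator (flipping `<` and `>` comparisons) are all illustrative assumptions.

```python
import ast

SOURCE = '''
def clamp(value, low, high):
    if value < low:
        return low
    if value > high:
        return high
    return value
'''

# The only mutation operator in this sketch: flip a comparison.
SWAPS = {ast.Lt: ast.GtE, ast.Gt: ast.LtE}


class SwapNthCompare(ast.NodeTransformer):
    """Mutate the n-th eligible comparison and leave the rest untouched."""

    def __init__(self, target_index):
        self.target_index = target_index
        self.seen = 0

    def visit_Compare(self, node):
        self.generic_visit(node)
        op_type = type(node.ops[0])
        if op_type in SWAPS:
            if self.seen == self.target_index:
                node.ops[0] = SWAPS[op_type]()
            self.seen += 1
        return node


def strong_suite(clamp):
    assert clamp(5, 0, 10) == 5
    assert clamp(-1, 0, 10) == 0
    assert clamp(11, 0, 10) == 10


def weak_suite(clamp):
    assert clamp(-1, 0, 10) == 0  # only ever exercises the lower bound


def mutation_score(source, suite):
    """Return (killed, total) for every single-comparison mutant of source."""
    total = sum(
        isinstance(n, ast.Compare) and type(n.ops[0]) in SWAPS
        for n in ast.walk(ast.parse(source))
    )
    killed = 0
    for index in range(total):
        tree = SwapNthCompare(index).visit(ast.parse(source))
        ast.fix_missing_locations(tree)
        namespace = {}
        exec(compile(tree, "<mutant>", "exec"), namespace)
        try:
            suite(namespace["clamp"])
        except Exception:
            killed += 1  # the suite noticed the injected bug
    return killed, total


print("strong suite:", mutation_score(SOURCE, strong_suite))
print("weak suite:  ", mutation_score(SOURCE, weak_suite))
```

A suite that leaves mutants alive is passing for reasons unrelated to the behaviour it claims to check, which is what a "90% killed" target is meant to flush out.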
roger_ducky@reddit
The reason for agents to generate tests is that without them, the agents start generating code that doesn’t work.
If you’re saying you’re writing the tests by hand first, that’s fair.
fuzzyFurryBunny@reddit
Yes, and the ppl at the top dont take the time to think. LLMs are inefficient. They seem amazing to guess the right thing but they are inherently inefficient if you actually account the resources used. We are doing the exact opposite of efficiency. Let's put our brains to not think and work the llm as hard as possible until it comes to the same conclusion we already know--and we have to know it to verify it.
Ppl are lazy now and instead of using the right direct tool they go to the llm that can't even deterministically do it right. Ppl rewriting the same small apps cause they don't know to look for the already tested open source project. Or using llms as file manipulators when we already have a lot of very fast direct tools.
Private companies that have such massive implications on the entire market should be limited in size. There should not be 1-2 giant private company that hides financials to allow all the lies they tell.
Spimflagon@reddit
I'm not saying it's not stupid, but I think it might be systemic stupid rather than perpetrated by your management.
See, your dev-boosting tools are great and all but the benefit to development then gets passed to clients, which improves sales, all great stuff but it's all very vague stuff, hard to track, hard to quantify - maybe you benefit and maybe you don't.
Whereas, you nail AI onto the company mission statement you get a 5% bump in share value. It's like adding ".com" to a company name in the 2000s; people rolled their eyes but that shit did actually pump share value.
mikelloSC@reddit
Because AI is FOMO for leadership
austinwiltshire@reddit
Because improved productivity is the present day compromise. The long term vision for folks buying AI is firing everyone they've always been jealous of for knowing how to make computers do things.
Your better IDEs and larger screens pitch never had that as an eventual promise, and so it'd never get approved. They don't want the productivity, they want the (in their minds) incremental move to the day when they can tell everyone to fuck off and use some sort of infinite money printer.
It doesn't make a lot of sense. But these folks were almost always in their positions of power because they just said they were entitled to be so, and there's enough enablers in society who seem to WANT bad bosses that we let them. They never were strategic geniuses.
Hziak@reddit
I had been evaluating some new tools for my support team a while back because we don’t have a good on-call management suite. Long story short, I got turned down at the pricetag. I told the sales guy who had been pitching me the one I wanted and he said he understood, but wanted to try one thing first. He ADDED a line item that we hadn’t discussed before for an AI pre-sorting module that we absolutely didn’t need and asked me to resubmit it. Overall, the price was about 20% higher, but the purchase was approved because AI was on it in big letters. To this day, we still haven’t even enabled the module, but we’re enjoying our new on-call tools.
Moral of the story seems to be “if you’re having trouble getting approved, add a bogus AI line item to the quote and your tone deaf, ignorant, managers will eat it up.”
darkstar3333@reddit
I mean did they say AI stood for "Artificial Intelligence" or "Actually Important".
Give it up to the sales guy on that one.
tersreply@reddit
It stands for "Accelerated Irresponsibility"
schmerg-uk@reddit
The depressing thing is you're right...
bapman23@reddit
What are those tools?
schmerg-uk@reddit
Specialised debugger, heap tracker, and profiling tools for ~5 million LOC of C++ development (quantitative finance maths library developed over 20 years and increasingly wrapped with Python code for on desk analytics in addition to being the one true source of pricing for all the in-house risk systems etc)
bapman23@reddit
I know a place where they bet on AI so hard that they froze hiring, because that's their strategy for the token price increases. Now they are left without a developer for a specific platform and nobody knows that platform. Guess it's gonna be fun for them to maintain and develop that product.
But hey, productivity is skyrocketing!
Lotus_Domino_Guy@reddit
The executives can point to their cost cutting as they get hired to go wreck a new company and no one will care about the shambles they left behind at the old company.
new2bay@reddit
Wow. Talk about cutting off your nose to spite your face.
abrandis@reddit
... And this is why I used to tell developers to do a better job at selling their needs, not just trying to use technical jargon and facts to explain things... The people who have purchasing authority often are non tech who are very susceptible to marketing hype
DigmonsDrill@reddit
"But what if we buy the $150 standing desk and you don't use it?"
Few-Impact3986@reddit
Been B2B saas most of my career. Customers always want a 'Do anything machine', but in reality what they need is solve this very specific problem machine. Humans are good at do anything, machines are the best at do 1 thing.
PrudentWolf@reddit
They want a tool to eliminate ~100 devs and their salaries for the price
darkstar3333@reddit
That's the dream but not the reality.
Value extraction is not a new concept, the more reliant you are on providers the more they can just gouge you.
Keep in mind those businesses often have customers who can/will stop payments if/when something breaks.
If you YOLO'd your organization into a handful of AI agents and pushed a product/update that had no one accountable, your customers will bankrupt you.
RelativeObligation88@reddit
They do but they will be gravely disappointed. In fact with a freeze on hiring juniors and less people getting into the industry for fear of long term stability, they are going to end up paying for the tokens and for higher salaries of engineers due to labour shortage.
johanneswelsch@reddit
read my reply: they want your DATA, so they can automate you away
damnburglar@reddit
As a lead who was ripped a new asshole only 6 years ago for authorizing a mouse purchase….same.
lucy_in_the_skyDrive@reddit
Dude my company paid $100k last month on Claude tokens. It's unbelievable. I guess at the end of the day it's still cheaper than paying multiple engineers
TempleDank@reddit
170K$ here
zeroconflicthere@reddit
The tables have turned. It was depressing for Western developers to see their jobs outsourced to third world countries
Efficient-number-one@reddit
First world and third world are not relevant concepts anymore, just FYI.
Source: me I lived in both "worlds" and they are not that different.
8004612286@reddit
Bro tf? A local living in Berlin has a VERY different life than a local in Cambodia.
L3monPi3@reddit
For sure he lived in the good part of a poor country
eloel-@reddit
Or vice versa
James20k@reddit
I often think this when it comes to the maths results, where AI was able to solve some maths problem. It's cool, but it cost $x trillion to get there
If you spent a trillion dollars on research - any research - we'd be able to solve pretty much any research problem you want. You could pay 1000 researchers in 100 fields a decent salary for their entire lifetimes, and still be left with roughly half a trillion dollars left over
The return on investment would likely be much higher as well
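A quick back-of-envelope check of that comparison. The trillion-dollar budget, the 1000 researchers per field, the 100 fields and the "half left over" all come from the comment; the 40-year career length is an assumption added here to turn a lifetime figure into a yearly one.

```python
BUDGET = 1_000_000_000_000          # one trillion dollars, from the comment
RESEARCHERS = 1_000 * 100           # 1000 researchers in each of 100 fields
SPENT = BUDGET / 2                  # "roughly half a trillion" is left over

lifetime_pay = SPENT / RESEARCHERS  # dollars per researcher over a whole career

CAREER_YEARS = 40                   # ASSUMPTION: not stated in the thread
yearly_pay = lifetime_pay / CAREER_YEARS

print(f"researchers funded: {RESEARCHERS:,}")
print(f"lifetime pay each:  ${lifetime_pay:,.0f}")
print(f"yearly pay each:    ${yearly_pay:,.0f} (over an assumed {CAREER_YEARS} years)")
```

Under those assumptions the figures hang together: half the budget covers 100,000 researchers at about $5M each over a career.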
quentech@reddit
Building datacenters isn't research. The only way AI cost $x trillion is if you include the cost of building datacenters.
G_Morgan@reddit
They are desperately hoping it will reach the point where it can be done without devs and then they can scale costs up and down as needed. It is why I've not remotely been worried about all this, it is a pipe dream.
RedditNotFreeSpeech@reddit
We recently blew through one of our monthly budgets in 5 hours when they accidentally started making more calls to ai than intended.
Megamygdala@reddit
We are told to increase our usage. Since the company is paying a lot of money for AI access, it's only profitable to them if we use a lot of it. Say 3000 developers at a F500 company used the majority of their GitHub or Claude plan: the AI company would easily lose hundreds of thousands to millions of dollars per year.
There's even an exploit on Github copilot (although it'll be patched soon) where you can trick the model into giving you unlimited tokens and just burn Microsoft's cash
pickledplumber@reddit
What is that trick?
Emotional_Papaya3282@reddit
Bro the amount of money in this field is overwhelming.
I worked on a platform team early in my career and was asked to promote FinOps practices to save some cash. I told my boss I was able to set up a monitor and reduce our VM costs. He barely reacted when I told him we'd be saving $2.5k a month.
The scale of finances at these companies is crazy. What is cheap to them vs expensive is absolutely crazy.
Exotic_eminence@reddit
Not for me - I make zero dollars ever since AI came out - and I am finally finished mourning my career - I see what it has become and I don’t miss it, I am doing so much more with my life now - it was holding me back for so long - I am happy to be free of the burdens that come with this career
CleanishSlater@reddit
What burdens? Development is one of the cushiest jobs in the world lmao
Exotic_eminence@reddit
iYKYK
CleanishSlater@reddit
Are you having a manic episode?
Exotic_eminence@reddit
I’m always happy and it’s a great way to repel ppl who are always butthurt
CleanishSlater@reddit
I hope you can code better than you can construct sentences, Christ.
eloel-@reddit
How low is your pay? Even the most ai-heavy workflows I see pay like $1000/month.
These developers must have been getting your yearly salary daily if what you're saying is true.
KayLovesPurple@reddit
You haven't seen people bragging on LinkedIn about spending tens of thousands in tokens, to the point where they themselves admitted an actual employee would have been cheaper. And they weren't referring to an employee from a cheap country either.
eloel-@reddit
I have not, I don't spend a lot of time on LinkedIn.
I'd be curious to see what they're doing with AI to get there though. I have some LLM or another running constantly during the work day, very often multiple, and I barely hit ~$500 a month. Are they just running 10 of them with the latest model and having them each build every context from scratch? Just spinning their wheels for no reason?
alchebyte@reddit
yes. but they are loud about it. so implications...
The_yulaow@reddit
they are spending more than even most others first world country salaries
5olArchitect@reddit
You absolutely get 50% productivity increase if compared to no LLM usage. What are you comparing against? Local LLMs?
qwertyshmerty@reddit
I have been waiting for the tipping point on AI and I think we are approaching it. It can really go 2 ways, companies stop buying into the koolaid and made up productivity numbers, and the usage will back off. Or, companies double down and lay off employees to cover the increasing costs.
Typically what happens is a FAANG (or whatever the acronym is now) makes a move that sets the precedent and everybody else follows suit. And unfortunately, Meta recently did that when they laid off 8000 to try and replace them with AI. I think the best thing we can hope for is this quickly reveals how important human engineers are, and that AI is a tool to be used but not a full replacement.
IceMichaelStorm@reddit
I just hope Meta fails. I don’t get how they make so much money anyway. But that’s just me
developerknight91@reddit
Fb will fail, but Instagram, Snap and WhatsApp will endure. Honestly if the current US administration changes some of its policies the best we can hope for is for Meta to get split apart.
FriendOfEvergreens@reddit
They have 90+% of the worlds wealth on instagram fb and WhatsApp every single day. Surely you realize those eyeballs are worth tens of billions in advertising
IceMichaelStorm@reddit
Yeah. I mean I don’t see ads on whatsapp and I rarely use IG. So I would guess I don’t pay them much.
oh and FB feels so dead to me. But already for 5 years or so
SecretaryAntique8603@reddit
I mean, having 8000 engineers to fire in the first place is completely absurd. What do meta even do that warrants that much work? Surely it can’t come as a surprise for any of them if they’ve been paying the slightest bit of attention. Because there is no way in hell these 8000 people are all solving meaningful problems and contributing actual knowledge.
But a more normal but large-size company with a few hundred engineers maximum probably cannot replicate that, because odds are a lot of those people actually know a few things about the business and contribute in ways beyond just pumping out tickets. Surely a few of them can get downsized, and a few companies will probably try to axe most of them and fail, but I think this is mainly a problem with super bloated organizations.
LambdaLambo@reddit
If you want to hear something crazy - zoom has 7,500 employees. For a product that is the same or worse than what we had in 2020.
People are not gonna wanna hear this, but our industry is way overemployed for what we produce.
AI is just an excuse to course correct from the vastly inflated employee counts.
developerknight91@reddit
I’ve been thinking this for years. The biggest possibility is that they are using AI as “one stone to kill two birds”: the FAANGs, for example, were heavily overinflated with employees.
When their valuations tanked they had to let go of people. And they’re hoping they can get some type of return from AI.
I don’t know if AI will ever get a return in its present state(as many have pointed out here it takes too much compute power to get decent return) but I highly doubt that even if it does tank they’ll hire back at the amount of people they have now.
I predict that what will happen is they’ll hire back fewer people at an inflated salary to fix the AI slop.
I do believe it can be possible to get AI to make your workflow a bit more efficient when it comes to boilerplate code - but there is no way they can replace our skillset.
Vibe coders and end users using vibe coding CAN NOT replace a seasoned Software Developer/Engineer. It’s just not happening.
Gold_Emotion_5064@reddit
What are you basing that assumption on?
LambdaLambo@reddit
Cases like Zoom, as I just said. Do we really need 7,500 highly paid workers for a product that is in no way better than it was 6 years ago?
Different-Train-3413@reddit
I think you’re grossly underestimating the complexities Zoom deals with
LambdaLambo@reddit
I think you're underestimating how many people 7,500 is. That's 750 ten person teams or 1,500 five person teams. And we're talking about a company that has had essentially flat growth for the last 5 years (low single digits revenue growth).
Gold_Emotion_5064@reddit
It’s really not that surprising, it’s an enterprise company serving millions of customers. My mid tier company has about the same amount of people. There’s a lot of work that goes into infrastructure at scale and the business operations side. It’s not all engineers.
LambdaLambo@reddit
Whatsapp had 55 employees and ~500m MAUs when it sold to facebook. Yes yes I know it's much easier to handle text vs video, and whatsapp had users whereas zoom has customers, but also we're talking about 100x fewer people.
Gold_Emotion_5064@reddit
And now it’s 3,000 and a much more mature company. As your company grows, so do the complexities of infrastructure and business operations. No one is saying you can run a successful software company with a few dozen people. But enterprise companies having thousands of employees is pretty common. It’s not that hard to understand.
LambdaLambo@reddit
Zoom was just an example of headcount being overinflated. You can add today's Whatsapp and Meta and many other companies to that list. I say this from personal experience as well. I've worked at big companies before and most of the work done is bs busy work.
Worldly-Pie-5210@reddit
I mean wasn't Instagram started by like 4 people or something? Most of this stuff is insanely inflated
phonage_aoi@reddit
You can also compare headcounts pre-Covid and post-AI.
Like Block was under 4k in 2019. Slashed 40% of their workforce cuz of “ai improvements” and is now sitting at… 6k
drew8311@reddit
What does meta do with all those employees? Pretty much just maintain facebook/instagram + misc new product attempts that don't go anywhere. Theres at least 500 employees dedicated to optimizing thirst trap algorithms I assume.
Annual_Negotiation44@reddit
8000 laid off ≠ 8000 laid off engineers, it was a mix of employees from many departments
TumanFig@reddit
a bit off topic: i get why meta has that many people I dont get why ebay has like 12k
TempleBarIsOverrated@reddit
Pretty funny to hear as we're selling on eBay for an average of 100M revenue with 2 developers.
vitaminMN@reddit
What?
eliquy@reddit
They burned however many billions on the abject failure of the Metaverse. AI is the excuse for cutting back on a hugely redundant workforce
I_dont_like_tomatoes@reddit
Idk why, maybe my cope but I think you’re right. I think the energy crisis is going to actually tip the scale so not every thing is AI dependent
08148694@reddit
They’re betting on token cost coming down exponentially as hardware improves, models become more efficient and data centres scale
If all 3 of those things improve linearly we will see AI costs come down very fast. I’m a bit sceptical but these labs have far better insider knowledge so who am I to make any predictions
_JaredVennett@reddit
I know what you’re saying…. but they’ll find some way to get as much as they can. As someone else said, yes, cost per token decreases, but now consumption of tokens increases…. as the finance bros would say, ‘creative accounting’
OverclockingUnicorn@reddit
Qwen 3.6 27B and 35B are very close to usable if you are good with the prompting and not working at the edges of what's possible - which most of us are let's face it. Those can just about run on <1mo salary for a mid-senior EU dev. Give it another 12-18 mo and a top end gaming GPU should be almost if not exactly equal to Opus (in theory of course....)
This is just the start of this technology and there is always an early adopters tax
kaeptnphlop@reddit
I’d call Qwen 3.6 27B on Claude Sonnet 4.5 level. I run it locally on a Strix Halo (128GB) with 128k context window and Q3.6 35B A3B with 2 64k ctx slots (another 128k). 27B analyzes, plans and orchestrates the execution of tasks from a plan file in subagents that use the 35B A3B model.
It’s slower than an API, but it feels very capable. Maybe because I optimized for token usage and it doesn’t need to chew through 16k tokens of instructions and tool definitions before it gets started.
skailrsays@reddit
Would be curious how this model fares for planning, design or overall project code reviews. I have been using Sonnet for dev and Opus for complex planning or architecture decision review and tradeoff analysis.
kaeptnphlop@reddit
Don't expect Opus level with a 27B model. I've got it running in a custom built Pi Coder harness and am pleased so far though.
OverclockingUnicorn@reddit
Imo, slowness isn't a problem, I'm happy to write out the prompts well, let it go plan overnight, return the next day, maybe do a bit more planning, then leave it going until the next day when I review the MRs. Repeat until done, just run this process a few times in parallel in a bunch of tickets/issues, then you'll have 5-10 MRs to review which is plenty
kaeptnphlop@reddit
It’s not that slow :)
With the recent addition of multi token prediction to llama.cpp 27B runs at 300-100tk/s PP and 25-10tk/s TG. 35A3 800-300tk/s and 60-40 tk/s.
Should work well with what you’re doing.
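To put those numbers in wall-clock terms, here is a back-of-envelope sketch for the 27B figures. The 16k prompt and 1,000-token reply are made-up example sizes, and it ignores prompt caching, which would cut the prompt-processing part on follow-up turns.

```python
def turn_seconds(prompt_tokens, output_tokens, pp_tps, tg_tps):
    """Seconds for one turn: prompt processing plus token generation."""
    return prompt_tokens / pp_tps + output_tokens / tg_tps

# 27B figures quoted above: 300-100 tk/s PP, 25-10 tk/s TG.
# Prompt and output sizes are hypothetical example values.
best = turn_seconds(16_000, 1_000, pp_tps=300, tg_tps=25)
worst = turn_seconds(16_000, 1_000, pp_tps=100, tg_tps=10)
print(f"best case:  {best:.0f} s")
print(f"worst case: {worst:.0f} s")
```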
halfercode@reddit
I'm interested in building my own local LLM PC. Have you any experience running Ollama etc on a good GPU? I am looking at something like RTX 4090 or maybe even RTX 4090. I've tried some smaller models in CPU-only mode, but I am wondering if a chunky graphics card might end up being disappointing compared to Claude/Cursor etc.
Sisaroth@reddit
This. I've been trying agentic coding for the first time since a week ago with local qwen3.6-35B. And I came to the opposite conclusion from OP.
The stuff it can do on my shitty hardware is impressive, so Claude can definitely offer better quality stuff that is still cost effective for them.
drew8311@reddit
Improving hardware gradually over time is a big cost issue as well because large data centers can become out of date quickly. There is huge growth in this area, and data centers were built recently and are currently being built; it's another huge cost if there is a minor hardware improvement 6-12 months from now when what they got recently was already pretty expensive.
Sprinkles_Objective@reddit
Well the reality now is they're selling AI at a major loss and usage based is just a bandaid on that. The reality is costs need to come down A LOT till they are going to break even.
Sen-dev@reddit
I don’t think the cost will come down, watching how the cost of living just goes up and up
familyknewmyusername@reddit
The cost of a static intelligence level is going down extremely fast. DeepSeek v4 flash is $0.10 in / $0.20 out per million tokens while performing better than GPT 5.0 which was $1.25 in / $10.00 out per million tokens, and state-of-the-art last August
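A quick sketch of what that gap means for a bill, using the per-million-token prices quoted above. The workload (10M tokens in, 2M out) is a made-up example, and the dictionary keys are just labels, not official model IDs.

```python
# Prices per million tokens as quoted above; the workload is hypothetical.
PRICES = {
    "deepseek-v4-flash": {"in": 0.10, "out": 0.20},
    "gpt-5.0": {"in": 1.25, "out": 10.00},
}

def cost_usd(model, input_mtok, output_mtok):
    """Cost in USD for a workload measured in millions of tokens."""
    p = PRICES[model]
    return input_mtok * p["in"] + output_mtok * p["out"]

cheap = cost_usd("deepseek-v4-flash", 10, 2)
pricey = cost_usd("gpt-5.0", 10, 2)
print(f"${cheap:.2f} vs ${pricey:.2f}, ratio {pricey / cheap:.1f}x")
```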
madtowneast@reddit
DeepSeek v4 Pro is 1.6 trillion parameters and needs about $800k in hardware to run.
familyknewmyusername@reddit
People are selling DeepSeek v4 Pro inference (profitably, not subsidised, I'm not counting deepseek themselves here) at $1.30 / $2.60 https://openrouter.ai/deepseek/deepseek-v4-pro
The $800k up-front cost doesn't matter when you can use a 3rd party API
madtowneast@reddit
Depends on the space you are working in. I can hear the lawyer/compliance alarm bells going off if you are using a Chinese model through a 3rd party vendor with no knowledge whether the model phones home.
UnbeliebteMeinung@reddit
I mean thats a monthly ai invoice for some companies. Doable.
Ok-Hospital-5076@reddit
That's not sustainable, Currently all companies are throwing money at AI without any growth in revenue. If they don't find real use for Agents & workflows they will stop spending that much on AI.
NatoBoram@reddit
Sustainable? If it's a one-time purchase, it's very much sustainable
madtowneast@reddit
Nothing absolutely nothing (okay maybe your coffin) is a one time purchase. Everything eventually needs replacement/maintenance or has OpEx associated with it.
madtowneast@reddit
At least from a security perspective, having an audit (even if 50% incorrect) is better than no audit at all. That is where most of our AI usage in prod is going.
UnbeliebteMeinung@reddit
Bro... we have real usage and we invested in hardware. It's running 24/7, we use it....
Whoever doesn't find real use for agents and stuff doesn't even buy this hardware lol.
Distinct_Bad_6276@reddit
Qwen 3.6 and Gemma 4 both outperform state of the art models from a year ago and I can run them on my PC at home.
madtowneast@reddit
Sure you can run them at home, but are you getting enough throughput to support 5, 10, 100 devs? Every dev now potentially needs a laptop, and a $3-10k workstation/GPU. Plus the time spent on care and feeding of the workstation/GPU.
The baseline for full DeepSeek v4 Pro is $500k (quantized) and $800k (full model). Now add the human effort, the power cost, scaling with more users, KV caching, etc. Sure in a year things will be much better given current trends, but this stuff isn’t cheap either and even the cost/benefit of frontier models is being questioned when you have a single person burning through $1M in tokens a month without accomplishing the same as equivalent amount of people.
Note I am bullish on AI, I just think LLMs are not where the impact will come from.
maxintos@reddit
You're thinking way too small. If there was demand there would be plenty of new companies that would focus purely on hosting and selling open model services. They could reduce costs by focusing on scale while not having to spend any money on massive engineering and research teams like the frontier model companies are doing.
Distinct_Bad_6276@reddit
It’s extremely hard to break into a commodity industry once your competitors have already established economies of scale, which is what the existing hyperscalers are so good at
maxintos@reddit
Not if your competitors are spending half of their money on training new models and researchers.
Distinct_Bad_6276@reddit
The only hyperscaler doing that is Google
oursland@reddit
This is a standard setup for a developer that is paid $10/mo and more.
Drauren@reddit
The reality is that cost is cheap for most people. Another 3K on a GPU setup isn't that big.
Distinct_Bad_6276@reddit
Now you understand why Nvidia stock is so high
a_slay_nub@reddit
Depends on the throughput you need but you can get a respectable setup to run that for 200-300k. The model is pre-quantized to FP4 so you "only" need about 1TB of VRAM to run it. Could easily fit that in 8xH200
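Rough memory math behind that, as a sketch: the 1.6T parameter count is the figure quoted upthread, and 141 GB per H200 is the spec-sheet number rather than anything from this thread. It only counts weights; KV cache and activations have to fit in whatever is left.

```python
PARAMS = 1.6e12       # parameter count quoted upthread
BITS_PER_PARAM = 4    # FP4 weights
H200_GB = 141         # HBM per H200 (spec-sheet figure, an outside assumption)

weights_gb = PARAMS * BITS_PER_PARAM / 8 / 1e9   # bits -> bytes -> GB
total_gb = 8 * H200_GB
print(f"weights: {weights_gb:.0f} GB")
print(f"8xH200:  {total_gb} GB")
print(f"left for KV cache and activations: {total_gb - weights_gb:.0f} GB")
```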
madtowneast@reddit
I would assume that you want to deploy this for more than a couple people to use. Just the sticker price on the last 8xH200 box I saw was like $400k.
08148694@reddit
Cost of technology has historically been coming down rapidly, it’s a fallacy to compare general cost of living (fuel, food, housing etc) to technology costs which have been historically very deflationary
Compare the capabilities of a modern GPU to the capabilities of a GPU from 5 years ago (at the same real terms cost) - the one from today is in a whole other league
scott2449@reddit
But that hasn't been happening. Pretty much stalled after the pandemic. Compute power has been increasing but cost is scaling linearly. Even the second hand market is nuts w/ my personal computers costing the same as the day I bought them.
Void-kun@reddit
The SSD I bought 3 years ago for $250 is now retailing for $1000. Samsung 870 EVO 4TB.
Hardware prices are legitimately insane right now
quentech@reddit
It's been 6 months I haven't been able to stomach buying any drives or RAM, and I really need a bunch of both.
entropyofdays@reddit
lol most people are running their local inference setups on GPUs from 5 years ago. The local inference community is built on the backs of 3090s.
Sen-dev@reddit
Maybe comparing with housing, fuel.. you are right. But compare with other technology companies like Netflix, PlayStation, Xbox, all increasing their subscription prices. Companies are greedy, they will increase their own profits whenever they can
raven_785@reddit
This is a subreddit for experienced devs. You are a bot or a LARPing teenager. Your comment makes zero sense.
ClydePossumfoot@reddit
Those are not even comparable things.
ReferentiallySeethru@reddit
According to Michael Burry (from The Big Short), the limiting factor will become energy costs. According to his thesis there's not enough energy capacity to meet the needs of all this AI expansion. He focuses on how the depreciation schedule hyperscalers are disclosing in their filings won’t be possible since the older chips just won’t be efficient enough to keep in use. If his thesis is correct (he’s been wrong before), then I imagine that’d result in the costs plateauing until that energy capacity comes online.
call-the-wizards@reddit
This isn't just wrong, it's not even consistent because it's mixing up units. It's like saying we'll run out of tacos because there aren't enough photons.
There's no such thing as a unit amount of energy per compute, and no such thing as a unit amount of cost per energy. Both of these things are just dependent on technology which constantly improves. The only consistent form of this argument would be something based on physics first principles, like the Landauer limit or the Bekenstein bound, but we are very very very very far from any of these limits. There's nothing to indicate the pace of hardware improvement is slowing down, and in fact all the indications seem to be it's actually picking up.
Burry is good at picking stocks, he's clueless about computers and physics.
ReferentiallySeethru@reddit
It doesn’t sound like you’re understanding the argument. He’s not saying the pace of hardware improvement is slowing down, he’s arguing that energy constraints - that is from amount of capacity in the power grid - will make the energy costs too high to use on older, less efficient chips.
These companies have bought billions in AI chips over the last couple years and they’re estimating that they’ll be able to use these chips for up to 6 years. In doing so they’re booking depreciation losses on those chips over a longer time period, reducing those losses per year. He’s saying these old chips won’t be efficient enough compared to the latest chips to keep them running since there’s a finite capacity both physically and in terms of energy use.
So as energy costs continue to rise due to the power grid not having capacity, those older chips wont be economical to keep in use relative to the latest and more efficient chips on the market. That means they’ll have to book losses on those chips earlier than they’ve estimated which completely changes their accounting.
dweezil22@reddit
One of the silver linings of this AI Slop is that we'll probably get efficient and affordable small modular nuclear reactors in the US in my lifetime now, once that happens I don't think energy will be a bottleneck anymore. TL;DL Nuclear has really never had a proper chance to benefit from economies of scale and modern tech and now it's probably going to b/c we've exhausted all other options.
Now it might be a decade for that to hit, so in the meantime it should get interesting.
gravteck@reddit
In what sense is he considering depreciation? Is he saying they are booking an asset valuation on an optimistic curve? Or is he saying they are ignoring future capital outlays for new hardware? One is an accounting aspect of valuation, the other is capital forecasting. Those can mean very different things for overall financial health.
ReferentiallySeethru@reddit
He's calling out the optimistic valuation curve, but this could lead to future capital outlays not being properly forecasted.
Specifically, he's saying hyperscalers are stretching the chip costs out to 6 years but he argues they won't last much more than 3 because the inefficiencies in the older chips combined with the bottleneck in energy capacity won't make them economical for continued use.
gravteck@reddit
Jesus Christ. The reason I ask the question is because I just finished the Enron book, and this is the same shit rinse and repeat.
For those not in the know: Enron used an accounting process called mark-to-market accounting. They would "reevaluate" the value of their current assets to market price (they decided whatever they wanted it to be), and they could use that as collateral for loans. Sometimes that was capital outlays, sometimes it was just for the balance sheet, but it was certainly meant to disguise things in favor of the book makers.
ReferentiallySeethru@reddit
I don't think it's nearly as bad as Enron; they were booking estimated profits for projects, that could sometimes take decades, before they had even broken ground. If those projects never came to fruition, they'd hide the losses in fake companies (see this wild chart).
Hyperscalers have already spent the money on these chips, they're just trying to stretch out how they account for the losses. Instead of forecasting $2-3 billion per year for depreciation, they're forecasting $1 billion (for instance) per year.
Enron was doing straight up accounting fraud, while what the hyperscalers are doing is more akin to financial engineering. Not great, but also not illegal.
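For anyone who wants the arithmetic, here is a minimal straight-line depreciation sketch. The $6B capex is a hypothetical number picked so the output lines up with the $1 billion vs $2-3 billion example above; it is not from any filing.

```python
def annual_depreciation(capex_usd, useful_life_years):
    """Straight-line depreciation: the same expense is booked every year."""
    return capex_usd / useful_life_years

CAPEX = 6e9  # hypothetical chip spend, chosen to match the figures above
for years in (2, 3, 6):
    per_year = annual_depreciation(CAPEX, years)
    print(f"{years}-year life: ${per_year / 1e9:.1f}B per year")
```

Same cash out the door either way; the longer schedule just makes each year's reported expense smaller.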
gravteck@reddit
Yea I was a bit demonstrative, but I am very leery of this accounting chicanery when you add the variable of the circular financing and PE involvement.
That being said, I doubt they have the same problems when it comes to a philosophy of disclosure, so that's probably good at least. I worry about leverage more than fervent fraud. I especially worry about leverage with the PE firms who are behaving like banks but not regulated like them.
8004612286@reddit
Same guy that said index funds were subprime mortgages btw.
ReferentiallySeethru@reddit
Don't get me wrong, he's a bit alarmist, but he also has a point about index funds. It's made it a lot more difficult to get proper price discovery when so many people just buy the index, and we're somewhat seeing that today with how the market doesn't really appear to be reacting much to worsening economic conditions.
spez_eats_nazi_ass@reddit
The big D in EBITDA - the bullshit number these companies use - is the real killer. Always fucking always assume when someone starts talking about profitability in EBITDA terms they are actually insolvent.
EvilTables@reddit
The price is also heavily subsidized at the moment. So it would have to come down quite a lot for AI companies to even make a profit
Information_High@reddit
> They’re betting on token cost coming down exponentially as hardware improves, models become more efficient and data centres scale
When in the history of modern capitalism have decreased costs been meaningfully passed to customers?
Corporate shareholders and executives have *very* sticky fingers, and will fight tooth and nail to keep those increasing profit margins for themselves.
Anyone drooling over the thought of huge bargains on token prices is likely to be... disappointed.
fsk@reddit
When there is actual real competition. The big AI firms don't have a monopoly. They're also competing with free open source models you can run on a desktop. Given those two factors, I don't see how the big AI firms can ever make the revenue they need to justify the money they raised from investors.
For example, look at the price of a computer. Every year, the cost decreases.
ReachingForVega@reddit
Inference is cheap, what's not cheap is nvidia and their GPUs. China has some AI cards coming out in 2027 that should hopefully reduce the cost of training. Innovation should change the architecture.
bigorangemachine@reddit
I honestly don't think the hardware has room to grow. All they've done really is make bigger GPUs
Congenital-Optimist@reddit
Nvidia Rubin architecture should become available later this year. It promises 2.5x better compute per watt compared to the current Blackwell architecture. Feynman, which will arrive after that, promises an even bigger improvement. There are plenty of improvements in hardware left.
thecrius@reddit
And it will be cheaper because of the low demand right? Right?
And even if so, can you name me an example of an industry in which something cheaper comes along and the major actors reduce their prices?
Exactly.
Congenital-Optimist@reddit
> And even if so, can you name me an example of an industry in which something cheaper comes along and the major actors reduce their prices?
Yes. History of computing has been a consistent trend of lower costs per compute.
But if it helps with your anxiety:
1. The hardware costs are dropping and will keep dropping. Current estimates are that in 5-6 years you can run the current SOTA models at home on your own physical hardware. So no need for subscriptions or dependencies on cloud providers.
Now, the models are still improving. We haven't hit the wall yet (but there is a decent chance that at some point the improvement stops and all the models level off at the same competence level). We might be hitting the long tail with the training right now, where 2-3% improvement in performance might need a 2-3x increased costs in training and inference. There are increasingly signs that LLM models are becoming commodified and the moat is in tooling instead. When Opus 4.8 increases token pricing by 50% you can just select Codex 5.6 in the dropdown and have similar or close enough performance that you don't care.
While LLMs are here to stay, I wouldn't worry too much about one single company dominating the industry. The competition and existence of open weight models makes that impossible.
bigorangemachine@reddit
Sure but is it adding more VRAM at higher or similar speeds.
From what I understand it's the plumbing/interconnect that's hitting the limits now. They can't get the wires any smaller. My understanding is that's the limitation now, and going bigger is just a work-around; they can't keep going to that well
neuronexmachina@reddit
Both GPU and TPU performance-per-watt have been improving exponentially over time, kind of like Moore's Law.
quentech@reddit
Only because we're still in the first few years. Look to cryptocurrency miners if you want to see how fast new hardware progresses up to the latest in process node tech. Couple more years and GPU/TPU will be progressing just as slowly as desktop & server hardware does these days.
HayatoKongo@reddit
There are already options that can be run locally for no more than the cost of buying a GPU one-time and the electricity needed to run it. The open-source options will only increase and improve from here. Deepseek v4 Pro has also permanently slashed their price by 75%. A lot of companies will be fine with using Chinese inference unless they're in a security sensitive industry.
MapLarge614@reddit
I'd argue that US models are a problem for many countries as well.
mattcrwi@reddit
moores law is dead. the cost aint coming down exponentially
cbunn81@reddit
Maybe, but these companies have already invested hundreds of billions of dollars. At some point investors/shareholders are going to want a return.
ReactionJifs@reddit
"They’re betting on token cost coming down exponentially as hardware improves"
much in the same way that lower oil prices results in cheaper gasoline, right? 👀
LittleLordFuckleroy1@reddit
Lmao the labs is who you’re trusting? They were betting on ASI and world domination. That was the big gamble. Turns out that was all a load of fucking bullshit, now they’re caught with their pants down.
The entire game now is them scrambling to keep hype around their products to allow for enough money to subsidize usage and hopefully get a huge swathe of the economy hopelessly coupled with their slop machines. Then they can tighten the belt and actually turn a profit.
They’re hemorrhaging billions right now and this is real money that investors expect back. Obviously these “labs” are going to be spinning yarn about how it’s all going to work out in the end. Don’t be so naive.
Dumb_Dick_Sandwich@reddit
And frameworks and kits will burn even more tokens, because they can. Token usage, like memory usage, is like a gas: it expands to fill the space
Ahchuu@reddit
Yup, I agree cost will go down dramatically. It's possible to run LLMs locally, but it's still expensive to do and you don't get the performance. As the hardware improves it will become much easier and in 5 years it will be so much faster. This is why I don't understand all the people who think programming will "go back to the old ways". I think those days are done. I think the future is self hosting your own LLM + custom harnesses.
iostack@reddit
Current models might become faster but newer will always use a shitload of hardware
Ahchuu@reddit
Maybe, time will tell if that is true. You likely will need a crazy shitload of hardware for training, but inference on large models will become cheaper and likely will hit a point that they can easily be run at home.
Agitated_Marzipan371@reddit
These are racing against: investor subsidizing and circular funding becoming unsustainable. Expectation to make a profit. Sustainability to offer paid models for something that may become a ubiquitous resource
FauxLearningMachine@reddit
Could you please explain the math to me how you think 3 things improving linearly could result in exponential cost decrease?
eastcoastblaze@reddit
The problem is as hardware improves their capex spend still goes up, as they have to buy the new hardware and then actually train the new models, which is another massive capex expenditure
Suspicious-Art-7852@reddit
Nah that's naive. They're betting on raising prices 100x. You don't do 100s of billions in bond offerings on "hope" that costs will go down in the future (once you already spent your bond holders money, lmfao). You do it when you know you can hook people and keep raising prices to get to 80%+ profit margins.
Valuations are expectations. Costs going down will not make the kind of money for these expectations, especially when the money's already spent.
Costs can't even go down, because land, power, water and maintenance DWARF the cost of the GPUs.
NotMyRealNameObv@reddit
My bet is that all the major AI companies are burning through venture capital trying to get companies locked-in so they can jack up the prices.
MiniGiantSpaceHams@reddit
Tech always gets cheaper. We have a temporary reversal in upfront hardware costs due to shortages that are caused by the sudden unplanned demand, but those will ease in time, and that only impacts the purchase cost anyways. And even with that, we're getting new hardware that is better optimized for AI loads and software optimizations that bring down the cost to run it. Those things will continue to happen, and will eventually make local AI accessible as well.
I also think we get to the point where SOTA models won't be required for a lot of tasks. You could argue we're already at that point to some extent, but it will just continue. If you're writing a CRUD app (like most are) then you can probably already get almost the same quality output from Haiku vs Opus. Or you could even switch over to a cheap Chinese model.
People act like this is a mature industry, but we're only ~3.5 years since the original ChatGPT launch, and really only ~1.5 years since AI started becoming really useful (when o1 launched). We're in the infancy of this tech.
steampowrd@reddit
The cost of influence has gone down by a factor of 10 every year historically until recently. It is going down, not due to cheaper chips though. It goes down due to better models better networking better structure in the data center etc.
Watchguyraffle1@reddit
What a great typo.
tenthousandants44@reddit
If hardware improves, they have to rebuild all those data centers
hikingsticks@reddit
The cost per token is decreasing dramatically, but the number of tokens consumed is increasing at a similar rate. all the reasoning and back and forth internal thoughts consume tokens.
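In other words the bill is price times volume, so the two trends can cancel out. A toy sketch with made-up numbers (not real pricing or usage data):

```python
def monthly_spend(price_per_mtok, mtok_used):
    """Bill in USD: price per million tokens times millions of tokens used."""
    return price_per_mtok * mtok_used

# Hypothetical before/after: price drops 10x, token consumption grows 10x.
before = monthly_spend(price_per_mtok=10.0, mtok_used=50)
after = monthly_spend(price_per_mtok=1.0, mtok_used=500)
print(before, after)
```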
potatolicious@reddit
For SOTA stuff sure, but the cost of an “Opus 4.5 class model doing Opus 4.5 things” is decreasing fast. The reckoning isn’t if anyone will use this stuff daily (they will) but that they will accept using good-enough models that are cheap to run rather than throwing the frontier top end models at everything.
IceMichaelStorm@reddit
Since when do machines become more capable? Last 5 years almost nothing happened, right? Moore’s law is dead, too. Prices even increase!
_JaredVennett@reddit
your last paragraph… very smart, will try that
_JaredVennett@reddit
Similar to cloud right…. in the early days every ceo screaming to move to cloud, getting rid of on-premise DBA’s….’cos cloud can manage it all’ …then being shell shocked at the bill … buh buh we thought cloud would be cheaper…. Naah, all that happened was you moved your DBA over to your cloud provider… and while not having to deal with local infra depreciation you’re still paying to cover the depreciation for your cloud provider’s infra…. Hilarious. The cloud products are more convenient tho, one can very quickly find the true costs via dashboards etc… while still small there is a revolt happening where customers are moving off cloud to VPS with something like DokPloy for deployments.
walterbanana@reddit
The AI companies were making 2 bets:
What is odd is that they mostly just made bigger models that are less efficient. Now investors are pulling back, because neither of these assumptions is completely true yet.
Whitchorence@reddit
The prices are also not really broken down and I think it's very likely the case that the cost is mostly in training and not inference.
theawesomew@reddit
I have spent an ungodly amount of time attempting to model the profitability of running large language models (using DeepSeek V4 Pro 1.6T A49B as my proxy for 'frontier models' as a class) and the modelling I have done suggests that inference, purely considering power consumption and GPU asset depreciation, might be profitable at the current pricing assuming 100% paid utilisation of GPU assets at all times.
The biggest cost by far, and the one which has been subject to the most accounting mendacity, is the depreciation of the GPU hardware itself. Hyperscalers have extended their depreciation schedules of their GPU assets, servers, and networking hardware to 66-72 months when it has historically been 24-36 months; a number corroborated by cryptocurrency miners who are the only other 'industry' to use GPUs at such high wattage and utilisation & who are inherently reliant on their GPUs being maximally efficient for the longest period possible (cryptocurrency miners use tricks like underclocking and under-volting to extend the useful life of GPUs to 4 years but, this increases latency and TTFT in a way that would be unacceptable for most LLM providers/users)
These GPUs are run at very high utilisation and very close to their TDP (Thermal Design Power) which, if you're familiar with Black's formula, increases the rate of electromigration in the silicon itself, massively reducing the MTTF (Median Time To Failure) because that quantity is inversely proportional to the current density squared [in the case of GPUs]. This is worsened by NVIDIA relentlessly releasing newer, more energy-efficient GPUs which accelerates the technological obsolescence of the GPU hardware, meaning that the economic depreciation timelines are closer to 18-36 months (because competitors using Blackwell, for example, can provide more compute at lower prices than a provider using Hopper GPUs which pushes the cost of compute lower than the depreciation of the Hopper GPU itself and its electricity usage). This means that the huge capital expenditures on GPU hardware have to be repeated far more frequently than most people realize, which is far & away the most brutal cost for AI laboratories/neo-clouds/datacenter operators. It's more like an ongoing cost than "infrastructure" as AI laboratories continuously refer to it; trying to get you to imagine a bridge or road which lasts tens or hundreds of years.
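For reference, Black's equation is usually written as MTTF = A * J^-n * exp(Ea / (k*T)), with J the current density and T the temperature. A minimal sketch of the relative effect, where n = 2 matches the inverse-square claim above and Ea = 0.7 eV is purely an illustrative assumption, not a measured value for any GPU:

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K

def relative_mttf(j_ratio, temp_k, ref_temp_k, n=2.0, ea_ev=0.7):
    """MTTF relative to a reference operating point, per Black's equation.

    MTTF = A * J**-n * exp(Ea / (k*T)); the constant A cancels in the ratio.
    n=2 and Ea=0.7 eV are illustrative assumptions, not measured GPU values.
    """
    current_term = j_ratio ** -n
    thermal_term = math.exp(ea_ev / K_B_EV * (1 / temp_k - 1 / ref_temp_k))
    return current_term * thermal_term

# 20% more current density at the same temperature
print(round(relative_mttf(1.2, 350.0, 350.0), 3))
# same current density, 10 K hotter
print(round(relative_mttf(1.0, 360.0, 350.0), 3))
```

With n = 2, a 20% bump in current density alone cuts expected life to about 69% of the reference, before any temperature effect.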
AndyDentPerth@reddit
I think you are possibly confusing depreciation schedules with consumption.
Actual tax depreciation occurs over 5 years with USA IRS rules.
If you burn out something in 3 years you still can’t claim it immediately.
At scale, this may become significant, but I suspect they are all assuming continued injections of VC $$ to fill the gap.
Whitchorence@reddit
That's a good point; thanks for a thoughtful post.
CowBoyDanIndie@reddit
They forgot the 3rd option, open models will be good enough and cost a fraction of the price
Dziadzios@reddit
They didn't - in order to sabotage it, they've bought all of the best GPUs, RAM and VRAM, so customer market will get overpriced crumbs.
S0n_0f_Anarchy@reddit
But they did. Okay, they managed to raise the prices of hardware, but for how long will that last? They did that on a bet that they will succeed, but if they don't, they can't keep burning money on that, and who will buy the hardware then? Prices will have to go down
Dziadzios@reddit
I don't know how long, but I know how it's going to end. With pressure from Chinese competition which will catch up to Nvidia's complacency.
DeadLolipop@reddit
I don't know about you but ai stocks are still rising. What is this pulling back you smoking on
walterbanana@reddit
Microsoft and Google adjusted their targets in regards to building new datacenters for AI because big investors were getting worried.
DWLlama@reddit
The bigger, less efficient models are better at convincing users that the models in general can do what they're hyped up to do.
walterbanana@reddit
This frustrates me. It is a language model, it is good with language and sucks at everything else. It doesn't have to be applied everywhere.
LambdaLambo@reddit
Their first goal is AGI/ASI. Hence the lack of focus on efficiency
walterbanana@reddit
That is a bullshit goal
namuan@reddit
Can you try the same experiment with DeepSeek4 Pro / Flash models?
You may need to guide them a bit, but in my experience, they've replaced frontier models for a fraction of the price.
They are also open, so I think large companies will look into internal hosting if cost becomes an issue in the long run
bitspace@reddit
People are going to learn to not rock Opus 4.7 for document summarization
Neeerp@reddit
My company has always been using api pricing and I personally have had weeks where I’ve personally spent nearly $2000… you underestimate how much big corporations are willing to shovel into the fire
seven_seacat@reddit
Small companies too. The record within my company was $6K in one week
aphantasus@reddit
How on earth is that supposed to work? I live in an EU country and I can tell you that we devs here are not earning that much here.
Strus@reddit
Most big enterprises already pay per token. It was like that at the very beginning at my job.
On the other hand I don’t know what people do with their agents if they spend thousands of dollars per month. I haven’t written a single line of code manually since January and my monthly bill never exceeded $400. But maybe that’s because when I use agents I already know exactly what needs to be done, and the agent just executes my vision. So there is little back-and-forth or code that needs to be thrown away.
krzyk@reddit
My $400 usage (estimated from copilot) was about 80 sessions of some work, but I do write code, and use it more like a help sometimes with finding hard to track bugs or refactoring.
mxldevs@reddit
Someone's going to create a token burn to feature shipped metric and we're all going to have to start putting it on our resumes to prove that we aren't expensive lol
Whitchorence@reddit
I mean, that's just as gameable as "token burn" as a measure of productivity though. All my feature tickets would get really, really tightly scoped.
Tcamis01@reddit
I'm in the thousands. We're starting to crack down so it will change but these things will raise your bill:
- expensive models
- multi tasking. Im frequently running 3 completely separate tasks at once. Like different repos / completely different subject matter. Whether I should be doing this is another story but we're a skeleton crew after massive layoffs
- autonomous complex, unbounded workflows. Again probably not the best idea and falls somewhat under the below notes about poor planning. One such agent analyzes implementations across a dozen repos / layers (because no one understands the full system) and can autonomously create docs / gather context for system wide refactoring, etc. Another autonomously ports legacy SDK features to a modern SDK as they are implemented.
This has all been mostly exploratory and probably not sustainable unless token prices come down.
Whitchorence@reddit
You can save a lot by only using the advanced models for the planning... well, sometimes you find that isn't working, but most of the time it works pretty well.
Whitchorence@reddit
I was under $400 but it started going up because I realized I could really work on 3 or 4 unrelated tickets at a time with worktrees and could also make the AI do a lot of tedious ticket and Wiki janitoring. But still I haven't gotten into the thousands.
geft@reddit
I spent $400 using 95% Sonnet. Rarely wrote code manually unless it's a one liner but I can easily exceed $1k if I had the budget, especially if I start using Opus lol. But then I also had a lot of stuff in the pipeline.
pwnrzero@reddit
I use Sonnet 4.5, 4.6 at work. Opus is restricted from being used in agent workflows due to the costs.
They also limited rollout to only the most overburdened devs that could benefit. Kind of smart actually. I've been using it for dirty scripts that need turnaround in a day or even by that afternoon.
coworker@reddit
What I've personally noticed is that our weaker engineers burn tokens performing implementation during the design phase. They are unable to think at higher levels so they need the agent to actually build the solution before they can determine if it's the right one. This wastes huge amounts of time and tokens versus actually planning first. Our better engineers pick apart the plan and discuss with others before any code is actually touched.
Mundane-Charge-1900@reddit
This is something I've noticed as well. I've been moving much more to using the more expensive models to do planning in a more synchronous, hands on mode to get tickets created and a good, detailed plan assembled. Then I can use cheaper models run in parallel and less monitored to implement the details.
We have an internal dashboard at my job that shows how much coding agent spend everyone in the company has incurred. It's wild to see the highest spenders are either super high level principal engineers or new grads. They're either doing some massive, cross cutting change or they're prompting the agent in super inefficient ways.
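A rough sketch of why that split saves money, in Python. The model names, prices and token counts below are made up for illustration and are not any vendor's real price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    usd_per_million_tokens: float  # hypothetical blended price, not a real rate


def task_cost(plan_tokens: int, impl_tokens: int,
              planner: Model, implementer: Model) -> float:
    """Cost in USD of one task when planning and implementation use different models."""
    total = (plan_tokens * planner.usd_per_million_tokens
             + impl_tokens * implementer.usd_per_million_tokens)
    return total / 1_000_000


# Hypothetical models and workload: a short planning phase, a long implementation phase.
BIG = Model("expensive-planner", 15.0)
SMALL = Model("cheap-implementer", 3.0)

if __name__ == "__main__":
    everything_on_big = task_cost(200_000, 2_000_000, BIG, BIG)
    split = task_cost(200_000, 2_000_000, BIG, SMALL)
    print(f"all on expensive model: ${everything_on_big:.2f}")
    print(f"plan expensive, implement cheap: ${split:.2f}")
```

Under these assumed numbers most of the tokens are burned during implementation, so moving only that phase to the cheaper model removes most of the bill.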
CrusTyJeanZz@reddit
Perhaps you should do a lunch and learn or something to try and teach the “weaker” engineers how to optimize token usage by planning before implementing. I doubt that they’re unable to think at higher levels. Unless either they’re fresh junior engineers or your department is bad at hiring.
coworker@reddit
I'm a principal. I've already done this and attempted to mentor several. Not everyone admits they have a problem especially when they think they know better (and velocity metrics say they are doing fine).
But still their reckoning day will come when tokens get more expensive
__idkmybffjill__@reddit
A principal engineer talking about "weaker" devs "incapable of higher level thinking", on reddit
You sound like a dickhead
coworker@reddit
And you sound like a weaker engineer who refuses to accept criticism
:)
BrewerAndHalosFan@reddit
We are encouraged to build then refine. It feels so gross.
coworker@reddit
Don't worry that will change as token prices increase. That approach is basically brute forcing solutions which can save time when agents are doing it in the background. Once the subsidies end, engineers will be expected to subsidize with human tokens again
ShoulderIllustrious@reddit
Same here, I use an opencode sub and I barely hit the hourly limits, like maybe 60% is the highest I've hit per hour.
I do spend a ton of time writing high level and low level design info, also add information about constraints for each feature. I like this kind of style for the little things, it really forces you to have your idea on paper literally. It helps weed out the stupid ideas.
niveknyc@reddit
I'm already hitting $2-5k a month on Claude, most of which is used for evaluating, documentation, and generating requirements docs. It's been a force multiplier so my company has no problem with that additional cost. However I agree, these AI companies spent BILLIONS on this shit; very soon the cost to play with them will go up exponentially. Enshittification will strike.
TitanTowel@reddit
Are you just letting it edit without manual approval? I'm a power user in my company but I'm only a 10th of your usage...
krzyk@reddit
Yeah, I used 80 (out of 300) reqs and it was around $400 if it would be in the new token based pricing. My code reviewer uses 300 out of 300 most of the time, and the cost per month in tokens would be $6000 - $8000.
allllusernamestaken@reddit
i think part of it is people are using it wrong.
if you never close a Claude session, your bill spikes because your conversation is used as context. So if you've had your terminal open for 10 days as your "random questions" terminal, you have 10 days of context in there. That balloons your usage.
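A minimal sketch of that effect, assuming every turn resends the whole conversation as input, a fixed number of new tokens per turn, and no prompt caching or context compaction (real tools may do both, so treat this as an upper-bound illustration).

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens sent if each turn resends the full history so far."""
    # Turn k carries k * tokens_per_turn tokens of context (history plus the new message).
    return sum(k * tokens_per_turn for k in range(1, turns + 1))


def restarted_sessions_input_tokens(turns: int, tokens_per_turn: int,
                                    session_len: int) -> int:
    """Same number of turns, but a fresh session is started every `session_len` turns."""
    full_sessions, leftover = divmod(turns, session_len)
    return (full_sessions * cumulative_input_tokens(session_len, tokens_per_turn)
            + cumulative_input_tokens(leftover, tokens_per_turn))


if __name__ == "__main__":
    # Hypothetical workload: 200 turns of roughly 1,000 new tokens each.
    print(cumulative_input_tokens(200, 1_000))              # one never-closed session
    print(restarted_sessions_input_tokens(200, 1_000, 20))  # restart every 20 turns
```

Under these assumptions the never-closed session sends about ten times as many input tokens for the same 200 turns, because the total grows roughly with the square of the session length.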
Less-Bite@reddit
You're not a power user
Whitchorence@reddit
Damn $5k as a single person is quite a lot. Are you actually reviewing all that?
Annual_Negotiation44@reddit
So will this bubble actually burst and the corporate Pavlov dog c-suites will actually back down on AI euphoria/implementation?
niveknyc@reddit
Only time will tell. My take is yes, very much so. C-suites NEED AI to be IT right now because nobody wants to believe the truth - we're in a massive recession that's going to blow up in everyone's faces when the AI economic bubble pops.
Icy_Accident2769@reddit
Compare this to us. I get like 300 GitHub copilot tokens a month. Which roughly translates to 100 Opus 4.6 SESSIONS. Keyword is sessions. This cost us 60 euro.
With smart usage, meaning not restarting sessions till they deteriorate, I’m able to crunch out so fucking many tokens it’s insane. Until I start a new session I don’t lose those tokens.
There is no way me, and my colleagues, only use 60 euro in AI costs. So they are heavily subsidising it for now.
Type-21@reddit
We have the Copilot Business Plan which is 20 usd and I used GitHub's cost estimation tool for token based billing and the cost would only go up to 60 usd per seat. And I edited or added around 100k lines with copilot last month.
spez_eats_nazi_ass@reddit
You will be getting a surprise in the form of a gaped butthole on July 1 if you keep that rate up. Why i know a lot of people w copilot are burning it hard last week of may. I have a shit ton of porting work im using it for - big angularjs codebase need to migrate. Which is the kind of work these things are made for.
meltedmantis@reddit
That ends june first,...
Type-21@reddit
Reading comprehension
meltedmantis@reddit
Hit reply on wrong comment apologies lol
Downtown-Pear-6509@reddit
thats not much at all mine goes $10 to $1500
Type-21@reddit
Yeah we put work into optimizing our codebases for ai so that it can work more efficiently.
meltedmantis@reddit
That ends june first,...
Significant_Ad_8032@reddit
Github copilot is moving to usage based plan from June 1st.
krzyk@reddit
I have exactly the same feelings.
Our company has copilot, and it changes from premium request into token based pricing. It will be a massacre for those that get dependent on it.
Previously one could get a few hours of work out of 1 premium request (out of 300), tokens unlimited, any number of tool calls, any number of subagents (and those can get sub-subagents and those can get even more).
I used it sporadically, usually around 80 premium reqs out of 300 per month (and some of them for personal things, as I couldn't find any usage in my work).
I checked how much my previous month would cost in usage/token based pricing, and instead of $19 it would be ~$400; this month is a bit different but still instead of $19 it would be $90. And the company is setting people at the $19, with overusage only for a selected few (about 10% out of 4k).
Anthropic also converts all new companies (at least since December when we signed up) with >150 people to API usage.
It is a nice gimmick, but Juniors are cheaper and with investment they get better from LLMs quicker.
What makes LLMs really good are code reviews, they catch things that people don't notice (they give a different perspective from human reviews) - unfortunately those cost quite a bit - I wrote a code reviewer that used copilot, and in token based pricing it would use $8000 in April and ~$6000 in May (doing about 3500 pull request reviews). That is not worth it if we would pay in tokens.
armostallion2@reddit
This hits hard, and I like that it’s both written well but also slightly 💩y at the same time. Interesting times for sure.
public_void@reddit
This is a bad take. I am an engineering leader and if I was told to cut my budget 20% you bet your ass I’m not cutting ai tooling access, I’m cutting people. My best engineers were producing some multiple of my worst, and ai only multiplied the gap.
AftyOfTheUK@reddit
What? I cost my company around 550k/year. Even at bill-out rates, 5k is what I cost by 11am on Wednesday. The rest of the month is free
If my productivity is doubled, that's worth over 2k per business day
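A back-of-the-envelope check of those numbers, assuming 260 business days a year and 8-hour days (both assumptions, the comment does not state them).

```python
BUSINESS_DAYS_PER_YEAR = 260  # assumption: 52 weeks x 5 days, holidays ignored
HOURS_PER_DAY = 8             # assumption


def daily_cost(yearly_cost: float) -> float:
    """Cost of one engineer per business day."""
    return yearly_cost / BUSINESS_DAYS_PER_YEAR


def working_hours_to_match(spend: float, yearly_cost: float) -> float:
    """How many working hours of the engineer's cost equal a given spend."""
    hourly_cost = daily_cost(yearly_cost) / HOURS_PER_DAY
    return spend / hourly_cost


if __name__ == "__main__":
    print(round(daily_cost(550_000)))                        # cost per business day
    print(round(working_hours_to_match(5_000, 550_000), 1))  # hours to "cost" $5k
```

With these assumptions $550k/year comes to a bit over $2.1k per business day, and $5k corresponds to roughly 19 working hours, which lands partway through the third working day.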
Puggravy@reddit
No probably not. The trend is that more recently released products are actually improving on benchmarks while using fewer tokens; the reasoning model token bloat era seems like it's ending. Seems like even free plans are now net profitable from the statements made. Would guess inference usage trends continue to rise.
hell_razer18@reddit
our first approach is always credit based to see how many use it first. If it reaches 100%, then we will evaluate. The thing is a lot of us bring our own LLM. Company uses factory droid max with 2 accounts, we share this access with the entire eng team of around 30 to 40 people. Maybe actual use is less than 15 people. We also have claude team.
Some of us use cursor, opencode, codex. I personally subscribe minimax for my openclaw, opencode for my personal project and planned to subscribe gpt plus for gpt 5.5 a bit here and there.
As long as the budget does not take up more than 5-10% of my salary, I am fine with subscribing to the plan myself.
I will avoid any direct usage of api key toward money, maybe only openrouter but for testing new stuff and only certain model
kosmos1209@reddit
Remember when Lyft and Uber rides were $2 in 2015-2016, when they were trying to get new users onto the platform as much as possible? Remember how Airbnbs were 40-80 a night in the mid 2010s? AI is going through the same thing right now and we’re paying way lower than how much it actually costs. They’ll pump up the price when we’re all hopelessly addicted to it, just like ride shares and short term rentals did.
Whitchorence@reddit
I remember reading a lot of blogs about how Uber's business model was fundamentally flawed and they'd never turn a profit, which I found convincing at the time, but it clearly wasn't correct.
ESGPandepic@reddit
People in these threads make the interesting assumption that they understand the underlying technology, roadmap and economics better than the companies doing this and the investors behind it and can therefore better predict the future.
Whitchorence@reddit
I mean, you know. Businesses failing is something that happens all the time. It can happen. But I think there's a tendency for all of us to believe things are more likely simply because we'd like them to happen.
zeroconflicthere@reddit
The big difference is that open source models will improve. That couldn't happen with Uber and lyft
MercuryFoReal@reddit
This. It's all this. Burn that VC infusion to get lock-in, then hopefully become profitable. Blitzscaling is the term.
Gets very exciting and bloody when a bunch of giant startups all do it at the same time, since there won't be many winners.
Cool_As_Your_Dad@reddit
Exactly.
BobJutsu@reddit
Worse. My company expects us to use agents to produce code. But also won’t pay, it’s devs responsibility to pay *personally* for their usage.
account1233@reddit
At work, there has been an even more aggressive push by management to have all devs move to an agentic flow. When I brought up the topic of how much this will cost, I was told, "well, what's the cost of losing ground to because this is what they're going to be doing and blowing past us".
Management is fucking blind as a bat and I can't wait to see this blow up in their face. Since no one is thinking about budget im absolutely blowing through tokens haha
cport1@reddit
local models are getting very good
SmartCustard9944@reddit
We are still in the exploratory phase. Companies are allocating money towards AI in order to stay competitive. It does not mean it will be a token buffet forever.
Rguttersohn@reddit
This is why I have never vibe-coded any project. I work for a small org. Once everything is token-based pricing, there is no way my org could afford to pay for the token usage — and I’d rather it be spent on our salaries anyhow. I’m happy just using it to write tests or set up projects, monotonous stuff like that.
bwainfweeze@reddit
The first one is always free.
T0c2qDsd@reddit
I mean, I think the sweet spot for enterprise use in the US, unless things get a LOT better, is probably going to be ‘soft’ $1k/eng caps if it allows them to have 10-20% fewer developers with current prices.
So far, I haven’t seen folks who consume a lot more tokens produce equivalently more features with any level of quality.
tenthousandants44@reddit
You can't replace engineers with tokens, or else it would have already happened. If you're cutting only 10-20%, those are just the dead weights who know too much to fire. Would AI actually change that? You can give it your source code but not everything gets written down, not to mention that getting things done often just means talking to the right people. Is AI going to replace those relationships, too?
T0c2qDsd@reddit
Does what I said include me saying “I think we are in that spot” right now?
I think there’s a point where fast code generation means you can accomplish something that would have taken 5 people a month with 4 people in the same month. I think many companies in the US would pay $4k/month for that, but probably wouldn’t pay $20k/month.
New-Inspection7034@reddit
This is exactly why I've been making my own harness and using Gemma4 with MTP. I'm able to do 90% or more now that I've added LSP support.
aitchnyu@reddit
But the companies got a time window when people generated Rube Goldberg machines they cant understand nor update by hand.
LambdaLambo@reddit
All depends on how you use it. Getting the right structure and architecture is still incredibly important and can be done with good engineering. And the thing is, even in the before times I would have to reread code to understand it if it’s been more than a month or 2, so I don’t care that I’m less familiar with the code now.
As for productivity, it’s been an insane boost. Yesterday I spent $60 in tokens and 3h wall clock time, maybe 30m of focused time, implementing something that would’ve taken me 4 days by hand. I get paid ~$100/h so the ROI is clearly worth it.
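The arithmetic behind that claim, as a small sketch. The 8-hour workday used to turn "4 days" into hours is an assumption, and only the focused time is counted as human cost.

```python
def agent_vs_manual_cost(token_cost: float, focused_hours: float,
                         manual_hours: float, hourly_rate: float) -> tuple[float, float]:
    """Return (cost with agent, cost by hand) in the same currency."""
    with_agent = token_cost + focused_hours * hourly_rate
    by_hand = manual_hours * hourly_rate
    return with_agent, by_hand


if __name__ == "__main__":
    # $60 in tokens, 30 minutes of focused time, versus 4 days x 8 hours by hand at $100/h.
    with_agent, by_hand = agent_vs_manual_cost(60, 0.5, 4 * 8, 100)
    print(with_agent, by_hand)
```

Even if the full 3 hours of wall clock time are billed at the hourly rate instead of the 30 focused minutes, the agent side comes to $360 against $3,200 by hand under these assumptions.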
_predator_@reddit
No one is denying the benefits for those who already are "experts" and follow good engineering practices. The problem is that AI makes skipping all that much easier, thus raising the chance of ending up with Rube-Goldberg machines without coherent architecture.
LambdaLambo@reddit
Yes ofc. But this is /r/experienceddevs
nyanyabeans@reddit
Experienced does not mean a) good at engineering or b) good at using AI responsibly.
NorthernBrownHair@reddit
Experienced (3+ years)... Yeah
LambdaLambo@reddit
Is that my flair? Cant see. Probably made it a long time ago. I have 8 years now.
Sunstorm84@reddit
Nah it’s the minimum requirement for the sub
Klinky1984@reddit
Some of the worst systems in the last 20 years have been hand-crafted dog turds. Many of the frustrations I have dealing with AI applied to other human developers. Some people cannot code out of a paper bag or design a functional system.
Sad-Cardiologist3636@reddit
Having a very mature code base with very good test coverage gives a great launch point.
If you have been giving tasks to coding agents where you yourself didn’t have a clear understanding on how to do it, you are digging a hole. The people who gave coding agents a green field are going to have a hell of a bad time when usage prices accurately track operational cost.
A good term I like to use is “pouring concrete”. If you are having coding agents pour concrete, you’re going to have a really bad time eventually.
alchebyte@reddit
yep. it's a race to get something useful out of it before the bubble bursts.
call-the-wizards@reddit
Cost-based arguments never work because the nature of technology is that things get cheaper.
Here's a list of things some people said would never take off because of how expensive they were per usage:
Things start out being out of reach for most people, then get cheap enough to afford, then become commodities that are so cheap we don't even think about them.
Most of these things (except satellite internet maybe) went from "too expensive for most companies to use" to "every company uses them" in <5 years.
dagamer34@reddit
With the rise of “agentic workflows”, this is why the closest analogue for coding agents is healthcare: great if everyone had it, but it costs too much for what you might receive.
Which leads me to say the next thing, because it matters how much you use, these companies do not have zero marginal cost. They are not tech companies, they are GPU rental companies with some slick software on top. When was the last time we had a company slathering itself with “tech”, was about to IPO, and it didn’t quite work out?
WeWork.
Distinct_Bad_6276@reddit
Qwen 3.6 and Gemma 4 beat state of the art models from a year ago and I can run them on my laptop (albeit slowly, but for background tasks who cares). In a year we’ll have local models that beat GPT 5.5 and Opus 5.7.
dagamer34@reddit
Newer models being better than what was available a year ago is orthogonal to the class and economics of models frontier labs are serving themselves. It would only be beneficial if there was no need to keep developing new versions, and there was no fear that the competition would outclass them.
calvintiger@reddit
> When was the last time we had a company slathering itself with “tech”, was about to IPO, and it didn’t quite work out? WeWork.
Just to make sure I have your point clear, are you saying the most recent time this happened was in November 2023? If so, doesn't that imply that every other company since then has been succeeding so far?
dagamer34@reddit
It’s probably happened since the WeWork IPO in Sept ‘19, I am mostly referring to a quite famous example.
Lopsided_Distance_17@reddit
Manager here, before AI, enterprises had/have no issue spending 5k per seat. $5k in tokens is an accounting error. $50k…that would be another story. And yes per month
Dumb_Dick_Sandwich@reddit
Folks who were teens/adults from the 90s and 00s will recognize the playbook.
Cell phone providers moved from Minutes to Data around the release of the iPhone. The BlackBerry movie summed it up perfectly: “There’s only one minute in a minute”.
There are only 50,000 tokens in 50,000 tokens.
You see the lead up with a lot of tools like SpecKit that are almost *designed* to burn through as many tokens as possible
NatoBoram@reddit
Except Canadians. Cellphone plans have always been ultra expensive for crumbs here.
JaySocials671@reddit
USA also has a greedy telecom data problem
Whitchorence@reddit
"Minutes" would not make any sense as a billing metric for mobile data.
Mundane-Charge-1900@reddit
And yet today everyone basically has "unlimited" data and minutes, or at least limits that are high enough for cheap enough, that it doesn't even matter anymore.
If you looked at what was happening at the time, you might have assumed it was a ploy by telecoms to increase revenue by shifting usage from older, cheaper tech like landline voice to more expensive, higher margin wireless voice and data. After inflation, I bet telecommunications costs for consumers are lower than ever, especially when you consider how much more convenient a smartphone is over a landline phone.
Vinegarinmyeye@reddit
I literally just commented about this on r/shittysysadmin
Token pricing was always going to ramp up - I feel like the bean counters have finally figured it out.
I feel like a good number of the posts I see on tech professional subs are talking about reducing token use because... Money.
There was a chunk of money to be saved by giving it "Hey we'll ship absolute garbage code, get rid of the expensive engineers who know how things work, and if we have outages, leak data, whatever fuckery... We'll still be better off financially".
Maybe I'm overly optimistic, but I'm starting to notice the pendulum swing.
There's an obvious solution to AI overspend - (re) hire people who actually know what they're doing.
themooseexperience@reddit
I'm shocked I haven't seen anyone mention how the US government is pouring money into AI at a fervency that hasn't been seen since the creation of the internet, maybe even more than that.
My theory is that it will remain artificially propped up until it becomes distributed and efficient enough to be priced more effectively.
I'm not giving my opinion on whether that's a good thing or not, but that is my prediction for what will happen in the next 5-10 years.
Cristiano1@reddit
Yeah, I think usage-based pricing changes the psychology completely. Agents feel amazing when you stop thinking about tokens, but once every experiment feels like it’s burning money, people get way more selective fast.
muntaxitome@reddit
Plenty of devs in EU with over 100k salary, but you seem to be confusing productivity with salary. In Netherlands average productivity per worker is around 200k. That is an average. It includes low level jobs like grocery store checkout person. Wayyy over your math.
'50% productivity increase per engineer' is absurdly high though. The majority of cases might be seeing modest increases but 50% is insane. Keep in mind that productivity is measured in revenue per hour worked. So your 50% means 50% more revenue, or cutting the number of workers by about a third, or some combination of that that works out to the same math.
Whitchorence@reddit
I mean, sure, but even before AI came along, determining exactly how economically productive each feature shipped is is pretty difficult.
muntaxitome@reddit
On an individual feature or engineer basis it is hard to determine productivity. However, from an economic perspective productivity is revenue/hours worked. If the claim is for a company as a whole over a period that is actually a fairly trivial calculation.
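A small sketch of that company-level calculation, using the revenue per hour worked definition from the comments above. The revenue and hours figures in the example are invented.

```python
def productivity(revenue: float, hours_worked: float) -> float:
    """Economic productivity: revenue per hour worked."""
    return revenue / hours_worked


def hours_reduction_for_gain(gain: float) -> float:
    """Fraction of hours that can be cut at constant revenue for a given productivity gain.

    gain is a fraction, e.g. 0.5 for a 50% increase.
    """
    return 1 - 1 / (1 + gain)


if __name__ == "__main__":
    base = productivity(1_000_000, 10_000)      # hypothetical company: 100 per hour
    boosted = productivity(1_500_000, 10_000)   # 50% more revenue, same hours
    print(base, boosted)
    print(round(hours_reduction_for_gain(0.5), 3))  # same revenue, fewer hours
    print(hours_reduction_for_gain(1.0))            # what halving the hours would require
```

At constant revenue a 50% productivity gain corresponds to about a third fewer hours worked; halving the hours corresponds to a 100% gain.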
Aggressive_Amount_73@reddit
That's a very good point ppl often forget. All this nonsense about being more productive with AI is based on what? Because the only real metric that can be used is if it is translating into more revenue.
Doesn't matter if you produce way more code now. Is the product you're working on, getting more revenue because of you making more code ? In the end of the day is your company profiting more?
And this will be more and more important as these models start to be more expensive. Is the money you put in these models bringing returns in terms of profit? Because if at the end of the day you're spending 5k more on AI but your revenue is the same, you're losing 5k of profit. Doesn't matter how many lines of code you're doing.
Whitchorence@reddit
By this logic, why hire anyone either?
Distinct_Bad_6276@reddit
I definitely do see a 50% productivity gain, but not from writing code. One of the biggest value adds I find is that it drives down research time. What used to take an hour or two of reading documentation is now a thirty second query. What used to take several days of database queries to find bugs now can be done in an hour.
But all of this requires your data and knowledge bases to be accessible to AI, which I think most of the doomers in this sub are reluctant to do.
Type-21@reddit
No, less than 1%. They are extreme outliers. He's right, Europe and other parts of the world are going to be left behind even more because they can't afford as much ai usage.
muntaxitome@reddit
Pretty much all devs I know in netherlands make over 100k, where do you even live?
grilledcheesestand@reddit
The absolute vast majority of developer roles in the Netherlands pays less than 100k per year.
You are just hanging with the crowd at the top of the market.
muntaxitome@reddit
https://www.levels.fyi/nl-nl/t/software-engineer/levels/senior/locations/amsterdam-nld
I mostly hang with seniors, but I don't think that's just the top of the market.
grilledcheesestand@reddit
Yeah you just listed Booking, they're the prime example of golden handcuffs and top of market compensation in Amsterdam 😅
Type-21@reddit
At this point you're just trolling. Normal devs don't even get past the interviews at the companies listed on levels
RiddleGull@reddit
levels.fyi is the top of the market pool. There’s countless more software companies that are not listed there.
tan_nguyen@reddit
That is not how statistics work. Most of my circle making more than 100k doesn’t mean p50 is 100k :D
In Finland the p90 is like 85-90k last time I checked, and it comes from a union for tech workers (TEK).
Type-21@reddit
I live in Germany. We have official government statistics about real income here: Median software engineer (around 15 years of experience, around 48 years old) salary is 73k before taxes. Top 25% is 88.6k. Bottom 25% is 58.6k.
Basically no one of that group is on reddit. People here usually have like 5 years of experience or just finished university. So on average they don't even reach these numbers.
dmitriyLBL@reddit
I'm not buying this at all.
You can do plenty on a $20 Cursor subscription if you plan your architecture well and don't rely on the agent to go in circles to fix things that you should've defined in the first place.
With hard skills and rules, an agent is hella efficient.
spastical-mackerel@reddit
People are duplicating work, processing the same info, doing essentially the same inference on a vast scale. Little or no thought into what could be processed centrally, what really should be left to deterministic tools etc. there’s a lot of room for optimization
Whitchorence@reddit
Yeah, sure. But honestly there's zero reward for optimization at the moment so why would you?
CompoundInterests@reddit
Even optimizing which model to use. I see people use frontier models to change a css color.
kartoffeln44752@reddit
Me
yubario@reddit
I’m able to get ChatGPT to code quite well in its sandbox that it doesn’t even matter I’m losing GitHub Copilot at my job next month.
Only charges 10 credits per message, anything it does inside the chat isn’t billed. So it’s quite cheap compared to anything else
Whitchorence@reddit
It's usage based right now for enterprise?
waffleseggs@reddit
100%. I've already budgeted no more than $50/mo and I'm currently token-maxxing.
Distinct_Bad_6276@reddit
$50/hr is roughly what I expect the average US-based junior developer to make. Do you think a junior developer could make this app in one hour? No? Then you came out ahead, economically speaking.
AspieCurmudgeon@reddit
You are omitting the entire two days that the senior dev spent. The same app could have obviously been done in two days without AI
winebiddle@reddit
using scrubby.ai in front of coding agents cuts down on token usage dramatically.
The_Synthax@reddit
Yeah it’ll be called “the hydro bill”
Nottabird_Nottaplane@reddit
How are you not killing your $20 sub? I’m getting FUCK ALL value from my $20 sub on mobile. I had Claude set up a database / newsletter thing to track startup activity in NYC, and I kept running into usage limits instantly. And some weeks just the ingestion can hit the usage limits too.
pagerussell@reddit
Ed Zitron has published all of this, but basically every dollar of revenue is 8-13 in cost right now.
To say that is unsustainable is a tad of an understatement.
And that's not including capex spend to build new models. That's just the cost of an already trained model.
It's really neat the way an LLM can write an email for me, but it's far from a viable business model at the moment. As soon as the VC subsidy stops, the music will stop and it's gonna be wild.
This is one reason I have explored Ollama and self hosting. It's slower and not as good of models, but I am not going to pay hundreds and hundreds a month when this stuff finally gets priced correctly.
rrrx3@reddit
Ed Zitron is huffing his own farts and has been consistently wrong, week over week about what he posts.
HotDribblingDewDew@reddit
This kind of post makes me realize just how naive so many of you are lol. AI is only going to keep getting better and cheaper. In 10 years you're going to realize you posted this because you're in complete and utter denial right now. No other reason.
Venisol@reddit (OP)
Models have stalled for like 2 years. Costs are only going up.
There is no law in the universe that forces technology to get better at a linear or exponential pace infinitely. Is your fork better than 200 years ago? Is your dishwasher better than 50 years ago? Is github better than 5 years ago? No is the answer btw.
Technology stalls. Technology can get worse, especially modern tech.
Also think about what youre actually saying "things that are 30 times too expensive now, are not gonna be too expensive, if theyre 30 times cheaper".
Thanks for the analysis lil bro
HotDribblingDewDew@reddit
Making a lengthy, but ultimately a bad straw man argument in response to what I wrote only serves to reveal how in denial you really are. Think of a legitimate defense and get back to me big bro.
Zetus@reddit
Open models on-prem on your own devices will be the thing that when reality comes back to the room drive the conversation.
kbn_@reddit
In the US, the handwave rule of thumb for SWE yearly total cost to the company (not just comp but everything) is around $1M. That’s an overshoot for juniors but a significant undershoot at the higher ends of the seniority spectrum, so it’s close enough to work.
$5k/month at current rates is about 5 billion tokens, so a little over 1 billion tokens per week. That’s actually quite hard to do. If you’re single threading your agent and doing any amount of manual code review, you’ll be at most half that. *At most*. To get up to the billion mark you need to be doing a meaningful amount of multi-agent work and optimally some genuine swarming, and you need to be doing it consistently.
To put it on another scale: a non-trivial backend infrastructural service can be built for about 15 billion tokens all in. So it’s just really hard to have enough work to use that much consistently.
Tokens right now are definitely quite subsidized, so let’s assume that in the future this usage rate (which as I said, is pretty hard to sustain) is around $10k/month. That’s still just 12% of the handwave total SWE cost I gave. So that means that a 12% increase in productivity (or higher) justifies the expense.
Feels like a pretty safe bet.
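The break-even arithmetic above can be sketched in a few lines. This is only a back-of-the-envelope check using the figures from this comment ($10k/month in tokens, $1M/year fully loaded cost); the $250k figure in the second call is a made-up lower cost added for comparison, not a number from the thread.

```python
def break_even_gain(monthly_token_cost: float, annual_engineer_cost: float) -> float:
    """Fractional productivity gain at which yearly token spend pays for itself."""
    return (monthly_token_cost * 12) / annual_engineer_cost


# Figures from the comment: $10k/month of tokens vs. $1M/year per engineer.
print(f"{break_even_gain(10_000, 1_000_000):.0%}")  # 12%

# Hypothetical lower fully loaded cost, for comparison only.
print(f"{break_even_gain(10_000, 250_000):.0%}")  # 48%
```

The bar scales inversely with the assumed engineer cost, which is why the $1M figure matters so much to the conclusion.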
Mundane-Charge-1900@reddit
Other comments are calling out this $1 million per dev number, but it is a reasonable ballpark for tech companies, in the sense that leadership is deciding on budgets. They're considering what they get by hiring 50 more engineers or spending $50 million more on tokens.
demosthenesss@reddit
Software engineers have $1M/year cost on average?
What? Where are you getting this crazy number.
FriendOfEvergreens@reddit
You have to consider everything: the whole management layer that doesn’t directly produce adds to each producing employee’s cost. Healthcare, RSUs, 401k match. I think he meant the FAANG average, and there I could definitely see it.
demosthenesss@reddit
Oh, so basically averaging entire cost of the company by the number of software engineers, because (apparently) software engineers are the only value adding folks?
FriendOfEvergreens@reddit
I mean it depends on the company obviously, the distribution is different everywhere. Nowhere did I say SWE is the only profession creating value, you just pulled that out of thin air. My point was that the software management layer costs should be added to the costs for engineers. If you decrease engineer headcount you decrease management headcount.
kbn_@reddit
As another commenter noted, you need to consider the full overhead. Not just total compensation (health insurance, retirement match, etc etc) but also equipment, amortized facilities, amortized HR, amortized IT, accounting overhead (payroll is more complex than you think it is), etc etc etc.
And yes I was using FAANG numbers. Smaller companies have lower costs and overheads certainly, but the math also gets fuzzier (how do you count the cost of options vesting for a private company?). But I think if you look through my whole post, you’ll see that the margins here are such that the math could genuinely be off by a ton and it would still pencil out.
Misty-knight200@reddit
"Even at work, if you spend 5k per engineer per month no real company is going to do that."
You have no clue whatsoever. Public big tech companies are demanding developers stop writing code by hand already. They are throwing tens of thousands per developer a month in the bin.
BoostedHemi73@reddit
I’m trying to hurry up and build all the little things I’ve always wanted to build while it’s being subsidized. This won’t last.
TacoTacoBheno@reddit
Devin brought down the CI/CD pipeline yesterday.
I'm sure they could figure out a way to limit builds, but for a single "feature" it would open a PR every thirty seconds, which kicks off all the builds!
There were 4,000 builds queued up when it died.
Agent scans for security vulnerabilities, 80 percent false positives. The genius principal said we should have another agent that identifies the false positives and closes them.
Wtf are we even doing?
Mundane-Charge-1900@reddit
There's some real productivity improvements using coding agents out there, but Devin has basically been a scam since its announcement.
iworkinprogress@reddit
$50 to build an entire app is pretty good, tbh
Mundane-Charge-1900@reddit
I think this is what some people are missing. It's not about someone who can build said app manually if they spent the time on it. It's about someone who can't write that code themself manually today. They're not going to be able to hire someone to do it manually for $50 either. In some cases, they can prompt their way to success for the $50. This is software that wouldn't have otherwise been written.
Does that work at scale in giant mega corporations where the real revenue is? That's still an open question.
Previous_Feeling_484@reddit
This or we run out of power first.
Think_Inspector_4031@reddit
My hope is that a new hardware company is going to pop up and create something that's a mac studio m5 ultra, but with double the RAM GPU, dedicated AI/LLM hypervisor with same energy use/heat generation/lack of noise.
Say each one costs 20k, but can be put into an office setting, serving 20 to 50 engineers. With downtime used for others in the building, 'cause we all have Teams meetings.
Single 100k investment to 10x productivity to an engineer wing. Having AI parse PDFs extract text, and throw it into a spreadsheet for you. Create specific tools to automate your workflow to remove engineering hours for the BS clean up work.
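A rough sketch of what those numbers would mean per engineer. It assumes five boxes (100k / 20k each) and a 36-month hardware lifetime, which is my assumption and not something stated above. It ignores power, maintenance, and whether such a machine exists at that price.

```python
def cost_per_engineer_month(total_hardware_usd: float, boxes: int,
                            engineers_per_box: int, lifetime_months: int) -> float:
    """Hardware cost spread evenly over every engineer and every month of use."""
    seats = boxes * engineers_per_box
    return total_hardware_usd / seats / lifetime_months


# 100k buys five 20k boxes; each serves 20 to 50 engineers (figures from the comment).
for engineers_per_box in (20, 50):
    monthly = cost_per_engineer_month(100_000, 5, engineers_per_box, 36)
    print(f"{engineers_per_box} engineers per box: ${monthly:.2f} per engineer per month")
```

Under these assumptions the hardware lands at roughly $11 to $28 per engineer per month, before running costs.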
DualActiveBridgeLLC@reddit
We had an 'AI Black Friday' yesterday. The goal was: everyone set up your system, start using GitHub Copilot, you have 6 hours to make something, then we do a show-and-tell. On one hand I was really impressed with some things we were able to do, but on the other I looked at our token usage at the end of the day and was really shocked at how much we had used. We had gone through about 20% of our corporate plan of tokens in one day. The best results were some tools to help with pre-sales questions (standalone and throwaway if necessary), but from a coding standpoint it had only really helped with creating boilerplate and unit tests (even then they were trivial cases, so we wouldn't commit them).
But the biggest thing I realized is there is no way this is going to scale when they increase token costs to make a profit. If this is the golden era of token costs (aka they are giving it away way below cost) this will never be a wise investment.
My only thought is that maybe if you start with AI from the beginning of a project it can scale better, rather than working with existing code and tech debt, but another dude who started at the company last month said 'no', it just can't handle large projects. But what do I know.
Dirty_Rapscallion@reddit
I think as time goes on, tokenization will get cheaper and faster. We'll also get new hardware meant to run these models plug-and-play style. Once that happens, usage won't matter. You can run an LLM all day and night to make whatever you want.
Uesh@reddit
I hope that happens. But those companies that invested so much into AI will want their cut. One way or the other. So I wonder how it will play out. Also, to get hardware to run a good LLM, won't it take many many years? Unless its a company owned server for their workers.
MrIcedCafeMocha@reddit
I’m not sure if it’s just due to inefficient use but I’ve completed months of agentic use where my total usage would cost around $150. At the highest, $300. Typically my usage is around $80-150 per month and that’s me completing 2 sprints. Not sure how everyone is racking up these $1,000+ token usages.
SinbadBusoni@reddit
Well, it wasn’t fun while it lasted.
donniedarko5555@reddit
I mean we're already transitioning to using local models where we can at my company.
You could easily have a Qwen model handle your cursor bug bot usage, even if you fully account for the cost of an on prem infrastructure behind it compared to the usage costs.
Especially with a rapidly growing company and massive mono-repos
LittleLordFuckleroy1@reddit
Completely agree and I think the writing has been on the wall for a long time now. The entire game for the AI companies is to get people hooked on their drug before they run out of money and actually have to turn a profit.
AI is super useful, and it’s not going away. But it is absolutely not economically viable to be used in the way that it is today.
tr14l@reddit
Just wait... They're going to have to do 10-100x pricing. A million tokens will probably be measured in dollars, not cents.
They're operating in deep red on compute right now for the market race. That isn't going to last much longer.
And at some point, free accounts are probably going to get either eliminated or neutered to the low tier models only with a token cap.
circalight@reddit
Definitely going to create a moat where only the richest companies can use it.
gjionergqwebrlkbjg@reddit
My company has been on usage-based billing for a while, the only thing they do is limit access to the most expensive models. You can get very far with Sonnet.
-Dargs@reddit
I modified my terminal for Claude to show my usage cost for tokens. I'm spending like $1500/d, not $5000/mo, lol.
tommyk1210@reddit
How even?
Even using opus 4.7 you’re spending about $25 per million output tokens. $1500 a day is 60 million tokens per day…
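The arithmetic, as a quick sketch. Like the comment, it treats the whole spend as output tokens at $25 per million; input and cached tokens are billed at different rates, so a real token count for the same spend would come out higher.

```python
def tokens_for_budget(budget_usd: float, usd_per_million_tokens: float) -> float:
    """Number of tokens a budget buys at a flat per-million-token price."""
    return budget_usd / usd_per_million_tokens * 1_000_000


# $1500/day at $25 per million output tokens (figures from the comment).
print(f"{tokens_for_budget(1500, 25):,.0f}")  # 60,000,000
```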
-Dargs@reddit
Well, I'm in a staff role in a small org so there's always too much to do.
The cost of multi-multi-tasking and iterative design and development on greenfield projects. I spent a few weeks nailing down a Claude skill for implementing tickets start to finish so I've got my secondary monitor up with \~3 terminals, sometimes more, usually not less, all going at it doing simpler feature work. At the same time, I'm back-and-forth with Claude on my primary monitor building and iterating on tech specs for newer projects and exploring different approaches as there's never just one right solution for anything. After I'm satisfied with the tech spec I'll peer-review with my engineering manager, then with another team member who I pass the work off to before I begin on the next tech spec... I'll admit that there's usually only one project per 2 weeks that I'm planning. The rest of my time and tokens go towards assisting my and other team's bug/product support tickets (there's always some subtle thing that doesn't quite work right on the front-end). I'll receive a ticket, clone it, link it, reformat it in such a way that Claude/my skill is able to consume the context without much useless bloat and then send it off in another terminal.
The skill flow is quite nice. It almost never generates the wrong plan or slop code. It's backed by a series of code style and design and instructions markdowns that prevent that sort of thing. But consistently generating high quality code on the first pass comes at a token premium. I can go into more detail on the skill if someone is actually interested. But I know the subreddit I'm on, so I expect hostility, lol.
niowniough@reddit
I've tried to do what you've got going, but I'm running into the agents needing feedback constantly, usually asking for permission, and find it hard to frequently switch between context for 5 sessions of disparate work, or they get done with the task quickly (including doing peer review with another agent) and tasks pile up in the human review column. The mental drain of context switching quickly between the 4 ticket sessions and trying to come back on the main human-agent design iteration session is so great that I've trimmed down to 1-2 ticket sessions and neglecting 1-2 ticket sessions deliberately when trying to focus on the main design session. What have you found useful in dealing with such pain points?
-Dargs@reddit
Instead of giving blanket permissions to use curl or access Jira, PRs, etc., I put together some simple Python scripts for doing the specific actions I want Claude to be able to do. Then I and my team that uses this skill are able to package up permissions to use the Python scripts and not worry about other access. If you wanted to use curl to modify a ticket or PR, there are infinitely many ways that could be chained and therefore a crazy amount of permissions to consider. But if you have a script that can only do specific things, with commands Claude knows to use, it's fine to permit any access to execute `jira.py`, etc.
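A minimal sketch of what a narrow wrapper like that could look like. This is not the actual `jira.py` from this comment: the subcommands, the allowed statuses, and the `run` helper are all hypothetical, and the tracker calls are stubbed out with strings rather than real API requests.

```python
import argparse

# Hypothetical allow-list: the only status changes the agent may make.
ALLOWED_TRANSITIONS = {"in-progress", "in-review"}


def build_parser() -> argparse.ArgumentParser:
    """CLI that exposes two ticket actions and nothing else."""
    parser = argparse.ArgumentParser(prog="jira.py")
    sub = parser.add_subparsers(dest="action", required=True)

    comment = sub.add_parser("comment", help="add a comment to a ticket")
    comment.add_argument("ticket")
    comment.add_argument("text")

    move = sub.add_parser("transition", help="move a ticket to an allowed status")
    move.add_argument("ticket")
    move.add_argument("status", choices=sorted(ALLOWED_TRANSITIONS))
    return parser


def run(argv: list[str]) -> str:
    """Parse the arguments and describe the action (stub, no network calls)."""
    args = build_parser().parse_args(argv)
    if args.action == "comment":
        # Placeholder: a real script would call the tracker's API here.
        return f"would comment on {args.ticket}: {args.text}"
    return f"would move {args.ticket} to {args.status}"


print(run(["comment", "ABC-1", "tests pass"]))
```

Anything outside the two subcommands, or a status not on the allow-list, is rejected by `argparse` before any action runs, so permitting the script as a whole stays a small, reviewable surface. A real script would pass `sys.argv[1:]` to `run`.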
As for requiring feedback, I've split tasks into specific-purpose sub-tasks/agents that will not constantly prompt me for feedback. They're designed to consider "the most accurate to description" interpretation, "best they could identify", and optionally "alternative that may be more correct" depending on findings. So I only need to review what was found during the initial exploration, initial planning, later implementation, and post-implementation review. It's 4 points of feedback for me, but it's only at critical steps that guide the direction of the work that is done.
And I explicitly ask for "selection based questions" -- I forget the name of the built-in skill but it's the one that drives every response to contain multiple choice responses or other text. That saves a huge amount of time and makes it much more readable. It is pretty frictionless.
Mast3rCylinder@reddit
My company pays for my usage, around $1k-1.2k a month. I asked them multiple times if it's too much and apparently not. I know my manager uses even more than me.
I'm not even using max models, I just have too much work.
I believe in the future they would have to reduce it
stikves@reddit
It will go local.
We used to have expensive workstations. I remember having specialized Lenovo machines for coding costing about $10k.
Now local AI is close to doing proper agentic coding. Not talking about DeepSeek, it requires TBs of VRAM, but more of its distillations on Qwen and similar.
Get 128GB VRAM, and then you open up the path to local agentic AI.
Many companies can actually afford this. (Though their power bills will also go sky high)
philip_laureano@reddit
I think it's going to get better, because this will make the more token-efficient models, the ones that run in far less memory and use fewer tokens, the ones that people choose and pay for.
The fact that models such as Opus 4.x and ChatGPT 5.x cost so much compute is a self-own on their part. While they are both good models, their poor cost efficiency means they'll be phased out sooner rather than later.
And this is where the Chinese models start to beat everyone else--when those per-usage charges begin, they'll be ready with LLMs that run efficiently, even though they're not the smartest of the bunch.
NaturalRoad2080@reddit
Something people forget: many times the people who have real control of a project are very few. If you can boost their productivity by 40%, that's not something you can just match by hiring more devs.
By control I mean real control: the ones who can really see where the issues are when they happen, without wasting a month.
ryaaan89@reddit
The aim here is to get everyone addicted and then also hold models like Mythos over your head that you _have to_ pay them for to fix security.
haragoshi@reddit
It’s not going to be usage based. Subscriptions are great predictable streams of income. Just like gym memberships, Many paying customers don’t fully use their capacity
WaterIll4397@reddit
No because deepseek and others get cheaper too.
Disastrous_Poem_3781@reddit
u/Venisol
We are encouraged to use agents at work but I find they take too long and cost too much. That's even with a detailed prompt about the work to do.
Since I started using LLMs I still haven't found a better experience than using the studio/playground interface. You can really control the model with the temperature and system prompt.
I have developed so many features both at work and personally and my monthly bills have been between 30 and 40 euros.
Even with all the copying and pasting between the browser and IDE at least you're still in control and review the bits of code given to you by prompting the AI.
bob301301@reddit
i run a daily workflow that costs 1m tokens to debug our pipelines
ricetoseeyu@reddit
You’re finding it useful and would like to continue using it if the costs were lower. Token prices are going to eventually come down with lower energy costs (SMRs, renewables), newer more efficient hardware, and more efficient models. Think of the super expensive PCs in the 80s, and then came along HP and Dell, etc.
theycallmeJTMoney@reddit
Why didn’t you just use a subscription plan? Good faith question. I’ve been building small projects at home since October and I’ve never needed more than a max plan. I know it’s $100 vs $50 you spent but you probably could have gotten away with the lower tier plan.
What am I missing?
BTW 100% agree that when token costs go up a lot of people are fucked.
CharlesV_@reddit
My monthly budget is $125 for Claude. Gemini doesn’t have a limit for us currently but I think that’s going to change. I’ve been mostly writing out a prompt or a spec, asking Claude if it looks good or needs tweaks, and then letting Gemini implement it. I refine it with Gemini as far as I can, adding test cases and a summary doc. Then I have Claude review. It’s a lot more efficient than always having Claude or Gemini do everything.
Idea-Aggressive@reddit
You’re claiming to be faster than the computation capability and speed of a GPU cluster…
Hmmm ok
KayLovesPurple@reddit
But keep in mind that less than half of what a dev does every day is coding.
Distinct_Bad_6276@reddit
I said this in another comment, one of the biggest value adds is specifically in the non-coding tasks. What used to take hours of reading documentation is now an instant query, what used to take a week of querying the database to debug something can be done in an hour. I swear, these people who equate dev work with coding are either juniors or non-devs, neither of whom are allowed to comment on this sub.
Idea-Aggressive@reddit
The OP is talking about implementation, writing, file system, copy and pasting.
Some people are completely delusional
spez_eats_nazi_ass@reddit
Local models will see a massive spike in interest. They are good enough for reasonable appropriate use. You just need a $3k mbp or fully loaded mac mini.
bonisaur@reddit
It’s not just coding - once ads and sponsored content hit AI it might reduce people’s usage overall to a much more realistic level, and if it doesn’t, it redefines marketing roles in the sense that they need to do the equivalent of SEO for AI prompts.
hangfromthisone@reddit
Internet used to be very expensive. Having servers used to be very expensive. Early computers used to be very expensive.
Got the pattern?
wise0wl@reddit
My usage-based bill from Anthropic, for my day job, is $970 or so this month. I was able to build out the automation for a self-hosted monitoring stack, write custom plugins, do all the testing, and push it out to dev and prod within three weeks. As an experienced dev and platform engineer, this would previously have taken me and another senior-level engineer six months to do.
By self hosting we are saving over $200k a year. The $970 spent on tokens was worth it.
Nataliashayk@reddit
I think you’re right about personal agent usage dropping once people feel the meter running; most folks don’t want a $50 weekend experiment. Usage-based pricing changes behavior fast.
But I don’t think agents “vanish,” they narrow. They stop being toys and become tools for exactly what you described: wide, boring, cross-file work where humans are slow and error-prone.
Where I disagree a bit is the work economics. Companies don’t ask “is this worth $5k in tokens?” — they ask “did this avoid a hire, missed deadline, or outage?” For many teams the bar is lower than a clean 50% productivity gain.
Your last example is actually the best argument for agents: delegating mechanical cleanup once intent is clear. That’s the sweet spot. Small fixes → human. Large, repetitive refactors → agent.
So yeah, hype will die. But the boring, high-leverage use cases survive and those are the ones that actually matter.
Plastic_Monitor_5786@reddit
As evidenced above they'll still be useful for writing posts.
darkstar3333@reddit
I told finance as of June our costs are going up by 15x.
They want to go into full top-to-bottom agentic mode, operational experience be damned.
I am advocating for it but I understand the cost risk of YOLO'ing your business into bankruptcy.
the_real_seldom_seen@reddit
You lack vision. Shit it’s gonna be cheaper.
Hey guess what, computers used to be the size of a room, only companies can afford them. Now our smart phones have more compute
joe0418@reddit
Software engineering will unfortunately evolve into maximizing value on tokens.
Sub agents with cheaper models orchestrated by a more expensive model, stuff like that.
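A toy sketch of that kind of routing, to show where the savings would come from. The model names, prices, task kinds, and token counts are all invented for illustration; nothing here reflects a real provider's pricing or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    usd_per_million_tokens: float  # made-up blended price


PLANNER = Model("expensive-planner", 25.0)
WORKER = Model("cheap-worker", 1.0)


def pick_model(task_kind: str) -> Model:
    """Send planning and review to the expensive model, everything else to the cheap one."""
    return PLANNER if task_kind in {"plan", "review"} else WORKER


def job_cost(steps: list[tuple[str, int]]) -> float:
    """Total cost in USD for a list of (task_kind, tokens) steps."""
    return sum(
        pick_model(kind).usd_per_million_tokens * tokens / 1_000_000
        for kind, tokens in steps
    )


# Hypothetical job: small plan, large implementation, small review.
steps = [("plan", 200_000), ("implement", 3_000_000), ("review", 300_000)]
routed = job_cost(steps)
all_expensive = PLANNER.usd_per_million_tokens * sum(t for _, t in steps) / 1_000_000
print(routed, all_expensive)  # 15.5 87.5
```

With these invented numbers the bulk of the tokens run on the cheap model, so the routed job costs a fraction of running everything on the expensive one. Whether the cheap model's output is good enough is the part the sketch does not answer.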
FriendOfEvergreens@reddit
Strongly disagree just based on my experience running cheaper models locally and on prem. They are nowhere near as good as opus but they are still miles ahead of 2024 which could definitely do simple notes apps. The harness tech does not require heavy compute, just the codegen. On my Mac I can easily run 7-11B models while still using the machine for dev work. With more advancements and better harnesses I’m pretty confident locally hosting coding LLMs is going to be doable.
But I agree with the premise of the cloud stuff, at least frontier models, will have a rug pull
UnbeliebteMeinung@reddit
No. Cursor showed a few days ago how cheap usage based inference could be made. Also cursor customers do on demand api cost all the time and its the biggest ai ide on the market...
tsingy@reddit
If AI progress and cost stopped changing completely right now, many people would still pay for agents.
calvintiger@reddit
As a counterpoint, Gemini 3.5 Flash from just this week has intelligence almost up to the level of Opus 4.7, but more than 100x cheaper per token. You don’t think that trend is going to continue?
https://www.mindstudio.ai/blog/gemini-3-5-flash-vs-claude-opus-4-7-agentic-workflows/
joefourier@reddit
If you're in the EU and don't have the endless budget of American software companies, why would you pay per token for Claude instead of Deepseek, which is 10x cheaper for input and 28x cheaper for output?
cbusmatty@reddit
No, this is just like anything else - New tool, new skillset. They will hire those who are good at it, and get rid of those who are not. being effective with the tools will become more important. Universal company frameworks and artifacts will become more important. Centralized agents and process will become important.
The result of this isn't no one uses ai, the result is the good people will use ai and others will be replaced or let go
johanneswelsch@reddit
The pressure to use Ai comes from them wanting to farm your data which will be used to train your Ai to replace you.
LLMs still hallucinate like crazy with obscure languages or packages or code that is not used often, for which there is not much data. The reason LLMs got so good in late 2025 is that they have collected enough data from us over a couple of years. And we gave the data away for free.
bigorangemachine@reddit
You still need like a $10K machine to run a decent local model.
eloel-@reddit
How expensive do you think it'll get when the real costs show up? I understand it's all operating at a loss today, but will it be what, 10x? That's still peanuts compared to dev costs.
tenthousandants44@reddit
10x at least. GPUs depreciate
KayLovesPurple@reddit
Peanuts if you're a company, but if you're a random guy working on personal projects it might not be so.
IceMichaelStorm@reddit
Is it? Some companies already spend more on AI than on devs. While the exception, I think 10x would definitively let costs explode
Which-World-6533@reddit
Is that supposed to ground breaking...?
This just comes across as a coder who lacks experience and doesn't follow best practice.
Like I've said repeatedly. The people who find LLMs the most useful are those who are the most inexperienced.
It's why CEO's go ga-ga over it.
According_Basis7037@reddit
I think this is wishful thinking. I personally never let the ai write code, but it DOES make it easier to find information, it’s basically useful as a partial replacement for google which saves a lot of time. I lean on it A LOT when using unfamiliar libraries or protocols etc
Which-World-6533@reddit
It's useful to some extent. However any information found needs to be verified. Also anything fast moving or very recent is usually inaccurate.
tenthousandants44@reddit
Yes, it's a PITA when it's wrong, but you just ask for snippets and they're no harder to verify than anything you find on SO.
markvii_dev@reddit
they were trash for serious work anyway
Pitiful-Water-814@reddit
I think software engineering basically will be pay to play. Software companies who can afford to pay for best LLMs will dominate the market, engineers and companies without enough budget will not survive competition.
codescapes@reddit
That's a big concern but frequently big budgets just mean people get used to being obscenely wasteful. I work for a large bank on cloud infrastructure / "FinOps" and my God, the amount of completely unnecessary, unjustifiable spend is immense. Projects just bloat to fit the budget and there are all sorts of perverse incentives where managers actually want waste because then they can reallocate it later at their discretion...
That and people get used to using money to buy their way out of bad engineering e.g. throw more reader nodes at the DB instead of resolve the obvious full table scan lol. Then there's the whole tokenmaxxing conversation - all this waste is avoidable by smaller, more nimble organisations with proper leadership and so I think there's a genuine advantage there.
The waste of the big multinationals can represent an opportunity for the smaller guys. If BigCorp is spending 10x more than they have to then eventually they'll just buy a cheaper start-up to fix their problem. The goal isn't to defeat Google or whatever, just get consumed by them and make loads of money in the process.
YesIAmRightWing@reddit
Am currently working for a company that's all in on AI but really they are all in on vibe coding
So I just vibe code away and if it goes wrong it all falls under the banner of moving fast and breaking shit
Their money they are wasting I suppose
Lonely-Leg7969@reddit
just use opencode and bypass the api keys
ivancea@reddit
It's gonna drop, for sure. But... 5k? Working with multiple agents in parallel, we rarely reach 2-3k really. It depends on the job, for sure.
And EU companies... EU is big, you know. Some companies can and some others can't. Anyway, we're all just reducing costs by increasing our knowledge and tooling (e.g. more MCPs/tools == fewer tokens overall; same for skills and subagents). So it's not "just" going up.
dbxp@reddit
All our AI usage where I work is based on a cost of around $2 per ACU, it's still very much worth it and I'm in the UK.
I think its worth considering the communication and knowledge sharing overheads you have with larger teams which don't exist for AI. On the other hand I am seeing some difficulties in my team due to having details essentially decided by one person with no real review, there's a benefit in terms of speed being able to avoid team meetings over decisions but there is also a risk there.
pawzem94@reddit
I believe that once token costs get significantly higher we’ll see an increase in small model usage. But I'm not sure all of it will be local. Most agents are not that sophisticated anyway, and using higher-end models is a waste even now. Not to mention giving away the domain data to AI companies for training.
Intelligent-Youth-63@reddit
I have no idea what you’re even trying to say here.
UHM-7@reddit
He's saying that once providers stop subsidising usage, people will use LLMs less.
szank@reddit
Give it 10 years. Companies will end up spending say $500 (in 2036 dollars) per dev to rent out hardware capable of running Claude 4.6 or somesuch locally. Then it will make sense.
Home users will pay $30 a month for a capable PA/assistant /appointment maker service that will be pretty capable.
KayLovesPurple@reddit
Pretty sure a top of the line Mac can support a local Sonnet 4.6 now, via Ollama.
TehBuddha@reddit
People will absolutely not pay £30 a month for a slightly better Alexa
blob8543@reddit
Especially with all the anti AI hostility we'll see if people in all industries start losing their jobs.
GistofGit@reddit
At least they didn’t use AI. Honestly refreshing to see a human ramble
strongfitveinousdick@reddit
I wonder if it's a new writing style - ramble like a reddit user from early 2000s.
ReturnPure8518@reddit
ask claude to translate it for you
den_eimai_apo_edo@reddit
Try using AI. It's pretty clear what they're saying. You just didn't bother reading it.
niveknyc@reddit
AI brain has taken over, nobody has cognitive thought anymore.
micseydel@reddit
The cognition thing is a polycrisis https://www.nejm.org/doi/full/10.1056/NEJMe2400189
AggressiveAd5248@reddit
Subscriptions are not reflecting the true cost of using the product, usage based billing is closer to the actual cost of AI, whenever investor money runs out and subscriptions increase in price, AI is dead.
It’s good for $20 but if the price for the same tier increases to $100 would you still pay it? That’s what they mean.
Immediate_Rhubarb430@reddit
It's a pretty simple message. AI use is subsidized. Much like Netflix, they will jack up prices. At which point, the value proposal will not be attractive for customers and AI adoption will suffer as a result
Keeyzar@reddit
Then you may not be the brightest human on earth.
It was easily understandable; agents = too expensive. When metered API takes over, just few people will actually pay for it.
rbnd@reddit
He's saying that the valuation of AI companies make no sense because people will not pay them a lot of money if the alternative is free local LLMS. I agree
ButWhatIfPotato@reddit
You underestimate how much money stakeholders are willing to throw to the endless void before they would admit they bet on the wrong horse. We are so far away from the average "go balls in on AI" CEO figuring out what kind of platinum medal level mental gymnastics they would use to convince themselves this clusterfuck was not their fault. Hell we have not even seen the first (and I bet you there will be at least a second) bailout from all the AI peddlers.
raverbashing@reddit
Yeah this is absolutely crazy
Use AI? Sure
Ask Copilot Agents to solve a problem here and there? Sign me up?
Have a "~~Jesus~~ AI take the wheel" approach where you "vibe code" when you're not even sure what exactly do you want to achieve? Oh boy
How many times did you ask models to fix something and it just kept running in circles (and burning $$)?
How many times did you have a spec, give the AI the spec, and then the code doesn't do what you wanted? (And I mean this was not a lazy spec - this was the code that did this previously, but think different language and paradigm.)
Sometimes you have to actually sit and understand what's going on
tacopower69@reddit
... why? of all the things you could build why something that neither you nor anyone else will ever use?
Venisol@reddit (OP)
Ive been using it for 2 weeks. Replaced my obsidian and now my notes sync across devices.
With "personal" i meant an app literally just for me. A single user, no auth. I got all my vim shortcuts hard coded in there etc. I have no intention of ever letting anyone else use it.
chosenoneisme@reddit
For testing and learning how to use these agents. Isn't that obvious? For learning a new tech we don't directly use it on complex tech stack or anything. We try it out on simple things
Resident-Trouble-574@reddit
It all depends on whether the hardware will scale faster than the resources usage of the agents.
chosenoneisme@reddit
Google is doing some major upgrades. So we can expect more usage in current plan but it's still expensive
CanIhazCooKIenOw@reddit
These hot takes are great.
Keep them coming
coredalae@reddit
Maybe, but just as likely is that run cost will drastically fall. So it'll end up being cheap by 2032
InterestedBalboa@reddit
I think for one person and small teams that will be true but for companies it’s just the cost of doing business, especially if human capital can be offset.
Local LLM’s will be the only way to be competitive for those trying to do their own business. All those people fishing around Reddit looking for ideas they can vibecode better get busy