Token Based Billing Changes June 1

[-]

joshocar@reddit

We are entering the phase in AI adoption where we find out if the real cost of the models is worth the value gained in productivity. Previously we have all been paying a subsidized price, but as openAI and Anthropic move to go public they will need to start showing real profits. I think leaders will take one of two paths,

They bet on the productivity gain and do layoffs. We will be expected to get more done with fewer people by using LLMs.
They limit tokens and expect people to get more efficient with their usage. We will need to figure out how to get the same output, but using fewer tokens.

My bet is that most will want to do #1, the not so smart ones will try #1, the smart ones will mix #1 and #2, no one will only do #2.

There is a 3rd option, but no one will do it. In the third option, you buy everyone workstations that can run open source models and have people spin up and maintain their own instances. The only way this happens is if 1 and 2 don't work and someone takes the risk and tries it.

[-]

TylerDurdenFan@reddit

> The only way this happens is if

...is if hardware prices and availability became reasonable again, which it won't. I guess Scam Altman does have C-level foresight after all

[-]

rotzak@reddit

Not to mention model quality improves

[-]

kayakyakr@reddit

Mac pro or the AI Max 395+ system in a box systems can run minimax or kimi for $2500. They're sufficient at coding, especially if they have a bigger model telling them what to do.

That'll be the path a lot of the smarter businesses that want to stay AI end up going. I'm curious if the market will accept a non subsidized price. We'll see.

[-]

Smallpaul@reddit

The market will absolutely accept a non-subsidized price. I would bet substantial money that we will still have a GPU shortage going into 2028.

And it’s important to remember that the cost is in part a function of the shortage. Pricing is dynamic and so is usage. There is no consistent “non-subsidized price.” If demand falls then the price can fall too. Within limits of course.

[-]

Kirk_Kerman@reddit

The floor of the price is the cost of the GPUs. The GPUs cost 70k a piece and die on average after 3 years. And Nvidia isn't going to stop introducing new 70k GPUs every year. Electricity could be free and the unsubsidized price is still 8-10x higher than what it is now.

[-]

Possible-Pirate9097@reddit

Sorry what? How lobotomized would your model be to run Kimi on a single 395? 😂 Or even a cluster 🤣

[-]

kayakyakr@reddit

Sorry, got kimi confused for a much smaller model. Minimax seems like the best model you can run on 128gb.

[-]

shaonline@reddit

A single strix halo machine is tight for minimax (I own one), we're talking aggressive quantization (3 bits-ish, which hampers quality), kv-cache quantization as well, and SINGLE user/session, at slow speeds (on the prompt processing side especially).

Running big models will still happen on the cloud for most people, the main case for local hosting is privacy concerns, not costs (not even close, unless you're a huge company spanning across timezones).

[-]

kayakyakr@reddit

Good to know about capabilities in action.

I use the small models a lot for code assist. They do well with very tight instructions and a lot of human oversight. I don't know how much time they actually save 😅

[-]

shaonline@reddit

Yeah I'm having some fun with Qwen 3.6 27B and as far as being "agentic" goes it's great, not so much when it comes to code taste though. We'll get closer eventually I think especially for stuff on the scale of minimax (the around 300B parameters mark) at least on being able to execute something right, "having good taste" or discussing architecture stuff I think will still only be doable on big trillion-ish params models, which are on the verge of being "too expensive" for most people and uses.

[-]

Possible-Pirate9097@reddit

... with how much context? You'd need two Strix Halos (or two Sparks or a single 256GB Mac Studio) to run it with enough context for actual real world use IMO.

[-]

The_Synthax@reddit

Definitely seeing some businesses moving in that direction. Big model in the sky handles coordination, memory, and prompt generation, and the expensive high-churn busy work goes to an on-premises model where the only cost is electricity once the hardware is purchased.

[-]

U_L_Uus@reddit

In my town we call this "the point where the drug dealer notices you are hooked and resumes with his market prices". Same old song, really

[-]

SnugglyCoderGuy@reddit

This is only the beginning. I am expecting the final cost to be more like 150x what it is now.

[-]

Ecksters@reddit

Near the end of this year we're going to start seeing hardware designed for inference (co-located RAM), that'll bring down inference costs by 1-2 orders of magnitude.

Without that I suspect you'd be right, but thanks to that incoming hardware, I suspect that if anything AI usage is going to explode as prices stay near the current subsidized rates, or even go down.

[-]

chickadee-guy@reddit (OP)

Thats unironically what they would need to do to turn a profit

[-]

SnugglyCoderGuy@reddit

I know, that's why I'm expecting it

[-]

writesCommentsHigh@reddit

Ignoring the fact that tech will evolve and they will get their data centres out. The evolution of the tech will continually bring prices down while simultaneously improving the tech. If that does not happen then it does not mirror what has been happening with tech all these years.

People are already starting to run decently capable local models on 16-32GB. They don't compare to frontier but thats today.

Doom was a miracle when it came out. Now you can play it on a microwave

[-]

thephotoman@reddit

In the long run, open source wins.

It happened in the Unix Wars. Today, the clear winners of the Unix Wars were Linus Torvalds and the GNU project, with Steve Jobs and NeXT taking second and 386BSD taking third. Illumos and AIX don't make the podium, but they're at least still around.

It will happen in the AI wars, too. We don't need the data centers and remote models. The RAM crisis is largely an effort to prevent OpenAI from becoming economically irrelevant due to the open source local models, and it isn't working.

[-]

Regalme@reddit

Local models are going to scrub these people no matter what. And they’ll deserve it for farming the entirety of humanities accomplishments and touting them as their own

[-]

danielrheath@reddit

The loans they are taking out to build those DCs aren’t going to get a discount when the tech improves; that aspect of the cost base is locked in for decades.

[-]

Kirk_Kerman@reddit

Why would data centers make these fucking things cheaper? The GPUs cost five figures each and have a 3 year average operational life. The depreciation is going to be a huge line item killer. Building the data centers is also seemingly intractable since every project is delayed.

[-]

NUTTA_BUSTAH@reddit

So will they eventually pull a Broadcom and kick out 99% of their customers for the few big fish that has the cash for that?

[-]

AdmiralAdama99@reddit

It's also the part of enshittification where they have enough customers so can stop treating them so well. Moving from early to mid phase enshittification i guess.

[-]

ZarrenR@reddit

I’ve been telling people AI is basically a drug and OpenAI, Anthropic, etc are just dealers.

[-]

revrenlove@reddit

First one's free

[-]

Abject_Parsley_4525@reddit

My recommended approach internally has always been 2). Watching leaders of other org units scramble because we are starting to cancel and pull back on some of these tools is hilarious to me.

[-]

forbiddenknowledg3@reddit

The issue with 2) is using fewer tokens or a worse model ends up not being worth the effort. Reverting to claude sonnet for example is just worse than manual coding for most of my tasks.

[-]

BeABetterHumanBeing@reddit

I'm also an advocate for #2, but maybe for different reasons: I hate asking the random number generator to please pick the number that's in my head. Putting more effort into constraining the agents so that they do what I want with fewer tries makes my life easier.

[-]

forbiddenknowledg3@reddit

There is a 3rd option, but no one will do it. In the third option, you buy everyone workstations that can run open source models and have people spin up and maintain their own instances.

Honestly this makes the most sense. The entire move to cloud is a scam IMO. If companies self host we wouldn't have all these reliability and cost issues. And more broadly, I think society would be better with a more P2P internet where these big companies don't own all the traffic and data.

[-]

slpgh@reddit

At some point they’ll have to train people to use the correct model for the job - complex vs non complex. But of course they won’t. It’ll be tragedy of the commons with some people using the premium for everything

[-]

JuanAr10@reddit

I just got hired. Senior 13 YOE. Part of the interview questions were “how do you use AI” and “how would you deal with a low token situation?”

My answers were in the line of “I use AI as a tool not as an oracle” and “I’d optimize it by using dumber models for cheaper stuff” - they told me later they were quite happy with my approach (we’ll see once I start my position).

My take is these guys are betting for (2) and eventually (3), which seems like a conservative and accurate approach.

[-]

GlobalCurry@reddit

Caveman and dumber models

[-]

Basic-Lobster3603@reddit

I wish I could take this approach I have been told to not open the code base at all anymore. Any questions I have about the codebase no matter how small I should really challenge the llm to provide me the answers to. Need to review a piece of code ask claude, need to write a feature span a full multi agent review/implementation loop. Opus 4.7 is amazing but wait don't use Opus for the code writing part because it cost to much. Like I'm spending more time managing llm then actually providing value at this point.

[-]

Korzag@reddit

Seems like that company is reading the tea leaves well. My company just went full speed ahead on AI (were not a tech company) and im currently popping my popcorn for when the company puts the brakes on it after seeing the AI bill because I've been explicitly told to start using it as much as possible.

[-]

2thick2fly@reddit

Wow that's insightful!

[-]

raddiwallah@reddit

We have unlimited tokens (you might guess the company) and have folks spending upwards of 10,000 USD a month on LLM usage. Its insane. That’s literally the salary of a Junior Engineer.

[-]

joshocar@reddit

The key question is do they generate the output to justify the cost? I honestly don't know and I'm not sure how you would measure that anyway.

[-]

ecethrowaway01@reddit

There's not an aligned definition, but I know people reviewing 150+ susbtantial PRs/wk and think they can only review with heavy LLM assistance.

It's not a perfect system but leadership clearly thinks it's worthwhile. I'm somewhat concerned things will slip the gaps but you have to work off people's expectations

[-]

guareber@reddit

As someone who reviews substantial PRs per week... yeah no way I could do 150+, with or without LLM assistance.

[-]

Colt2205@reddit

There likely isn't a good way to measure it. It's the problem of pressure from above pushing people to work down below and that work has to be defined by expectations, and if those expectations are not met then performance review suffers. If someone meets expectations with AI usage but had to work from 8 am to 8-9 pm to do the tasks, that should be a red flag, let alone people suffering mental burnout.

[-]

raddiwallah@reddit

That’s not being measured. Just the inputs which are primed for gaming the metric.

[-]

Teh_Original@reddit

That's the salary of a mid-level to senior if you aren't on the coasts.

[-]

Hudell@reddit

that's beyond the salary of a staff engineer if you live in south america.

[-]

NotRote@reddit

Depends what kind of company.

[-]

thekwoka@reddit

$10k/month for a junior?

[-]

thephotoman@reddit

In some places, yes. If you're up in the Northeast or around the Bay Area, it's a reasonable starting salary.

Remember: some places have high costs of living.

[-]

yankjenets@reddit

Why is that insane? What if they are gaining as much / more value than an additional junior engineer?

[-]

raddiwallah@reddit

There’s no measure of output.

[-]

yankjenets@reddit

Then how do you know how much a junior engineer is worth?

[-]

JollyJoker3@reddit

Our company has given us 300 premium requests of Github Copilot a month for probably less money than coffee for the office. Now that's insane.

[-]

ADDSquirell69@reddit

How much would a large Fortune 500 technology company be paying for unlimited use?

[-]

raddiwallah@reddit

Our org wide usage is currently 5-6M in this month already.

[-]

sarhoshamiral@reddit

It is more like half the cost of a junior engineer (salary is just one part of the cost).

And if that senior engineer is now producing more work. It may be a good trade off. The teams are getting smaller for sure and I do see the productivity gains by using models.

Most likely productivity will increase further as we start caring less about code quality but more about test quality. Models work best when it has access to a verification tool. Ultimately for most apps important thing is input and output, speed and accuracy not code quality.

[-]

Crafty_Independence@reddit

In a lot of orgs where development supports the business but isn't, that's engineer or senior level salary

[-]

Smallpaul@reddit

I have two questions:

Why would you need to run the open source models locally rather than in the cloud?
Are the open source models actually good enough yet? Which ones are?

[-]

brewfox@reddit

1) because it’s free (once the hardware is paid for), cloud compute has costs.

[-]

Smallpaul@reddit

It’s never free because the hardware depreciates and needs to be replaced. Also because there is an opportunity cost in spending money earlier rather than later.

[-]

joshocar@reddit

The cost of hardware isn't the big risk. It's the cost of training and support as well as the time it takes to get everyone setup and everything in place. Some people in your org are just not going to be able to do it without a lot of help - think HR, sales, etc. Then there is the risk that a frontier model will make a huge leap and you are stuck on the last generation tech while your competitors leap frog you with the new models. Also, the AWS/GCP options are stupidly expensive from what I hear.

[-]

Smallpaul@reddit

AWS offers frontier models at the same price as the frontier vendors and open source at a very competitive cost.

Qwen3 Coder 480B A35B $0.45 $1.80

They tend to lag the state of the art in models though. Qwen is at 3.6.

I would be shocked if Amazon ever raises the price on that model, because I don’t think they are subsidizing it right now.

[-]

Sneerz@reddit

Qwen3 Coder 480B A35B $0.45 $1.80

No one used to using Opus 4.7 for (assuming they are using it for appropriate tasks) will be happy with that as a main LLM. Better solution is model routing based on task.

[-]

Sneerz@reddit

If you compare cloud compute costs compared to direct API access, it's generally cheaper, particularly with quantization. TurboQuant (by Google) is very fast, efficient, and does not degrade models nearly as bas as say GGUF (llama.cpp) quants, imatrix, or exllama3.

If I were in an exec position, I would be looking at providers on OpenRouter rather than relying soely on OpenAI and Anthropic.

[-]

Imaginary-Jaguar662@reddit

It is not free, not in commercial context.

Someone has to make the business case and approve purchase.

Someone has to set up the machinery.

Someone has to track each unit and their maintenance.

Someone has to maintain security documentation for audits.

Someone has to take care of replacing the units.

Someone has to manage the access controls.

Suddenly having a line item embedded in your AWS/Azure/GCP instead starts to look very attaractive

[-]

Sneerz@reddit

Gemma 4 31B-it is not bad at some code tasks and could easily be hosted company wide at a fraction at a cost with an inference engine like vLLM. Though, I would not trust it to refactor my entire codebase so I set up my OpenCode with omo and optimize model routing based on the cost. It's up for the company though to manage the infra and many just want a plug and plan SaaS solution, so token limits are gonna be the new norm. Also tracking who is using what models to do what task. I know people use Opus 4.7 to summarize and write "better" emails. It's gotten out of control, and the companies can't have their cake and eat it too. There has to be a compromise somewhere down the line.

[-]

joshocar@reddit

I have not run them myself, but multiple colleagues of mine are and from what they have told me they are good, maybe 6-13 months behind the frontier models. There are a few open source agent repos also that they use.

You need a video card with enough memory to hold the model, so basically a rtx 5090 ($3k-$4k at the moment). People realized that the RAM on mac minis is unified and could be used to run models, but Apple has started removing the 256, 128, and 64Gb mac minis from their build options.

[-]

Smallpaul@reddit

But most people on the frontier models agreed that they went through a noticeable change in agentic autonomy less than 6 months ago. So six months behind is actually quite significant in terms of usefulness. For a lot of people that was the time they transition from toys to autonomous helpers. The current frenzy is driven by that step change.

It’s has are to know what will happen next. If the frontier models could achieve another step change of that magnitude, it would be astonishing. But it might be valuable enough to pay for. At least for those in competitive industries.

[-]

tenthousandants44@reddit

Cool. How much are you willing to pay?

[-]

Smallpaul@reddit

If the next generation had as much of a development velocity improvement as the last, my employers would happily pay. Delivering an important feature this year is approximately double the value of delivering it next year or two years from now when our competitors have made it commonplace. I understand that there are huge swathes of the industry where this is not true.

[-]

Possible-Pirate9097@reddit

Money. Also incentives working from the office due to power costs so companies will love it.
Yes. gpt-oss-120b, qwen3-coder-next and qwen3.6-27b are all good enough for subagents and run on 128GB RAM. Kimi-K2.6, GLM-5.1 and the latest Xiaomi one are as good as Sonnet.

[-]

open-mind-001@reddit

My company has built apps that are fully LLM driven. Run a skill and it will pull out 1000 pages using mcp, parse, generate dashboards. Again LLM inside these dash.

Basically you run it once and sip coffee for next 10 minutes. I wonder what will happen to all of this once we start paying.

[-]

vexstream@reddit

Dashboard generation seems to be a popular utility for C-levels. Fuckin love dashboards, I guess.

Nevermind that almost all of that dashboard generation is deterministic and you could just change the skill to include a script to generate 99% of it...

[-]

natty@reddit

There is 4. Buy shares data centres with even larger language models so it can be queued and used by many at once with higher capabilities.

[-]

puglife420blazeit@reddit

This is where we’re going to see the Chinese models gaining real traction. Everyone has warned about this. They’re not frontier, but for most use cases frontier isn’t needed. I get by on Opus 4.6 and Codex 5.4 and kimi k2.6 is just about there. I have to work with it a bit more but if Opus 4.6 or Codex 5.4 were suddenly unavailable, these alternatives are going to get major consideration. If they get adoption outside of individual engineers, and within engineering organizations, it’s going to light a fire.

[-]

porest@reddit

What is your opinion on the other Chinese models? You only mentioned Kimi.

[-]

subma-fuckin-rine@reddit

i wonder if they will ever stop offering the older models to force people onto the latest.

they have a sort of a self competing business model currently where you can potentially save $$ by using older models which cuts into their frontier sales

[-]

fsk@reddit

This is why Anthropic/OpenAI are doomed businesses. In order to justify the investor money spent, they need to turn a big profit, which means they have to jack up prices. They don't have customer lock-in. If they jack up prices, people will switch to cheaper good enough models. The free open source models will catch up to the paid ones eventually.

[-]

Ph3onixDown@reddit

I feel like the most likely scenario is definitely just reducing staff and limiting tokens. “We need fewer people because AI. Wait. AI costs the same as a junior/mid level dev. Use less AI, no we won’t be hiring”

I would love to see companies go with option 3, because a workstation beefy enough to run a decent local model for coding is still probably cheaper than all the OpenAI/Anthropic invoices

[-]

Annual_Negotiation44@reddit

I feel like companies (from an equity market perspective) would ironically get severely punished if it was shown that they’re taking this approach….”oh my god, they’re so behind the curve on AI adoption”

[-]

Ph3onixDown@reddit

“How will this look next quarter” is definitely impacting a lot of decisions

[-]

Unhappy-Ladder-4594@reddit

That is exactly how it would work at the moment, until the hype cycle changes which it will eventually.

[-]

nyanyabeans@reddit

Why do you think companies won’t try 3, because of the compute cost power? My company is extremely loosely discussing this.

[-]

bluetrust@reddit

This is interesting. I just had a chat with google's search ai and asked it to find pricing and what it would cost to run deepseek v4 pro. Apparently you need an 8x H200 node which can be rent for about $250,000 a year.

Estimates are that can support somewhere between 10-100 devs without noticeable latency. The low side is for ai crackheads who are constantly running a dozen sessions in parallel constantly. The high side is for normal bursty users who are aware it's a shared resource.

So the math per developer works out to about $200-2000/month plus devops costs administering it.

If the model is actually as capable as something like Claude Opus 4.7 -- well, that's what I don't know. But I could see companies doing the math and saying fuck it, let's get a known stable cost locked in for a year.

[-]

joshocar@reddit

It's the cost of compute and the cost associated with getting things up and running and maintaining things. I would compare it to running a server vs a cloud server, there are costs besides hardware associated with running your own server.

[-]

StatusAnxiety6@reddit

Or self host

[-]

severoon@reddit

Your #3 suggestion makes no sense. The path there is to set up a centralized service everyone can use with unlimited token budget, not trying to have devs maintain their own.

If everyone has MacBook M5s with 64GB unified memory IT could push local models into everyone's machines that are tuned for that hardware, those could handle light work maybe, but then you need them to also handle orchestration so requests are dispatched properly and context is always handed off to the server model … or perhaps the server model could spawn local sub agents when needed.

Right now this isn't really feasible for all but the biggest orgs.

[-]

xt1nct@reddit

It’s the same cycle of enshitification. First get clients, like us devs. Then focus on business clients. Then start turning the service to shit to try to make money. It’s a tale as old as time in software world.

[-]

Pyro919@reddit

Most our devs are using MacBooks with 32-48gb of unified ram anyways, which is more than capable of running qwen locally. Option 3 would work just fine but is hard to manage at scale.

Just last week redhat was pushing ai sovereignty to help reign in token costs and pushing that ai sovereignty is the only way token economics are controllable or scalable long term. It’ll be interesting to see how it all shakes out long term.

[-]

geft@reddit

My Android Studio is killing my 32GB Mac when I run multiple agents with multiple projects open. No way I can run decent open models without at least 64GB.

[-]

Possible-Pirate9097@reddit

Yeah you might have a bad time with those specs lol

Time to think about upgrading everyone to 128GB M5 Max's. Or self-host the open source ones yourselves.

[-]

Pyro919@reddit

128gb would be nice, but it’s overkill for some usecases.

I’ve already been experimenting and running with it on a MacBook Pro m4 pro with 48gb of unified ram and doing just fine (I ran out of disk space before ram or compute resources). I work in the infrastructure automation space and have customers with high security environments asking how they use ai on-prem safely to help automate infrastructure so I decided in my spare time to see what I could do with self hosted models and it’s been working just fine so far.

[-]

Possible-Pirate9097@reddit

Which models because the only one I can think of which works is qwen3.6-35b-a3b. Maybe the smaller Nemotron or latest Gemma(s)?

Do you use the smaller models for everything?

[-]

Few-Philosopher-2677@reddit

Heh in a recent leadership meeting at my company this was being discussed. A director said that at another company he heard the cost of AI per person is around 7000 USD which he agreed is a lot. Leadership here is AI pilled just like any other company but things dont seem to be as crazy as putting people on PIPs for AI usage lol. We are still largely on Cursor's legacy enterprise plan which is billed on the number of requests and not tokens. Everybody gets 1000 requests a month and after that its unlimited auto requests. No on-demand usage enabled.

I have heard a few people have been testing Claude and Codex as well but as far as AI adoption goes we are not exactly the fastest and thats actually a good thing. We also use Google Workspace which gives us effectively unlimited Gemini Pro and I dont think there are any concerns with that.

[-]

Foreign_Addition2844@reddit

7k/mo??

[-]

Few-Philosopher-2677@reddit

That's what I heard.

[-]

Regalme@reddit

They aren’t a whole other level, they are less. You have much bigger problems at your leg than copilot. Good luck.

[-]

forbiddenknowledg3@reddit

this is the worst it will ever be!

Tbh this is a great litmus test for who knows WTF they're saying and who doesn't.

[-]

Foreign_Addition2844@reddit

Lol. Lmao even.

[-]

Relevant_Pause_7593@reddit

As far as I know, GitHub copilot is aligning with prices of the downstream models, but is still cheaper - especially if your company has an azure discount and/or you use the auto model (10% discount).

[-]

Necessary-Focus-9700@reddit

It's a shitshow. And it's only going to get worse.

OpenAI, anthropic... they have no moat that I can see. The chinese or any source can provide LLMs at commodity pricing. Or you can host locally. Sure you won't get the bleeding edge. But few need the bleeding edge.

I'm an older dev, and I've been through several major disruptions. It gets ridiculous. But provided software needs to work and ppl are willing to pay for it eventually the ridiculous calms down the sanity somehow prevails. That can take years. And it's damn painful to be a skilled engineer in the mix.

This disruption is much, much greater. And for me (based in silicon valley) the bs was growing before the AI boom hit.

Within the last 20 years I've worked with maybe 2 companies (out of >10) who were actually building software. The business model for most has been all about optics, appearing to have a great team, the illusion that the company has discovered the holy grail. And the ones not producing code?? they've done OK.

It's become cosplay software development. Pyramid schemes, essentially.

Now with AI.... shitshow multiplied by bloodbath.

What to do if you are a dev who likes to build quality stuff that works? The only thing I can think of is going indie, find clients with real problems and add value for them. Even if it's basic boring stuff there will always be some technical challenge where the advantage of actually knowing stuff gives an edge.

And no matter how difficult or challenging it is to run a small business and deal with clients.... it is much much easier than dealing with a middle manager you wants to "help" you approve a steaming pile of shit because they won't be bothered when you have to face the consequences.

I think where I've landed is like one of those doomsday "preppers", living off grid because the cities will implode. Sounds hyperbolic and strange but not wrong. I read the news and the posts on linkedIn everyday and this is how those dots are connected.

We live in interesting times.

[-]

Muhznit@reddit

Now with AI.... it's shitshow multiplied by bloodbath.

This is a beautiful expression of how I see the situation.

Like I can legitimately visualize a scatter plot where how "how much effort has been spent beautifying a turd" is on the x-axis and "how many heads will roll when people realize it's shit" on the y-axis, and my own company's forays into AI feel like they're in that upper-right corner.

[-]

thephotoman@reddit

If you want to make stuff that works, you can go indie, or you can go industrial.

The coder writing software for AEDs is likely industrial. His code is a component of the product--not the whole, but a significant part. The guy slinging Java in a bank is industrial. The lines between his code and the product are blurry. The guy writing software for the infotainment systems on cars is making a product.

Selling software is a terrible business--that's why the gaming world sucks so much. But using software to affect real world outcomes is a good business.

Social media didn't make the world better. Platform centralization was a mistake--but one we made because spinning up your own forum with blackjack and hookers does cost money. It costs time. It's another chore you have to tend to, because you've got to keep the software up to date. When the purpose fizzled, you took it down because keeping it up was more work than it was worth.

Reddit is easy: one account, one site, one Spez. You don't have to pay the bills. You don't have to worry about software updates and server reboots to apply patches that actually require a reboot. You don't have to worry about the hug of death.

I'm not a prepper. But also, I've been engineering this place to take a real hit to services since the 2021 winter storm and power grid collapse. I don't want to be stuck in that.

[-]

sleeping-in-crypto@reddit

Are you me

[-]

Necessary-Focus-9700@reddit

🙂

[-]

Abject_Parsley_4525@reddit

We're actually cancelling co-pilot at our org for the same reasons. We're going to only use claude going forward and there is a push towards using some local models for simpler requests.

[-]

F2EB@reddit

CC is more expensive now

[-]

Abject_Parsley_4525@reddit

True, and previously we had budget for both. Now that it is more expensive it is under more scrutiny so we cancelled co-pilot for claude.

[-]

F2EB@reddit

Part of mag 7, we have cancelled CC and getting copilot, decision was made due to cost and not what is best for work

No shit, first ask us to use agents to write code as that is what the future is then switch to inferior tool, all other agents internally are also switching to copilot

Can see in next few quarters we go for unlimited to capped limit on users , firing 500 million of worth salary people and burning that much in a month These c suites huh

[-]

GlobalCurry@reddit

At least you can use Claude from copilot I guess.

[-]

chickadee-guy@reddit (OP)

Interesting to hear. Were you using M365 as well in your stack? Curious as to if MSFT fought you guys dropping it.

[-]

Abject_Parsley_4525@reddit

There's some M365 but not much maybe 10% of the stack. We already told the rep we'd be cancelling and they did protest and offer a discount but we said no.

[-]

AmoebaDue6638@reddit

The 15x increase is going to be a brutal wake-up call. Wild that so many companies went all-in on mandating usage without any cost governance, and now they're about to find out what uncapped token billing actually looks like at scale.

[-]

Impressive-Skin9850@reddit

I am in the same boat. Fortune org, forcing all to use AI daily, leaderboard and all that. I am wondering the same thing. I don’t see how cost doesn’t EXPLODE over this. Like what will they say then?

It’s a good question and orgs will have to recon with this price change.

[-]

Sneerz@reddit

My company has something similar and it bit them in the foot. Instead of the "leaderboard" being "who uses AI the most" - they forgot that using AI is not the same as efficiency.

[-]

chickadee-guy@reddit (OP)

Our CTO made a linkedIn post the other day frantically talking about how we definitely arent tokenmaxxing and all AI spend can be directly tied to business outcomes. Almost seemed a little too on the nose.

[-]

Anttu@reddit

I'm also in a Fortune org and I’m tokenmaxxing. I know I could be more efficient with prompts but we have unlimited access and I'm so fed up with the AI this, AI that.. Our VP sent out an email praising AI tool adoption in our org and I got a call out for being #1 power user and #2 multi-tool user (complimentary). That email was written with AI and so long that I missed that it included my name, my colleague told me. I feel like everyone is insane or I'm going crazy.

[-]

chickadee-guy@reddit (OP)

I was in a similar spot and had to tone it down cuz I was being asked to speak at AI events and there were rumors of me being pulled off my core responsibilities onto some "AI tiger team"

[-]

lppedd@reddit

I think they will start attaching your AI spending to your salary. Then you know what happens when they plan layoffs.

[-]

Obsidian743@reddit

Companies are going to start reducing headcount to to pay so be ready.

The other thing is that devs won't be able to tokenmaxx anymore with stupid side projects just to get to the top of the leader board.

Which means you're now going to have to learn how to manage context, memory, skills, and instructions more.

[-]

powercrazy76@reddit

I see this being the inevitable future. The companies heavily pushing AI products are the same companies who have yet to justify their spending on data centers to support said AI. They are purposely discounting the cost to companies like yours to make companies go 'all-in' because they know that is what it'll take at a minimum (even with raised costs) to be profitable.

The real question is, by the time the dust settles and AI resets to a realistic cost model, will it actually be cheaper than paying devs/leads a liveable wage? Or will enough of the industry have left (greener pastures, lack of generational training, etc.) that it won't matter anyway?

[-]

Yukeba@reddit

That is true question. Now no employee will ever again be loyal or view any FAANG as one of role models.

[-]

chickadee-guy@reddit (OP)

Based on what im seeing in these projections a Copilot seat is more expensive than a midlevel dev.

[-]

InnateAdept@reddit

But if your seniors are going at least 2x faster, isn’t it still a savings for the company though?

[-]

chickadee-guy@reddit (OP)

They arent going 2x faster. Its been a net productivity loss due to the mountains of slop being pushed to "review" by offshore

[-]

InnateAdept@reddit

Ah that makes sense. We are just starting to adopt copilot, so the gains are immediate and obvious. But if the culture ever shifts towards “ship more, review less”, then at some point it inverts and you are just lost in the slop

[-]

Optimal-Savings-4505@reddit

The shop I work at is not quite that gung-ho. However, it gave me a seat at their github copilot plan, so I spent tokens shaving down a PR to half the amount of lines though. Does that count?

[-]

Fruloops@reddit

All employees are mandated to use it daily, if you dont, you are put on a PIP.

This is utterly retarded smh

[-]

Ysilla@reddit

And a usage leaderboard. It's like they took the list of shit not to do and decided to implement all the things.

[-]

nukem996@reddit

Many companies that already stack ranked started to put up token leaderboards like op. Engineers absolutely have been gaming it. Where I'm at its made engineers use LLMs for everything just to burn tokens. I used Claude to fix a spelling mistake and push a diff I easily could have done just to burn tokens.

Management needs to stop measuring LLM usage and start focusing on what the actual results are.

[-]

nonades@reddit

Engineers absolutely have been gaming it

When my org started encouraging LLM usage, I straight up told my VP that if we start enforcing it and tracking token usage, the first thing I'm doing is writing the most bullshit script to burn tokens to ensure it looks like I'm compliant

[-]

Dry_Author8849@reddit

Mmm. Not your problem. If your org can't pay it they will mandate other thing or whatever.

On the other hand, you can allways place a budget and stop the service. I think they will continue selling at small prices as a "token pack". If using expensive models you will maybe accomplish one task or expensive models will not be available.

Anyways, it's the same monopoly game. They create the hype. they let you taste it and then they charge whatever they like. Allways the same. Trying to get control of the market and squeeze all they can.

This time costs are really high, but they also have the problem of expensive hardware with a 3 year expiration date. They need to have those 100% all the time. If the number of users decrease too much they will find that they invested in excess and the market is smaller. But that will never happen, is admitting defeat. They will "revive" cheaper plans as per "popular demand" and because "AI should be for all" and "we are socially responsible company" and blah blah.

And this is not github copilot only. It's what's coming for all.

And also they have succeeded in changing the multiplier and models available whenever they like. That won't last long. They will need to hold prices and models for longer periods of time and advice customers at least a month ahead.

Don't worry, you are in a fortune 50. They have money. And they allways can sell insurance to them.

Cheers!

[-]

_mkd_@reddit

Mmm. Not your problem. If your org can't pay it they will mandate other thing or whatever.

Our they'll start swinging the axe at the personnel budget.

[-]

Dry_Author8849@reddit

Yeah, but aren't they doing it now? I mean, they are doing it before confirming any benefit gain from AI.

When costs for AI probe to be expensive and the benefits dubious, perhaps not.

But I've seen wealthy organizations suspend free coffee for employees because it was expensive, so... nothing surprising.

Don't get me wrong, all this is truly sad.

Cheers!

[-]

thephotoman@reddit

Man, there are plenty of days when I don't need AI. Particularly that Tuesday the sprint ends, when I'm usually more concerned with actually doing the live demo to the team and need to rehearse it myself because I, a human, have to do this job. Sure, it's the sum of a bunch of smaller, less formal demos to the team, but this is for a wider crowd of stakeholders.

I am also concerned at the suggestion that we send emails or need meetings summarized. What is this, the 20th Century, when we couldn't just record the meeting and save it off somewhere for reference?

[-]

AStanfordRunner@reddit

I think the copilot price increase essentially eliminated any Anthropic model or (or other frontier model) from being economically viable for small-mid companies. After spamming opus for 4 months and seeing a now 27x, I’ve been playing around with the future 1x models which are so garbage it feels like I will start leaning away from AI for anything that isn’t braindead tasks

Or maybe our company lays off people and gives the rest higher token budgets, who knows lol

[-]

IceMichaelStorm@reddit

Wait, I’m confused. Is Github Copilot not distinct from Anthropic/Claude Opus? Or what am I confusing here? I only use Opus

[-]

AStanfordRunner@reddit

Copilot is the harness, which is switching from request-based to token-based billing. Before you had a bunch of different models available to use with the harness - the primary reason people used it is because 1 request to opus had a 3x multiplier (sonnet was 1x) and you get 300 requests a month standard - so you could get 15 dollars worth of output from a single request and get a ton of value.

Copilots entire selling point was essentially subsidized cost - now it is changing to token-based AND adding a 27x token cost to Opus (9x cost to sonnet)

[-]

IceMichaelStorm@reddit

thanks! makes sense now!

[-]

sassyhusky@reddit

Codex has been just as good as opus and it’s 1x. I’ve been using only codex on xhigh for the past 3 months.

[-]

chickadee-guy@reddit (OP)

Yeah the 1x models are absolute dog water. Its faster and better to just code by hand.

[-]

polypolip@reddit

The c-suits swallowed the bait, the hook, and the sinker. Now they are getting reeled.

[-]

stikves@reddit

It was never sustainable, and they were burning through investor cash... or in case of Github... sweet Microsoft money.

The real cost is enormous. The users will either have to learn to get by running models locally:

https://www.reddit.com/r/LocalLLaMA/

Or be ready to pay hundreds per week per employee to large cloud providers

(The local models of course have limits. Even if you build a $10k nvidia or mac studio system, you can only have around 200k tokens max with good coding models. Qwen3 0.6b won't do it. And Qwen Coder 30B is not "cheap" on the hardware)

[-]

Fidodo@reddit

I'd be looking for new jobs on the side.

[-]

_mkd_@reddit

Have the AI look for jobs for you so you don't fall off the leader board.

[-]

Windyvale@reddit

Another day, another story of AI psychosis in leadership.

[-]

sassyhusky@reddit

Or another Guerilla marketing campaign from Anthropic.

[-]

Ok-Shower6174@reddit

We went from 'AI will replace engineers because it's cheaper' to 'We have to fire engineers because we can't afford the AI bill' in record time. Peak corporate efficiency in 2026.

[-]

Annual_Negotiation44@reddit

I feel like that type of approach would hurt these company’s stock prices….theyre starting to get punished when their AI capex exceeds market expectations (look what happened to Meta after their most recent layoff announcement/earnings)

[-]

headinthesky@reddit

Don't these people understand Goodhearts Law

[-]

nonades@reddit

Damn. Crazy. It's almost like the tech is an unsustainable bubble. Who could have seen that coming

[-]

Annual_Negotiation44@reddit

Dan Ives still says we’re only in the second inning of AI though…

[-]

PopulationLevel@reddit

The leadership at my current company has been more measured in adoption and also looked forward to the possibility of price increases. If you look at the financials of the big AI companies, it’s clear that current token prices are unsustainable, funded by the investors of those companies.

Their plan is to have a variety of models available for use - some closed source, some self-hosted open source, and maybe even some local.

[-]

gburdell@reddit

This is why I went out and requested a ridiculous budget early when VPs were mashing the approve button

[-]

Perfect-Campaign9551@reddit

Your story sounds fake. No

[-]

Necessary-Focus-9700@reddit

If your job will be impossible otherwise or you'll be term'd then "force" is the correct word. Even if they say optional.

Do not underestimate the extremes of surreal ridiculous that happen everyday at many (most?) companies.

An individual manager can be inexperienced, subpar and/or stressed and so make poor decisions. You put just a few of those guys in the same room where they reinforce each other without the backpressure of dissenting peers...

A zoo. With the animal cages open. And cocaine in their food.

[-]

sleeping-in-crypto@reddit

Stories literally came out of Amazon a few days ago about forced usage leading to leaderboard gaming.

[-]

Rakheo@reddit

Sure, no one puts a gun to their head, but when it is tied to performance metrics, you are forced indeed.

[-]

nonades@reddit

One of my best friends works at Cox Automotive as a PM and they're absolutely mandated to use AI and everyone's token usage is tracked.

It's absolutely a thing

[-]

chickadee-guy@reddit (OP)

I wish it was fake. I work at a fortune 50 insurance company

[-]

itix@reddit

This is good. Github Copilot is garbage and is at best a mediocre code completion tool. See this as an opportunity to get a real AI.

If I was in your position, I would let Copilot to review and rewrite PRs. If something breaks, you can put the blame on the AI.

And start polishing your CV. The AI is a great tool but the way your company is using it is only a death spiral. Get out of there.

[-]

thedancingpanda@reddit

You don't know what you're talking about. You can use many different models (including Claude Opus) with Github Copilot.

[-]

itix@reddit

I know and i use the best models in Copilot, but harnessing is not at the same level.

[-]

thedancingpanda@reddit

You can use a different harness as well. We use opencode.

[-]

itix@reddit

That is definitely better than using Copulot.

[-]

Strus@reddit

Harness is as equally important as the model. Copilot has been left behind by CC/Codex/Cursor/OpenCode/Pi/Droid/Amp long time ago. It’s one of the worst tools for agentic coding.

[-]

thedancingpanda@reddit

You can literally use all of those with Github Copilot.

[-]

Strus@reddit

You are confusing models with harnesses. You cannot use Claude Code with GitHub Copilot. Just like you cannot use Codex, Cursor or Amp. Maybe you could use Copilot sub in Pi or OpenCode if that’s not against the TOC.

[-]

thedancingpanda@reddit

We literally do this.

[-]

pretzelfisch@reddit

Have you used copilot lately? Its fine. No one should use ai code completion thats just a mess.

[-]

itix@reddit

I have Copilot Pro. It used to be quite good but now it is left behind. Even with newer models.

[-]

AStanfordRunner@reddit

What does get a real AI mean? Copilot uses other models and is functionally the same (but marginally worse) as Claude code, open code, codex?

[-]

itix@reddit

Claude Code or Codex. It is not just about the model, it is also about harnessing.

[-]

chickadee-guy@reddit (OP)

Those arent AI though. Its just an app that calls an LLM on a loop. Might wanna get your concepts straightened out

[-]

itix@reddit

Looks like yoy didnt understand. Copilot is garbage no matter what model you are using.

[-]

Smallpaul@reddit

You are the one who doesn’t understand. And I say this as someone whose job is to build a harness.

It’s not just calling an LLM in a loop. It’s also prompting, skills and tools. And a foundation model company can and would train their LLM to work really well with their own harness. They also have the researchers who know how to optimize the harness for the LLM.

It’s like saying most websites are just CRUD updates on a relational database. Well sure, that’s basically true. But some do it well and some do it poorly. Microsoft’s harnesses have always been crap.

[-]

chickadee-guy@reddit (OP)

Thats a whole lot of slop to describe an app that calls an LLM on a loop

[-]

Willblinkformoney@reddit

You can just use opencode with your copilot license. It's pretty good

[-]

abrandis@reddit

Exactly, fight fire with fire, stop trying to work by hand in a world where everything is AI slop , you need to play the game and not try to be the good cop, becuase executives at your company don't give a fck so why should you

[-]

levraiponce@reddit

I'm way outside of this AI adoption, only plays with Claude for side projects.

Are you seeing actual value being generated at a commensurate rate?

[-]

chickadee-guy@reddit (OP)

Nope

[-]

RandomPantsAppear@reddit

Honestly? Yes. I want to say no, but the answer is yes.

But this is because I’m absolutely neurotic about what it writes. It’s not uncommon to redo something from scratch multiple times. Using it well, it’s closer to programming using the English language. “I want a model with ID(increment), name, date. When X is updated (post save signal) a task is generated that does Y with Z”

The only time I YOLO is on writing documentation. AI is fine at reading my functions, getting the args and return types, and describing what it does. It writes good readme also.

[-]

tenthousandants44@reddit

You are forgetting the oppurtunity cost of just doing it yourself.

[-]

RandomPantsAppear@reddit

No, I see it which is why I’m not hollering from rooftops about 10x efficiency.

[-]

ElGuaco@reddit

The Reckoning is coming. Very soon.

[-]

invest2018@reddit

As a matter of game theory, these LLM companies are absolutely incentivized to raise prices until the total cost/benefit of the system with AI is barely superior for buyers than the system without AI.

We should expect AI companies to test the absolute limits of pricing until the overall benefit is barely recognizable. Given how AI-pilled some executives seem to be, some may even elect to take a loss just to be able to use AI.

[-]

boost2525@reddit

We're watching the bubble burst in real time folks.

Our leadership already switched from "you are required to use copilot and we're tracking you on this dashboard" to "we're using this dashboard to make sure you don't use copilot too much".

It's absolutely comical. What a shit show

[-]

wxtrails@reddit

Yup. There's no public dashboard, but we got the first email to that effect in January, and the second came Wednesday. The latter was a name-and-shame for the top abusers, just weeks after proudly announcing a contract with a new AI company and rolling out the tool to everyone. We burned through 40% of our yearly budget in those few weeks. Heads are spinning.

[-]

The-Fox-Says@reddit

My company tried doing an AI hackathon and burned through a month’s worth of credits in inder an hour lmao. We’re quickly moving into the Find Out phase as well

[-]

aeroverra@reddit

I’m surprised that’s even a thing. I made a chrome plugin mod for my friends game with codex and got ban from OpenAI for cyber abuse lmao

[-]

dagamer34@reddit

All of this could have been easily predicted, it so clearly shows that C suite people are full of group think.

[-]

GoodishCoder@reddit

I don't think the bubble is bursting, there are still huge AI investments happening. AI companies are just switching to a more sustainable pricing model.

[-]

chickadee-guy@reddit (OP)

Even with these price increases there is still no clear path to profitability. It would need to be a 100-200x price increase to get there based on the current losses theyve incurred

[-]

geft@reddit

Token based pricing is already profitable for them. The losses come from subscription based pricing.

[-]

GoodishCoder@reddit

You're probably looking at it too linear. They're all essentially startups which always start out unprofitable. Eventually the product will level off and costs will decrease making the revenue increase a key part of their path to profitability.

Currently they're throwing more resources at the problem which is expensive but faster than building the product to be more efficient. Eventually that'll swing back in the other direction.

They're currently having to train new models constantly to compete which will also level off at some point.

They're currently building new AI products constantly to see what will stick. Eventually they'll pick the winners and layoff the teams working on the losers. After that they'll adjust pricing for their winners.

[-]

chickadee-guy@reddit (OP)

Im not following your point at all. Training new models will never go away. Theres a reason people stopped using Sonnet. Trainkng is a non negotiable cost that has only ever gone up and wont ever go away.

Theres no evidence anywhere showing "itll swing back in the other direction".

[-]

GoodishCoder@reddit

Training is currently a non negotiable cost. Down the road training can absolutely be limited. Eventually the miniscule gains from training a new model won't be worth the cost.

Theres no evidence anywhere showing "itll swing back in the other direction".

"How things are today are how they will always be". You're thinking in linear terms because you have a specific outcome you want. Businesses as a whole aren't going to go "whelp AI was a failed experiment, let's delete it from existence and stop using it because we can't make it profitable". They're just going to adjust to make it make business sense.

[-]

chickadee-guy@reddit (OP)

Training is currently a non negotiable cost. Down the road training can absolutely be limited. Eventually the miniscule gains from training a new model won't be worth the cost.

This is absolutely untrue. The minute they stop training, Qwen and Deepseek will catch up and everything they built collapses.

[-]

GoodishCoder@reddit

That's not true at all. You're too emotional to have a logical discussion on this topic.

[-]

chickadee-guy@reddit (OP)

A logical discussion would require the person on the other end to actually know what theyre talking about.

You seem to think that OAI and Anthropic can just.... stop training frontier models and things will continue on. Meanwhile "frontier" models from 2 years ago are completely useless for any enterprise work.

What a wildly incorrect and uninformed take. Might wanna revisit the drawing board

[-]

GoodishCoder@reddit

And once again you've reverted to the argument that things will always remain as they are today because you feel that's the quickest way for AI to fail long term.

If Claude 4.6 is working super well for you today, there's no reason to believe you will have to move to a frontier model for the same exact work tomorrow.

Businesses adjust all the time to maximize profitability and have since the dawn of capitalism. For your assertion to work, businesses would have to cease making changes ever again. That's emotion, not logic.

[-]

chickadee-guy@reddit (OP)

Claude 4.6

?????

[-]

GoodishCoder@reddit

😂😂😂 If you can't get any models to do anything in your enterprise, you're either working on something exceptionally challenging or you're horrendous at describing things.

[-]

chickadee-guy@reddit (OP)

If old, outdated models like GPT4o or sonnet can do your job, thats a huge skill issue on your end or a massive lack of scope and responsibilities.

[-]

GoodishCoder@reddit

Oof reading comprehension isn't your strong suit. Nowhere did I say anything about doing the entirety of your job.

Like I said, check back in July and let me know if it's the Armageddon you're hoping for or if it's just minor changes as I said it would be.

[-]

IceMichaelStorm@reddit

Noone here has arguments so far. It’s not straight forward to theorize about this.

E.g. if companies don’t levitate on just investments, they might be able to just sell at face value. If they need to cut costs, then yes, less training needed.

It cannot be zero training, because new technology comes up all the time everywhere which needs to be integrated.

So as for costs, not so clear. Because if operating costs of existing models even without or little training is high enough, it might be hard to give to customers at an affordable price?

It would at equilibrium come down to hardware prices, right?

[-]

GoodishCoder@reddit

I'm not claiming there will be zero training. I'm claiming the need for training will be decreased.

Currently the strategy for companies like OpenAI have been to add more hardware when they hit limitations. I think overtime that mentality will shift when the goal is profitability instead of speed. Eventually they're going to care a lot more about efficiency which will lower hardware and infrastructure costs.

[-]

PigsOnTheWings@reddit

This is incorrect. Training is a non negotiable evergreen cost for model companies that will keep rising over time. The moment they stop training it means we’ve achieved model flattening out, and when that happens all competitors and open source will catch up.

[-]

daguito81@reddit

We’re o viously very close to “AI for Dev” so this seems like make it or break it.

But for example at my company more than 90% of our AI use large is token based because it’s mostly LLM usage in pipelines and projects and such. Not so much GitHub copilot (which we have but nothing close to these companies with leaderboards)

So I think a certain part is bursting , but not “AI” in general. Most of the marketing push incoming for us is to use AI for use cases, processes, etc

[-]

awsaffaswa@reddit

Same thing at my work. Two weeks ago, we moved from codex to Claude, and were told to set whatever budget we want, they aren’t being enforced. Last week, we were told the token budgets are being enforced, and our leaderboard is moving from token usage to a fluency metric.

[-]

Smallpaul@reddit

I don’t see a bubble bursting where I work. Here is how it looks to us, as a product company.

We are pumping out features much faster than before. Some are AI. Some not.
Our customers love our AI features. Also many of the non-AI ones.
Sales are accelerating because of that.

The increase in sales offsets our internal token spend, despite it being eye-watering. So we are going to stay on this loop of increasing sales, increasing token spend, increasing sales, etc. until we overtake other companies in the field.

Our margins on the LLM stuff we actually sell is still reasonable but we do need to keep an eye on it.

[-]

juxtaposz@reddit

🦀🎉🥳🎉🦀🎉🥳🎉🦀

[-]

ButWhatIfPotato@reddit

Protip that still suprises me most adults have not figured out yet: anything involving made up currencies is designed to scam you and bleed you dry.

Has anyone seen their leadership have to reckon with this situation yet?

Regardless of AI, one thing that I noticed while doing this for 16 years now is that there is always a magical money tree ready to be shaken when it comes to paying for the consequences of big corporate pp moves. Paying up the ass for expensive consultants because a large number of employees quit in disgust? Totally worth it because the boss got to gloat in the employee's face when they dared to ask for a raise. Paying 6 figure settlements? That's totally fine because it showed that the boss is not afraid to chase you in the toilet to yell at you why you are not answering your emails at 23:00 on a Saturday. Torpedoing a decades long relationship with a client because you geniuenly thought that with ~~frontpage~~ ~~dreamweaver~~ ~~wordpress~~ ~~no-code~~ AI you will unleash your inner creative demon without being weighted down with those whiny designers, developers and QA testers and their stupid demands to for a fair wage and to be treated like humans? That's just the price of doing business like a proper grindset entrepreneur!

[-]

RandomPantsAppear@reddit

Eventually, someone is going to put tokens onto a realtime bidding/auction model, with dynamic pricing based on demand.

Startups gonna end up working night shifts.

[-]

cockaholic@reddit

It's always night somewhere...

[-]

RandomPantsAppear@reddit

It is, but certain time zones are going to consume more than others, especially from specific AI companies.

[-]

briznady@reddit

Let’s incentivize spending more money! Whoever spends the most money wins!

[-]

pagerussell@reddit

Ed Zitron been banging this drum. AI is heavily subsidized. Heavily.

They want to pull a Facebook or Amazon and pivot to extraction. The problem is those firms could do it because they were able to create network lock in. The big AI companies definitely have not generated lock in and it's not even clear they can.

I am just glad we have a reasonable and competent team in the white house for when this whole house of cards brings down the economy.

/s in case that's necessary

[-]

sleeping-in-crypto@reddit

There’s always an age at which people realize adults not only don’t know what they’re doing, but are often much more stupid and idiotic than children.

Environments like the one we’re in now with AI and the beyond useless government(s) should not leave a single human in doubt that adults have no idea what they’re doing and most of them are absolutely fucking stupid.

[-]

GetmeOutofNowhere@reddit

You have thousand lines pr with emojis?! This sounds too comical to believe. Anyone else have this experience? I’m genuinely curious if companies are going this crazy lol

[-]

chickadee-guy@reddit (OP)

Yup, and if i leave comments they paste them into AI and reply with the AI response copy and pasted

[-]

Strus@reddit

Using Copilot in 2026 is wild.

[-]

gdinProgramator@reddit

Looks like jobs are back on the menu boys

[-]

donjulioanejo@reddit

There was a funny LinkedIn post recently where some company hired a junior to save on AI costs.

[-]

BeABetterHumanBeing@reddit

Yeah, if the jobs coming back are junior, still not so great for us.

But I would be happy to see that for all the people entering the industry.

[-]

410_clientGone@reddit

in the menu for offshore employees

[-]

ADDSquirell69@reddit

A pip? What kind of incompetent leadership does your company have?

[-]

RedFlounder7@reddit

I believe it’s the beginning of the trough of despair. The AI frenzy has been underwritten by free and nearly free tokens. Paying the real price of those tokens is coming and coming fast. It’s one thing when slop is cheap. It’s another when you’re paying a lot of money for it.

[-]

dbenc@reddit

give it a few years and the model-on-chip architectures that give you 15k tokens per second will crater token prices

[-]

Beli_Mawrr@reddit

In a few years we'll get today's models, which CEOs with 2 brain cells bouncing around will think are outdated trash due to the few years they've had to reflect on the current models.

[-]

CorrectPeanut5@reddit

It represents the real costs all this AI is actually generating. And it needs to happen to bring a lot of people back to reality. Not to mention getting MSFT's balance sheet back in order.

Microsoft is allowing enterprise wide pools and this summer enterprise users will get 2x the plan for free for the summer. But I think it's going to his a lot of businesses like a ton of bricks. Especially this fall.

Anyone that's put together a customer facing AI project using something like AWS Bedrock has certainly noticed how quickly it burns money. That's always been way closer to the real costs from the beginning.

Notable outliers will be companies like Royal Bank of Canada (RBC) that's put years of investment into running their own models internally.

Dave Plummer has been questioning 100% cloud AI for a while over these kinds of cost issues. There's an interesting question about pushing AI to edge. So far they can't compete with models like Claude. But there's now enough money involved here that those mini Blackwell machines and 128GB RAM spec Macs start to look interesting candidates to help offset costs.

[-]

RandomPantsAppear@reddit

An interesting dynamic also: AWS bedrock, but running Anthropic models….with startup AWS credits.

Lets you spend without spending for quite some time.

[-]

gravteck@reddit

Your name drop of RBC had me giggle a bit. The way Michael Lewis described RBC culture in Flash Boys, as a conservative culture for relative good behavior in banking, may be alive on the AI side as well.

[-]

Future_Manager3217@reddit

Honestly, the token price increase may end up doing something useful: it makes the hidden review cost harder to ignore.

If leadership only tracks “AI usage” or token volume, they’re measuring the input, not the work. I’d want the dashboard to show accepted PRs after human review, reviewer hours, rework rate, incidents/rollbacks, and how many AI-generated diffs were rejected outright.

A slop PR is not cheap just because the tokens were cheap. It’s only cheap if the total review + fix + ownership cost is lower than a human-written change.

[-]

Visible_Fill_6699@reddit

So the mission, should you choose to accept it, is to survive until July?

[-]

Mundane-Charge-1900@reddit

Oh, yeah. It went from “use as much as you want” to a soft cap to general handwringing about being over budget.

We also have a leaderboard. I try to stay high side of middle of the pack, and make sure I’m having good impact with my token spend.

[-]

chickadee-guy@reddit (OP)

Yup, i learned the hard way if youre too high on the leaderboard you start getting asked to "tell your AI story" to the company and get pulled of your team onto AI shitshow projects

[-]

inthiseeconomy@reddit

this seems like satire

[-]

lppedd@reddit

Why? The billing projections are just incredible for both users and enterprises. People were using 2000$ worth of tokens on a 39$ sub. I imagine companies with tracking systems are encouraging spamming those LLMs.

[-]

chickadee-guy@reddit (OP)

We had a guy spend $20,000 worth of tokens in a month using the new pricing model whereas prior his seat cost $50

[-]

mirageofstars@reddit

Woah, now that is a lot of tokens. How did he hit that number?

[-]

chickadee-guy@reddit (OP)

You can cram a massive amount of instructions into a single prompt and it isnt evaluated on how many tokens were consumed

[-]

chickadee-guy@reddit (OP)

Sadly not. Large insurance company

[-]

HaloNevermore@reddit

Seeing the same. Fortune 50 O&G.

Consultants are going to get us all killed.

[-]

BurberryToothbrush@reddit

I’m not understanding your point - can you clarify what consultants have to do with this topic?

[-]

chickadee-guy@reddit (OP)

Management consultants like Deloitte and Mckinsey are in the ears of every corporations C suite saying that LLMs can replace humans with no issues

[-]

Crafty_Independence@reddit

Deloitte and Gartner have my Fortune 500's ear and every single recommendation in the 6 years I've been here have been awful

[-]

Pumpedandbleeding@reddit

I thought this was my company, but seems it isn’t.

We aren’t as extreme, but they track our usage. Using all your requests and expensive models is rewarded.

[-]

JollyJoker3@reddit

Lol @ rewarding using more expensive models

[-]

inthiseeconomy@reddit

are we cooked

[-]

rocketbunny77@reddit

We are

[-]

liquience@reddit

Yikes… That sounds horrific. Dunno how high up you are but if you’re comfortable approaching the VP or C-suite,

[-]

AStanfordRunner@reddit

Our company is probably less AI adoptive compared to the industry and our AI budget is expected to 6x on June 1st. I can see a full AI adoptive company 15xing

[-]

Smallpaul@reddit

Seems to me that people are going to be very motivated to consider their options to Copilot. I already had a low opinion of it but it seems like its main advantage was the subsidization.

[-]

Prof-Bit-Wrangler@reddit

It’s not. Sadly.

[-]

Oakw00dy@reddit

AI is the tech opioid epidemic. The pill pusher has the mark addicted, now comes the real price. Some will go to rehab, others will OD. Years later, lawyers will get rich.

[-]

Mundane-Charge-1900@reddit

Why do Anthropic and OpenAI have such eye watering potential IPO prices? Why does Allbirds changing from a shoe company to an AI company skyrocket their stock price?

Because the market realizes there are huge profits to be made here if this all pans out.

[-]

norse95@reddit

I was wondering about this since I didn’t get a clear answer from our copilot admin. I’ve been vibe coding different tools just to see what works and what doesn’t and using opus 4.6 with no hesitation. Looks like manual coding is back

[-]

throwaway_0x90@reddit

I think AI flare and topic is only for Wednesdays

[-]

No_Barnacles@reddit

My company had estimated 450k for the AI budget this year. It's been re-estimated to 4.5 MILLION. And they're still saying "but don't worry about the cost! The board wants us to spend on AI!"

They could literally double everyone's salary AND pay 450k for AI if they wanted to (I've seen the financials), but instead we're dumping it all into AI and rewarding the people who do even the simplest tasks on the most expensive models.

[-]

campbellm@reddit

The term I've heard for gaming this absurd idea now is "Tokenmaxxing"

[-]

bingeboy@reddit

Wild. I’m an old school GitHub user that ignores much of the modern tooling they have since MSFT acquired them.

I’m solo for the most part and have major trust issues. I’ve been working on patterns that work for me and my agents that avoid this. I just cli everything and agents use that in a way that works for my free account. Very interesting post!

[-]

xSaviorself@reddit

This is an experience everyone seems to be going through right now.

I work for a small company that has allowed and enabled AI adoption but not forced anyone to do it. It's entirely up to the engineer, and it's been good that way. This is the first time our company has begun seriously discussing AI budget because the costs are absurd.

I've been using it fairly frequently since I've moved away from IC work and back to leadership roles, and May was the first month I ever needed a budget increase outside the $50 in tokens we have by default per user. I'm over $230 in spend in barely 2 weeks. This is not sustainable. I'm not even using the 15x model opus model.

Some companies are cool with this, some may consider it an engineering investment and pull additional resources away from hiring/other needs. The squeeze is coming.

I think the next phase is a race to localize the cost to hardware and run models internally where possible. PI is considering the cost to bring servers back in-house while still using cloud infrastructure for everything else. I'm old enough to see the cycle coming. hardware prices are only going up from here, and also another blow to personal PC products as more companies stop making things like graphics cards and focus more on building AI infra onsite.

[-]

csueiras@reddit

My problem in the past working with any offshore dev shops is that the “engineers” tend to be brainless/no critical thinking skills whatsoever…. So in the age of AI, if you arent coding and you arent even able to think critically then… wtf are you useful for?

[-]

clearasatear@reddit

Could you link or name the Microsoft tool that you've used to calculate the costs coming June?

[-]

chickadee-guy@reddit (OP)

Its called the github pricing calculator. You have to be an enterprise admin

[-]

clearasatear@reddit

I have seen the GitHub pricing calculator which is open to all and probably an immensely down-played version of what you've used. It takes only two parameters: number of devs and overage fees and does not strike me as a helpful tool for realistic projections

[-]

interrupt_hdlr@reddit

My coworkers throwing 1200-page PDFs at LLMs to ask a simple question will learn a valuable lesson.

[-]

dronz3r@reddit

Well it depends on your current usage, if firm as a whole is underutilizing current requests, then the increase in costs wouldn't impact them much financially.

[-]

Few-Impact3986@reddit

I would think that the number 1 line on the it budget is salaries.

No I have not had to deal with it. See you in July when I assume your company gets their first bill.

[-]

GoodishCoder@reddit

Their company will be fine lol, GitHub copilot has enterprise level controls that include monthly limits. You can allow overages for the limits or not and you can set it on a per person level allowing for people who use more to cut into the amount granted to people who use less capping it at the enterprise level.

[-]

Condex@reddit

I wonder how something like per user controls is going to shake out for PIPs. We fired this person for performance reasons might be problematic when they can argue that it's because they were not given enough tokens to do their job. (With clear evidence that their peers who aren't piped are getting more tokens)

And on the other hand if you are given a bunch of tokens and SMART goals, then why wouldn't you just plug it into the agent and see what it gives you (has anyone published a "get me off this pip" skill yet?)

Seems like you pretty much have to setup a productivity metric that can tell you how much to expect per token. And it has to be something that isn't gameable by an AI.

[Although, I'm not really fearing for the date of HR. I'm confident they'll think of something.]

[-]

GoodishCoder@reddit

It won't play a role. The basic way it works is each user gets $x allocated to them at the start of the month. If you choose to enable it, you can allow the people who use more to use some of the surplus from the people who use less. So essentially the argument would be you were given the same amount of tokens and chose not to use them.

[-]

chickadee-guy@reddit (OP)

I shoulda been more clear, Salaries wasnt part of that equation.

Prior, #s 1,2, and 3 were Azure, AWS, and GCP.

[-]

thehouse1751@reddit

What’s the total bill looking like and how many devs and how many projects do you have?

[-]

GoodishCoder@reddit

Your company will just utilize the limit functionality no different than premium requests before. Its not going to be the downfall of AI that people are hoping for.

[-]

chickadee-guy@reddit (OP)

Putting limits in place would fly directly in the face of all the "AI first" work being shoved down our throat for the past 2 years. There will be some tough decisions on whether to can projects and teams who are reliant on tokens to do their day to day work, and whether or not the absurd costs of the tool actually are worth the paltry gains.

[-]

GoodishCoder@reddit

They'll still want you taking an AI first approach, they'll just want you to do it within your limits which will most likely be more than enough for most people.

[-]

chickadee-guy@reddit (OP)

Not sure im following at all, such a limit would completely hamstring AI first work and based on what im seeing any reasonable monthly limit would be hit in days by most of these slop artists.

[-]

GoodishCoder@reddit

You've never seen your company's copilot admin tools have you? There are some super users and a large amount of people who don't even use up half of their limits in a given month.

People who can't succeed without copilot will hit their limits stop delivering and will be rotated out for other engineers. It's not going to be this Armageddon you think it will be.

[-]

chickadee-guy@reddit (OP)

I have full access to it. As i mentioned in the post, daily use is mandated. Essentially the entire company is capping their premium requests monthly and for most people it happens within a week or two.

[-]

GoodishCoder@reddit

Seeing a dashboard and having actual admin access are entirely different things.

But hey check back in July and let me know if the entire company collapsed.

[-]

Dry_Author8849@reddit

Mmm. Not your problem. If your org can't pay it they will mandate other thing or whatever.

On the other hand, you can allways place a budget and stop the service. I think they will continue selling at small prices as a "token pack". If using expensive models you will maybe accomplish one task or expensive models will not be available.

Anyways, it's the same monopoly game. They create the hype. they let you taste it and then they charge whatever they like. Allways the same. Trying to get control of the market and squeeze all they can.

This time costs are really high, but they also have the problem of expensive hardware with a 3 year expiration date. They need to have those 100% all the time. If the number of users decrease too much they will find that they invested in excess and the market is smaller. But that will never happen, is admitting defeat. They will "revive" cheaper plans as per "popular demand" and because "AI should be for all" and "we are socially responsible company" and blah blah.

And this is not github copilot only. It's what's coming for all.

And also they have succeeded in changing the multiplier and models available whenever they like. That won't last long. They will need to hold prices and models for longer periods of time and advice customers at least a month ahead.

Don't worry, you are in a fortune 50. They have money. And they allways can sell insurance to them.

Cheers!

[-]