The user message model also isn’t sustainable for Augment Code as a business. For example, over the last 30 days, a user on our $250 Max plan has issued 335 requests per hour, every hour, for 30 days, and is approaching $15,000 per month in cost to Augment Code. This sort of use isn’t inherently bad, but as a business, we have to price our service in accordance with our costs.
State of the art reasoning models are increasingly designed to stop and ask clarifying questions, effectively penalizing customers because they consume more messages despite achieving a better, more aligned outcome.
grauenwolf@reddit (OP)
And keep in mind that newer models generally mean higher costs.
Mysterious-Rent7233@reddit
GPT-3 (Davinci): $0.02 / 1K tokens = $20.00 / M tokens
gpt-4-0125-preview: $10.00 / M tokens
gpt-5: $1.25 / M tokens
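The per-token figures in that list can be normalized with a few lines of arithmetic. The numbers below are the ones quoted in the comment, not authoritative current pricing:

```python
# Normalize each quoted price to dollars per million tokens and
# compute the drop relative to GPT-3 (Davinci). Figures are taken
# from the comment above, not from current price sheets.
prices_per_million = {
    "gpt-3-davinci": 0.02 / 1000 * 1_000_000,  # $0.02 per 1K tokens
    "gpt-4-0125-preview": 10.00,
    "gpt-5": 1.25,
}

baseline = prices_per_million["gpt-3-davinci"]
for model, price in prices_per_million.items():
    ratio = baseline / price
    print(f"{model}: ${price:.2f}/M tokens ({ratio:.0f}x cheaper than GPT-3)")
```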
grauenwolf@reddit (OP)
Input price (per 1M): GPT-4.1 $2.00, GPT-5 Chat $1.25
Output price (per 1M): GPT-4.1 $8.00, GPT-5 Chat $10.00
https://langcopilot.com/gpt-4.1-vs-gpt-5-pricing
And as others have pointed out, the newer versions use far more tokens.
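The argument can be made concrete with the table's own prices. The token counts below are illustrative assumptions (not measurements); the newer model is assumed to emit 3x the output tokens:

```python
# Effective cost per request at the quoted per-1M-token prices.
# Token counts are made-up illustrative values: the newer model is
# assumed to produce 3x the output tokens (reasoning included).
def request_cost(in_tokens, out_tokens, in_price, out_price):
    """Dollar cost of one request at per-million-token prices."""
    return in_tokens / 1e6 * in_price + out_tokens / 1e6 * out_price

gpt41 = request_cost(10_000, 2_000, in_price=2.00, out_price=8.00)
gpt5 = request_cost(10_000, 6_000, in_price=1.25, out_price=10.00)

# Despite the cheaper input price, the newer model costs more per request.
print(f"GPT-4.1: ${gpt41:.4f}  GPT-5: ${gpt5:.4f}")
```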
Mysterious-Rent7233@reddit
You are comparing the 2025 costs of two models released in 2025. Why? Are you just hoping that readers will be too dumb to notice?
We are asking about the pricing OVER TIME.
https://www.nebuly.com/blog/openai-gpt-4-api-pricing#:~:text=March%202023:%20GPT%2D4%20Launch,2024:%20GPT%2D4o%20Price%20Cut
https://medium.com/@boredgeeksociety/openai-model-pricing-drops-by-95-3a31ab0e04e6
In general, I am fascinated by what drives you to be so passionate about AI as to want to spread lies about it. Do you think that you're somehow going to defeat it in the marketplace by doing so? Does clouding your own understanding of the technology advance some actual goal you have?
I have many concerns about AI myself, but I decided a few years ago that the best thing to do about that was to understand it deeply and therefore be able to make rational and knowledgeable decisions about it.
grauenwolf@reddit (OP)
You are literally trying to tell us that prices are going down in response to an article about how prices are going up so much that they have to change their business model.
If actual costs were going down, then companies like OpenAI would be shouting about it from the rooftops.
To demonstrate that the new model is not more efficient than the old model in terms of price per token. I'm comparing two models, not the pricing subsidies of last year versus this year.
Mysterious-Rent7233@reddit
Liar. Please quote any words of the Augment AI section that say that their costs have gone up.
The only thing that they said that relates at all to "new models" versus "old ones" is:
State of the art reasoning models are increasingly designed to stop and ask clarifying questions, effectively penalizing customers because they consume more messages despite achieving a better, more aligned outcome.
Which is to say that their old per-message pricing model was not appropriate for newer, smarter models that were designed to waste fewer tokens.
You mean like this?
2. The cost to use a given level of AI falls about 10x every 12 months, and lower prices lead to much more use. You can see this in the token cost from GPT-4 in early 2023 to GPT-4o in mid-2024, where the price per token dropped about 150x in that time period. Moore's law changed the world at 2x every 18 months; this is unbelievably stronger.
https://blog.samaltman.com/three-observations
And this?
DARIO: So we're able to produce something that's in the same ballpark as GPT-4, and better for some things. And we're able to produce it at a substantially lower cost. And we're in fact excited to extend that cost advantage because we're working with custom chips with various different companies, and we think that could give an enduring advantage in terms of inference costs.
And this?
Alphabet, the parent company of Google, announced on Dec. 13 that it plans to slash the cost of a version of its most advanced artificial intelligence (AI) model, Gemini, and make it more accessible to developers.
According to reports, the company said the price for the Pro model of Gemini has been cut by 25%–50% from what it was in June.
https://cointelegraph.com/news/google-slashes-price-of-gemini-ai-model-opens-up-to-developers
But wouldn't it make more sense (i.e., be more honest) to look at models released more than 3 months apart if we want an actual TREND?
DrunkMonkey@reddit
Cheaper tokens aren't the same thing as a cheaper model. The newer models use a lot more tokens.
Mysterious-Rent7233@reddit
If you instruct them to. If you instruct them to use fewer, they'll do that too. Measuring this is literally my job and I'm doing it right now in another window.
Gemini-2.5 Pro doesn't allow you to turn off reasoning tokens entirely (although you can give them a minimal budget), but GPT-5 does allow you to turn them all the way off.
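As a sketch of that claim: the request shapes below follow the two public APIs as described in the thread, but the field names and minimum budget values are my assumptions and should be checked against current API references before use:

```python
# Hypothetical request payloads to minimize reasoning-token spend.
# Field names are assumptions based on the public docs; verify before use.

# OpenAI Responses API: GPT-5 accepts a reasoning-effort setting,
# and the lowest setting effectively turns extended reasoning off.
openai_request = {
    "model": "gpt-5",
    "input": "Summarize this diff.",
    "reasoning": {"effort": "minimal"},
}

# Gemini API: 2.5 Pro only accepts a reduced thinking budget;
# it cannot be set to zero (assumed minimum shown here).
gemini_request = {
    "model": "gemini-2.5-pro",
    "contents": "Summarize this diff.",
    "generationConfig": {"thinkingConfig": {"thinkingBudget": 128}},
}
```

Neither dict is sent anywhere here; the point is only that one API exposes an off switch while the other exposes a floor.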
programming-ModTeam@reddit
Your posting was removed for being off topic for the /r/programming community.
MrRufsvold@reddit
"if you ignore that LLM costs are heavily subsidized by venture capital, then our pricing is now a little more sustainable!"
mahreow@reddit
Or just use your brain for free
ILikeCutePuppies@reddit
Where can I download that model?
Annh1234@reddit
They got it in various schools
SlovenianTherapist@reddit
instructions unclear, now the county won't let me near schools
revereddesecration@reddit
Naive business model is naive, more at 11
spicybright@reddit
I didn't know any companies didn't charge by some kind of credit model for this very reason.
Rockytriton@reddit
And so it begins...