Client had 4 agents on GPT-4o. One was classifying documents. That one alone had 91% savings potential.
Posted by Dramatic_Strain7370@reddit | LocalLLaMA | 15 comments
I do some consulting work with AI startups. One client was upset with their OpenAI bill: they had 4 agents in production and felt they were overpaying, but weren't sure by how much. They also didn't have a good intuition for how to go about evaluating models.
I looked at what each agent was actually doing:
- SEC report summarization: processing long financial filings into summaries
- Financial advisory chatbot: answering client questions about portfolios
- Document classification: categorizing documents by type and urgency
- Monitoring agent: checking system health and flagging anomalies
All four were running on GPT-4o, which costs $2.50 per 1M input tokens and $10 per 1M output tokens. They used the same model for every request, no matter how simple the request was, which is wasteful.
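To make the pricing concrete, here is a minimal sketch of the per-request arithmetic at the prices quoted above. The function name and the token counts in the example are illustrative assumptions, not figures from the client.

```python
# Prices quoted in the post for GPT-4o, in USD per 1M tokens.
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 10.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the prices above."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


# Illustrative token counts (assumptions, not measured values).
short_call = request_cost(input_tokens=500, output_tokens=50)
long_call = request_cost(input_tokens=30_000, output_tokens=1_000)
print(f"short call: ${short_call:.5f}")
print(f"long call:  ${long_call:.5f}")
```

Multiplying the per-request cost by request volume per agent gives a rough view of where the bill is coming from.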
When I broke down what each agent was actually asking the model to do, the picture got interesting:
| Agent | Share of simple prompts | Potential savings with model switching |
|---|---|---|
| SEC summarization | \~40% | 65–77% |
| Financial chatbot | \~75% | 77–83% |
| Document classification | \~80% | 91% |
| Monitoring | \~80% | 83% |
The SEC summarization agent is more nuanced: financial filings are complex, so a higher share of its prompts stayed on the premium model. Its prompts are also large, at around 30K input tokens each. But the classification and monitoring agents were doing straightforward tasks on an expensive model for no real reason.
To make this easier to estimate for other setups, I built a quick LLM savings calculator. Enter your monthly spend, primary model, and workload type — it estimates what you'd save routing simple prompts to a cheaper model in the same provider family.
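For anyone who wants to sanity-check their own numbers by hand, here is a rough sketch of that kind of estimate. It is not the calculator's actual logic, which the post does not describe. The simple-prompt shares come from the table above; `estimate_savings`, the `price_ratio` parameter, and the example spend are hypothetical. It also assumes simple prompts account for the same share of spend as of prompt count, so its output may differ from the table, which may rest on other assumptions.

```python
# Share of simple prompts per workload type, taken from the table in the post.
SIMPLE_SHARE = {
    "sec_summarization": 0.40,
    "financial_chatbot": 0.75,
    "document_classification": 0.80,
    "monitoring": 0.80,
}


def estimate_savings(monthly_spend: float, workload: str, price_ratio: float) -> dict:
    """Estimate savings from routing simple prompts to a cheaper model.

    price_ratio is (cheaper model price) / (primary model price). This sketch
    assumes the same ratio applies to input and output tokens, and that simple
    prompts make up the same fraction of spend as of prompt count.
    """
    if not 0.0 <= price_ratio <= 1.0:
        raise ValueError("price_ratio must be between 0 and 1")
    share = SIMPLE_SHARE[workload]
    savings_fraction = share * (1.0 - price_ratio)
    saved = monthly_spend * savings_fraction
    return {
        "saved_usd": saved,
        "new_spend_usd": monthly_spend - saved,
        "savings_pct": 100.0 * savings_fraction,
    }


# Hypothetical inputs: $10,000/month, cheaper model at 10% of the primary price.
result = estimate_savings(10_000, "monitoring", price_ratio=0.10)
print(f"saved ${result['saved_usd']:.0f} ({result['savings_pct']:.0f}%)")
```

Weighting the share by tokens rather than by prompt count would be more accurate for agents with very large prompts.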
Disclosure: I'm a founder building in this space — the calculator ended up as a free tool on our website. Drop a comment if you want the link, happy to share.
Curious what others are using to track and optimize LLM spend?
Silver-Champion-4846@reddit
I'm confused, isn't gpt4o dead?
Dramatic_Strain7370@reddit (OP)
it is not dead. it is cheaper, and many companies don't change the model they started with
Silver-Champion-4846@reddit
Oh that thing where companies just stick to what works until they get a reason to change it, which includes it no longer being available?
Dramatic_Strain7370@reddit (OP)
gpt-4o is available. this is the output of a curl I just ran to re-confirm. I am hiding keys etc.
>>> curl .... -d '{"model":"gpt-4o","max_tokens":100,"messages":[{"role":"user","content":"tell me about usa"}]}'
RESPONSE BACK

    {
      "id": "chatcmpl-DaNgBizukgyhVbH1WGT7gu1Cvs4VS",
      "object": "chat.completion",
      "created": 1777563203,
      "model": "gpt-4o-2024-08-06",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "The United States of America (USA) is a federal republic composed of 50 states, a federal district, five major self-governing territories, and various possessions. Here are key aspects about the USA:\n\n1. **Geography**: \n - The USA is the third-largest country by land area, with diverse geography including mountains (such as the Rockies and Appalachians), plains, forests, deserts, and coastlines along the Atlantic and Pacific Oceans.\n - The country is bordered by",
            "refusal": null,
            "annotations": []
          },
          "logprobs": null,
          "finish_reason": "length"
        }
      ],
      "usage": {
        "prompt_tokens": 11,
        "completion_tokens": 100,
        "total_tokens": 111,
        "prompt_tokens_details": {
          "cached_tokens": 0,
          "audio_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0,
          "audio_tokens": 0,
          "accepted_prediction_tokens": 0,
          "rejected_prediction_tokens": 0
        }
      },
      "service_tier": "default",
      "system_fingerprint": "fp_8aed6409fd"
    }
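The `usage` block in a response like the one above is enough to track spend per agent. A minimal sketch, assuming the GPT-4o prices quoted in the post and ignoring any cached-token discount; `SpendTracker`, `usage_cost`, and the agent name are hypothetical names for illustration, not part of any OpenAI library.

```python
from collections import defaultdict

# USD per 1M tokens as (input, output); gpt-4o uses the prices quoted in the post.
PRICES = {"gpt-4o": (2.50, 10.00)}


def usage_cost(model: str, usage: dict) -> float:
    """Cost in USD of one response, computed from its usage block."""
    in_price, out_price = PRICES[model]
    return (
        usage["prompt_tokens"] * in_price
        + usage["completion_tokens"] * out_price
    ) / 1_000_000


class SpendTracker:
    """Accumulates estimated spend per agent."""

    def __init__(self) -> None:
        self.totals = defaultdict(float)

    def record(self, agent: str, model: str, usage: dict) -> float:
        cost = usage_cost(model, usage)
        self.totals[agent] += cost
        return cost


tracker = SpendTracker()
# Token counts copied from the usage block in the response above.
usage = {"prompt_tokens": 11, "completion_tokens": 100, "total_tokens": 111}
cost = tracker.record("demo_agent", "gpt-4o", usage)
print(f"this call: ${cost:.7f}")
```

Summing `tracker.totals` by agent over a day or a week shows which agent dominates the bill before any routing decisions are made.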
Silver-Champion-4846@reddit
yes yes I wasn't doubting you.
Dramatic_Strain7370@reddit (OP)
a model leaderboard is available here as an easy cheat sheet: https://www.cloudidr.com/llm-pricing
Dramatic_Strain7370@reddit (OP)
the LLM cost savings calculator link is: https://www.cloudidr.com/savings-calculator?utm_source=reddit&utm_medium=comment
parasen16@reddit
sounds like you've got a solid grasp on their usage, which is key. switching models can really drive those savings, especially if the tasks vary in complexity. for instance, you could save a lot on the document classification agent by opting for a lighter model, since that doesn't need the full power of something like GPT-4o. easy win there. a friend of mine recently used the Safe AI Starter Kit to set up protocols for handling sensitive data while using AI, which helped them streamline costs and keep everything compliant. worth looking into if they’re handling any confidential info. keep digging, you’ve got this!
MelodicRecognition7@reddit
bro adjust your spambots, they are too obvious.
CalligrapherFar7833@reddit
Retarded vibe slop post
parasen16@reddit
sounds like you've got a solid grasp on their usage. switching models could definitely save them some cash, especially since each agent has different needs. for something like document classification, a lighter model might do the trick with minimal quality loss. when it comes to managing sensitive info, i found the Safe AI Starter Kit pretty handy for creating a safe protocol while using AI tools. that way, they can optimize costs without freaking out about data leaks. keep pushing them to refine their strategies!
jacek2023@reddit
give me a recipe for pancakes with slumber
MelodicRecognition7@reddit
lol. AI spambot advertises its services in an advertisement thread written by AI spambot.
ps5cfw@reddit
we use local models you daft spammer.
jacek2023@reddit
Where do you see yourself in 5 years?