TheaterFire

GPT-4.5 cost

Posted by Timotheeee1@reddit | LocalLLaMA | View on Reddit | 121 comments

reza2kn@reddit

this is the best use-case for distillation!😁 give it a few days and we'll get GPT-4.5 generated datasets on HF that would get any model to do as well as GPT-4.5 if not better (at the covered tasks) 🤘
View on Reddit #49796384

trajo123@reddit

You are talking about synthetic data. Distillation is something else.
View on Reddit #49802008

reza2kn@reddit

No, it doesn't have to be. You give a model some questions , and it gives you some answers. Both of these could be synthetic data. When you use these as training data to train another model, that's distillation.
View on Reddit #49802458

trajo123@reddit

> When you use these as training data to train another model, that's distillation.

No. What you are describing is just training on synthetic data. Distillation in deep learning refers to a special kind of training where a smaller model aims to reproduce the activations of some internal layers of a larger model. Typically it is done by matching the next-to-last layer (the "logits" in the case of classification), and it involves a special loss term, usually cosine distance or KL divergence between the teacher logits and the student logits. Distillation is also usually done with real data, but can be done with synthetic data as well.
View on Reddit #49827292
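
To make the distinction in the comment above concrete, here is a minimal sketch, assuming a three-token vocabulary and made-up logit values. It compares a KL-divergence loss on teacher and student logits with a hard-label loss that only sees one teacher token (here taken as the teacher's top token, standing in for sampled text). The function names and the temperature value are illustrative assumptions, not anyone's actual training code.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Logit-matching loss: KL between softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return kl_divergence(p_teacher, p_student)

def hard_label_loss(teacher_logits, student_logits):
    """Cross-entropy against a single teacher token (assumed: the top one)."""
    target = max(range(len(teacher_logits)), key=lambda i: teacher_logits[i])
    return -math.log(softmax(student_logits)[target])

# Hypothetical logits: the teacher thinks tokens 0 and 1 are both plausible.
teacher = [4.0, 3.5, 0.1]
student_a = [4.0, 3.5, 0.1]  # matches the teacher's full distribution
student_b = [4.0, 0.1, 3.5]  # same top token, wrong runner-up

print(distillation_loss(teacher, student_a), distillation_loss(teacher, student_b))
print(hard_label_loss(teacher, student_a), hard_label_loss(teacher, student_b))
```

In this toy case the hard-label loss cannot tell the two students apart, while the logit-matching loss penalizes the second one. That is the extra signal you lose when only the generated text is available.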

MorallyDeplorable@reddit

that was called fine tuning before Deepseek co-opted the term
View on Reddit #49803047

reza2kn@reddit

Not really.. fine-tuning is when you train a model on any given data. If that data came specifically from asking a specific model, you're fine-tuning by distilling the features of that specific big model into your model.
View on Reddit #49803429

No_Afternoon_4260@reddit

You can't know gpt4.5 logits, that's why it's not truly distillation but just fine tuning on a synthetic dataset. Look at medius supernova paper. He distilled llama405 into some random qwen. I don't remember the details but he did a great explanation on what he did because qwen and llama don't have the same tokenizer
View on Reddit #49807496

reza2kn@reddit

Where is this 'true distillation' definition coming from, my brother? 😁
View on Reddit #49817654

No_Afternoon_4260@reddit

From where I sit, my friend xD Just from where I sit. I don't think anybody owns the rights to the definition of distillation in ML anyway, but happy to be proved wrong. But as I look at it, in the LLM space using the end text or the logits are two vastly different approaches, and it's worth mentioning. To me it seems that by training on the logits you are just closer to the latent feature space, thus copying it with more fidelity (or just with a shorter dataset?).
View on Reddit #49820355

reza2kn@reddit

i wasn't claiming to own a definition! just said why do you think that's 'real distillation'? where's the source? also, there are various ways of performing distillation, some are better than others in some cases, but i don't think any of them could be classified into 'real distillation' and not. that's all I was saying.
View on Reddit #49821273

No_Afternoon_4260@reddit

No you are right I've thought about it and I completely agree with you
View on Reddit #49822224

ColorlessCrowfeet@reddit

Yes, and it's still called fine-tuning by people who want to understand and communicate technical knowledge rather than imitate statistical patterns in their recent reading data.
View on Reddit #49805582

aurelivm@reddit

distillation refers to training a smaller model with the same tokenizer on output logits, not SFT
View on Reddit #49806449

ToTallyNikki@reddit

Distillation refers to heating a liquid to concentrate something.
View on Reddit #49812586

Aischylos@reddit

At this point they have become synonymous, but distillation was originally a technique for training off the output distribution, not just the tokens. Now it's been used interchangeably a lot which sucks because it would be nice to have a good term for 'true' distillation.
View on Reddit #49805585

Regular_Boss_1050@reddit

At these prices, distilling is gonna be expensive AF. Need someone to take the bait.
View on Reddit #49805028

reza2kn@reddit

So you see my plans 😁🤌🏻😂
View on Reddit #49806682

medialoungeguy@reddit

And then d sack will cry and call distillation a national security issue
View on Reddit #49797179

reza2kn@reddit

Let 'em cry 😁
View on Reddit #49797534

Relative-Flatworm827@reddit

And that's still super efficient compared to anything I can run on my PC lol. I have 16gb vram and every model I use is garbage. So. Price is worth it I guess.
View on Reddit #49812752

Comfortable-Rock-498@reddit

This is insane pricing. I hope the present and even more so the future will look back at this moment in mockery
View on Reddit #49795843

HellsNoot@reddit

If OpenAI had a moderate run, leading to a much larger model with some better performance, you'd rather they don't offer it at all? Or have them lose money on offering it to customers? I don't really understand what the grift is here.
View on Reddit #49811866

The_frozen_one@reddit

That's such a weird take. They have no vendor lock in, and they are losing money with each token. Nobody is being compelled to spend money on a model that most people didn't know about 6 hours ago.
View on Reddit #49806792

Balance-@reddit

Comparison: https://preview.redd.it/qwgw6xasyqle1.png?width=3567&format=png&auto=webp&s=a377ac36e5549d95ffc0d0b15a1a2231c3affbac
View on Reddit #49797562

shakespear94@reddit

I'm okay gpt-4o-mini. Although, i'm developing with gpt-3.5-turbo. God. I almost want to have my own infrastructure and run on that *later*.
View on Reddit #49798538

wen_mars@reddit

I have 4o-mini in one tab, V3 in another (it's smarter but slower, also cheap) and I escalate to o1 on tough problems but I have to be careful not to use up my quota.
View on Reddit #49808793

shakespear94@reddit

I go for DeepSeek R1 for critical analysis and approach refining. Then Grok 3 to help me build that part, context is kind of infinite to me, all things considered, and chatgpt to troubleshoot minor things. But that's me being a script kiddie. I'm sure people are using these models for breaking into the quantum realm.
View on Reddit #49810622

Comfortable-Rock-498@reddit

you see, scaling is not dead
View on Reddit #49799348

harrro@reddit

Scaling profits is at an all time high
View on Reddit #49801946

thereisonlythedance@reddit

Must be a monster of a model, size wise. Proof scaling hit a wall.
View on Reddit #49795155

differentguyscro@reddit

They said that scaling up only two of (parameters, compute, data) doesn't do much [compared to scaling up all three together] in their paper "Scaling Laws for Neural Language Models" from January 2020. We need 10^2 or 10^3 internets worth of quality synthetic data to really tell.
View on Reddit #49809385

PermanentLiminality@reddit

The scaling laws appear to be exponential increase in compute for linear increase in performance.
View on Reddit #49796006

Enfiznar@reddit

So logarithmic in performance
View on Reddit #49796114
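
A rough way to picture the exchange above: if each fixed step in score costs a constant multiple of compute, then score grows with the logarithm of compute. The sketch below assumes a made-up rule of one point per 10x compute; `base_score` and `gain_per_decade` are illustrative parameters, not fitted values from any paper.

```python
import math

def performance(compute, base_score=0.0, gain_per_decade=1.0):
    """Score under the assumed rule: a fixed gain for every 10x of compute."""
    return base_score + gain_per_decade * math.log10(compute)

def compute_needed(target_score, base_score=0.0, gain_per_decade=1.0):
    """Invert the rule: compute required grows exponentially with the target score."""
    return 10 ** ((target_score - base_score) / gain_per_decade)

# Each extra point of score multiplies the required compute by 10.
for score in range(1, 5):
    print(score, compute_needed(score))
```

Under this assumption, going from a score of 3 to 4 costs ten times as much compute as going from 2 to 3, which is the "exponential compute for linear gains" picture described in the comments.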

yur_mom@reddit

I like my binary trees to be logarithmic, not my LLM performance
View on Reddit #49808636

121507090301@reddit

Might be might not be. It could just as well have been that they made a model too big for their dataset and it either didn't even begin getting good or the dataset was pretty bad or something like it, or something else. Who knows...
View on Reddit #49806208

thereisonlythedance@reddit

Sam just referred to it as a “giant, expensive model”, confirms it's a big one. I'd love to know how it stacks up to the original GPT-4 in parameter size. It's very slow via API at the moment too, though that may be extreme load.
View on Reddit #49807208

mikael110@reddit

OpenAI themselves hints at this in their news blog:

> GPT-4.5 is a very large and compute-intensive model, making it more [expensive](https://openai.com/api/pricing/) than and not a replacement for GPT-4o. Because of this, we're evaluating whether to continue serving it in the API long-term as we balance supporting current capabilities with building future models. We look forward to learning more about its strengths, capabilities, and potential applications in real-world settings. If GPT-4.5 delivers unique value for your use case, your [feedback](https://community.openai.com/) will play an important role in guiding our decision.

If it is so big that they might not even keep serving it on the API it must be quite *chonky* indeed, which incidentally is what one of the presenters nicknamed it during the announcement presentation.
View on Reddit #49802941

Dayder111@reddit

So, it's released as a research preview indeed then, and to show that some of the scaling did indeed hit a wall, at least with the current hardware (that they have, H100 and H200 I guess). Kind of giving us the last taste of the old-school naive architecture pretraining scaling maybe? Maybe even close to what some people thought GPT-4 will be, 100 trillion parameter rumours and such.
View on Reddit #49803764

TheThoccnessMonster@reddit

More like proof they know that their competition will generate output for DeepSeek and they're gonna pay a lot for the privilege.
View on Reddit #49803018

a_slay_nub@reddit

What exactly is this model's niche again? At this point you're better off paying for a reasoning model. I guess scaling really is dead.
View on Reddit #49794575

Ok_Landscape_6819@reddit

Is it though ? I mean Grok 3 base performed better and probably cost way less since free-tier has access
View on Reddit #49795823

MindCrusader@reddit

Isn't grok 3 using TOC under the hood? Sonnet is using it beside not being named reasoning model
View on Reddit #49798300

Dudmaster@reddit

> TOC

Thought of chain?
View on Reddit #49808721

MindCrusader@reddit

Yup
View on Reddit #49808956

differentguyscro@reddit

"She has a really good personality"
View on Reddit #49808860

Comfortable-Rock-498@reddit

This TC article made me laugh out loud [https://techcrunch.com/2025/02/27/openais-gpt-4-5-is-better-at-convincing-other-ai-to-give-it-money/](https://techcrunch.com/2025/02/27/openais-gpt-4-5-is-better-at-convincing-other-ai-to-give-it-money/) Of course it is lol
View on Reddit #49797184

as-tro-bas-tards@reddit

Combine this with the CFPB shutting down and....hoo boy we are in for some dark times.
View on Reddit #49808720

acc_agg@reddit

This was a moonshot to see how well a monolithic non-reasoning model could be trained. It's the bet they made with gpt3 that paid off, here it seems like it may not - I've not tested the model and can't say for sure.
View on Reddit #49808182

throwaway2676@reddit

Sounds like the thought process here was basically "This is kinda obsolete because of reasoning models, but we put a ton of compute into it, so let's just release it and move on"
View on Reddit #49807942

M44PolishMosin@reddit

Apparently good at writing mean texts to your friends
View on Reddit #49796527

eimas_dev@reddit

i was looking at stream and thinking am i the only one dumb or the example is absurd when you try to present sota model
View on Reddit #49799521

MerePotato@reddit

That's cause its not SOTA, and they say as much - it isn't intended to push the frontier
View on Reddit #49801265

Danteg@reddit

It probably was intended to be SOTA at some point, but disappointed.
View on Reddit #49801937

hudimudi@reddit

I'd rather say they cooked this up quickly as a response to Claude's new model and DeepSeek's models. They just had to release SOMETHING to take over the spotlight again.
View on Reddit #49806004

BaysQuorv@reddit

Knowledge cutoff is oct 2023, this has been in the making long before r1
View on Reddit #49807487

jm2342@reddit

Hardening Altman Sam's sick, of course
View on Reddit #49795596

sourceholder@reddit

What's the BIG-Bench Hard score?
View on Reddit #49798013

NikBerlin@reddit

now wrap 4.5 with reasoning..
View on Reddit #49794719

4sater@reddit

One dollar per token
View on Reddit #49795251

AutoModerator@reddit

Your submission has been **automatically** removed due to receiving many reports. If you believe that this was an error, please send a message to modmail. *I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/LocalLLaMA) if you have any questions or concerns.*
View on Reddit #49808925

linkcharger@reddit

😂😂😂😂😂😂
View on Reddit #49808554

cmdr-William-Riker@reddit

If they are going to make it 5 times as much as the best Anthropic model it had better be 5 times more capable
View on Reddit #49808492

crack_pop_rocks@reddit

People be sucking dicks soon for tokens
View on Reddit #49795704

aprx4@reddit

Not for long, we'll have AI-powered humanoid robots doing that.
View on Reddit #49796535

cultish_alibi@reddit

well what the hell is left for humans??
View on Reddit #49808017

Foreign-Beginning-49@reddit

Those robots aren't leaving the home they are currently in.
View on Reddit #49798084

mattjb@reddit

We must teach them how to remotely work via Zoom.
View on Reddit #49804042

Cergorach@reddit

But just like with self driving cars, you can rent them out when you're not using them... ;)
View on Reddit #49799767

weird_d0lphin@reddit

If you follow his logic, we might have to suck AI-powered humanoid robots' dicks for tokens...
View on Reddit #49802632

mosthumbleuserever@reddit

...You guys are getting tokens for money?
View on Reddit #49807209

drwebb@reddit

lol no thanks, I'll stick with my R1 based models thank you very much.
View on Reddit #49798854

TheRealMasonMac@reddit

I bit the bullet to see how well it would do for creative writing. Holy shit. It is shit.
View on Reddit #49807484

megadonkeyx@reddit

So what's the point of stargate if scaling is over
View on Reddit #49806346

evia89@reddit

Well they can host sonnet 37 like amazon ))
View on Reddit #49807356

andrew_kirfman@reddit

This is not a great look from OpenAI IMO. Worse than Claude 3.7 at programming tasks and insanely more expensive. Makes me wonder what's going on with model scaling and how many parameters we're looking at to produce this result. I can definitely understand why they didn't release this as GPT-5.
View on Reddit #49807323

You_Wen_AzzHu@reddit

We need to cancel our plus account to make a point. Claude is now a better solution.
View on Reddit #49796206

bonobomaster@reddit

Out of curiosity, has Claude a Deep Research mode? Because that mode is in the plus account and it fucking blows my mind. My first "Holy shit, I'm in the future" moment I had with any LLM.
View on Reddit #49806741

Cergorach@reddit

Left Chat due to an affair with Claude... ;)
View on Reddit #49803657

panic_in_the_galaxy@reddit

I don't think most people here have a plus account
View on Reddit #49798445

Reason_He_Wins_Again@reddit

I would rather have a GPT+ account than most other subscriptions at this point. It's my google replacement.
View on Reddit #49799726

panic_in_the_galaxy@reddit

Google actually gives you free access to their LLMs
View on Reddit #49800350

MorallyDeplorable@reddit

Yea but then you're stuck using Google's LLMs
View on Reddit #49803128

AriyaSavaka@reddit

I'd never subscribed to begin with, either local or API.
View on Reddit #49799379

brahh85@reddit

A new gpt-4 with up to date datasets to distill into a new set of models.
View on Reddit #49795407

jiml78@reddit

I think the knowledge cutoff is Oct 2023. So not really
View on Reddit #49797306

Cergorach@reddit

Erm... The free version of ChatGPT knows who sits on the Danish throne (that changed in 2024) and that's with websearch turned off.
View on Reddit #49803615

LevianMcBirdo@reddit

yeah it was updated in January, this includes more uptodate knowledge.
View on Reddit #49806540

3D_TOPO@reddit

Pretty hilarious considering R1 stomps it
View on Reddit #49804545

PlaneTheory5@reddit

Bullish on Deepseek Bearish on OAI
View on Reddit #49803526

Distinct-Target7503@reddit

wait is that higher than claude opus?!
View on Reddit #49796934

MorallyDeplorable@reddit

> wait is that higher than claude opus?!

by a _lot_ even
View on Reddit #49803156

kldjasj@reddit

Are they putting the price high now to release a new model with not-so-expensive pricing later?
View on Reddit #49802007

SuuLoliForm@reddit

Two months later: "Check out this totally new model that was totally not at all finished when we released GPT 4.5, GPT 4.5Turbo! Now at the low low price of ten dollars per million token input!"
View on Reddit #49802767

EridianExplorer@reddit

HAHAHAHAHAHA
View on Reddit #49802580

ARVwizardry@reddit

If you wanted to voice-chat with gpt-4.5, you can do it for no additional cost with [ClickUi.app](http://ClickUi.app) Although I highly recommend using 4o-mini lol
View on Reddit #49796819

Dogeboja@reddit

who asked?
View on Reddit #49800452

ARVwizardry@reddit

No one, I just got the website live and trying to get users/collaborators Surprised there's so many downvotes on mentioning a just-launched, free, and open source tool that brings AI to your computer
View on Reddit #49801933

vertigo235@reddit

OMG lol
View on Reddit #49801148

FastDecode1@reddit

wrong sub
View on Reddit #49800471

Glittering-Bag-4662@reddit

Jesús
View on Reddit #49799620

Tailor_Big@reddit

At least 5 trillion parameters, largest llm ever on earth!
View on Reddit #49799530

RexyIsSexy@reddit

The price chart shown is per 1 million tokens for each column.
View on Reddit #49799343

Comic-Engine@reddit

That's insane
View on Reddit #49799145

AriyaSavaka@reddit

Now DeepSeek will have something to leverage for their R2.
View on Reddit #49799120

ttkciar@reddit

By comparison, Tulu 3 405B is only $5 per million input tokens, $10 per million output tokens. I suppose GPT4.5 might be more compelling if it's better-suited to your task, but for my use-case Tulu is a better fit (and I could infer locally with it, if I upgraded one of my servers with more memory).
View on Reddit #49798490
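
For anyone comparing per-million-token prices like the ones quoted above, the arithmetic is simple enough to sketch. The $5 input and $10 output figures are the Tulu 3 405B prices from the comment; the token counts and the function name `request_cost_usd` are made up for illustration.

```python
def request_cost_usd(input_tokens, output_tokens,
                     input_price_per_million, output_price_per_million):
    """Cost of one workload when prices are quoted per 1M tokens."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost

# Hypothetical workload: 200k tokens in, 50k tokens out, at $5 / $10 per 1M.
tulu_cost = request_cost_usd(200_000, 50_000, 5.0, 10.0)
print(f"${tulu_cost:.2f}")
```

Swapping in another model's per-million prices gives a like-for-like comparison for the same workload.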

Johnny_Rell@reddit

At this point it's cheaper to hire someone to do the task you're trying to achieve
View on Reddit #49796472

Hydraxiler32@reddit

the person you hire will use a cheaper model
View on Reddit #49798014

MLHeero@reddit

Not even close even with these prices 😀
View on Reddit #49797564

mr_happy_nice@reddit

LMAO, the point of paying for something is to get value out of it. Does this really provide that much value considering other options?
View on Reddit #49797892

OriginalPlayerHater@reddit

ez money
View on Reddit #49794540

offlinesir@reddit

it's per 1m tokens
View on Reddit #49795726

OriginalPlayerHater@reddit

pretty sure its per request, Altman is pretty greedy
View on Reddit #49795912

Enfiznar@reddit

per 1m tokens per request, as with all their models
View on Reddit #49796186

OriginalPlayerHater@reddit

yup i get it, i'm just poking fun
View on Reddit #49797560

NikBerlin@reddit

per million tokens I guess?
View on Reddit #49794687

mxforest@reddit

No guesses. It's definitely for a million.
View on Reddit #49794980

OriginalPlayerHater@reddit

if it is 500k you have to eat a worm raw
View on Reddit #49795236

ttkciar@reddit

We always knew OpenAI would have to raise their prices, eventually. They've been operating at a net loss since day one, burning through investor funding to keep the lights on. Turning a net profit requires them to charge higher prices. It's as simple as that.
View on Reddit #49797500

maddogawl@reddit

I still can't believe this is real, its gotta just be for rollout right? right?
View on Reddit #49797364

ComprehensiveBird317@reddit

Holy hell that's a lot of dollars. So it's not good for chat, but then what?
View on Reddit #49796999

2053_Traveler@reddit

gee thanks Sam!
View on Reddit #49795107