I tracked GPU prices across 25 cloud providers and the price differences are insane (V100: $0.05/hr vs $3.06/hr)
Posted by sleepingpirates@reddit | LocalLLaMA | 52 comments
I've been renting cloud GPUs for fine-tuning and got frustrated tab-hopping between providers trying to find the best deal. So I built a tool that scrapes real-time pricing from 25 cloud providers and puts it all in one place.
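The aggregation step the post describes could look roughly like this — a minimal sketch only; the rows, field names, and schema here are illustrative, not the tool's actual implementation:

```python
# Hypothetical sketch: given per-provider price listings, group by GPU
# model and report the cheapest/priciest listing and the spread.
# Provider rows below are illustrative sample data, not live prices.
from collections import defaultdict

listings = [
    {"gpu": "H100 SXM5 80GB", "provider": "VERDA", "usd_per_hr": 0.80},
    {"gpu": "H100 SXM5 80GB", "provider": "LeaderGPU", "usd_per_hr": 11.10},
    {"gpu": "V100 16GB", "provider": "VERDA", "usd_per_hr": 0.05},
    {"gpu": "V100 16GB", "provider": "AWS", "usd_per_hr": 3.06},
]

by_gpu = defaultdict(list)
for row in listings:
    by_gpu[row["gpu"]].append(row)

for gpu, rows in by_gpu.items():
    cheapest = min(rows, key=lambda r: r["usd_per_hr"])
    priciest = max(rows, key=lambda r: r["usd_per_hr"])
    spread = priciest["usd_per_hr"] / cheapest["usd_per_hr"]
    print(f"{gpu}: ${cheapest['usd_per_hr']:.2f} ({cheapest['provider']}) "
          f"to ${priciest['usd_per_hr']:.2f} ({priciest['provider']}), "
          f"{spread:.1f}x spread")
```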
Some findings from the live data right now (Jan 2026):
H100 SXM5 80GB:
- Cheapest: $0.80/hr (VERDA)
- Most expensive: $11.10/hr (LeaderGPU)
- That's a 13.8x price difference for the exact same GPU

A100 SXM4 80GB:
- Cheapest: $0.45/hr (VERDA)
- Most expensive: $3.57/hr (LeaderGPU)
- 8x spread

V100 16GB:
- Cheapest: $0.05/hr (VERDA) — yes, five cents
- Most expensive: $3.06/hr (AWS)
- 61x markup on AWS vs the cheapest option

RTX 4090 24GB:
- Cheapest: $0.33/hr
- Most expensive: $3.30/hr
- 10x spread

For context, running an H100 24/7 for a month:
- At $0.80/hr = $576/month
- At $11.10/hr = $7,992/month
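The monthly figures above work out from a 720-hour month; a quick sanity check:

```python
# The monthly figures in the post assume a 720-hour month (30 days x 24 hr).
HOURS_PER_MONTH = 24 * 30  # 720

def monthly_cost(usd_per_hr: float) -> float:
    """Cost of running one GPU around the clock for a month."""
    return usd_per_hr * HOURS_PER_MONTH

print(round(monthly_cost(0.80)))   # 576
print(round(monthly_cost(11.10)))  # 7992
```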
SlightedMarmoset@reddit
You need sort and filtering my man.
sleepingpirates@reddit (OP)
Filtering is already there. Which sort do you think we need?
SlightedMarmoset@reddit
Well, just think about how a user is going to approach this: they have a use case in mind. If I want 80GB+ of VRAM and ComfyUI with a one-click install, can I find that?
And you need to rank by actual cost per hour, not by GPU with the actual price in fine print. If there is a way to change that, it isn't obvious, and if it isn't obvious, it isn't good UI.
sleepingpirates@reddit (OP)
I mean, we are ranking by actual cost per hour, but I can think about also supporting which ones have one-click installs for specific templates.
SlightedMarmoset@reddit
You seem to be ranking on cost per gpu hour, not total cost per hour.
sleepingpirates@reddit (OP)
Ah okay we can add that thanks for the feedback
indicava@reddit
This is actually very helpful. I do have some concerns though: I just did a search for an H200 NVL and got 2 results on your site; the second (RunPod) was significantly pricier than the current cheapest H200 NVL on Vast, which wasn't anywhere on the list. Any idea why it missed that?
sleepingpirates@reddit (OP)
We fixed this bug, thanks for pointing it out
sleepingpirates@reddit (OP)
Let me get back to you on this, I'll take a look at what's up with the B200
EbbNorth7735@reddit
No Google or Microsoft?
sleepingpirates@reddit (OP)
Nah
Pvt_Twinkietoes@reddit
Do they provide the same level of service? Stability? Resilience? What about data retention?
sleepingpirates@reddit (OP)
They are all different. I need to think of a way to show all of this information
PretendFox8@reddit
You actually have to benchmark their performance. You are getting virtualized GPUs not bare metal ones. They are not “identical”.
sleepingpirates@reddit (OP)
This is in the plan eventually as well
TeraUnit_Dev@reddit
Hmmm
sleepingpirates@reddit (OP)
HMMMMMMM
Icy_Pen_4690@reddit
IONOS offer NVIDIA H200 on-demand https://cloud.ionos.co.uk/cloud-gpu-vm - maybe you could include it in your comparison - there are currently lots available for use :)
sleepingpirates@reddit (OP)
Thanks, we'll add this
sleepingpirates@reddit (OP)
Update: built a deployment platform on top of this. You can now actually deploy GPU instances directly through DeployGPU, not just compare prices. Still early but it works. Appreciate everyone who gave feedback on the pricing tool, it directly led to this.
ResidentPositive4122@reddit
Could use a filter for on-demand vs spot. Prices now include spot, which is sometimes hit and miss (it really depends on what you want to do. If you have background processing tasks, spot works. If you need to train something long term, or you want to test stuff in interactive sessions, spot doesn't work).
sleepingpirates@reddit (OP)
Added this
AutomaticAbility2008@reddit
At least at Verda, when you're evicted from a spot instance the system disk is detached, so you don't lose any data. But obviously, if you want stability for long-term training, on-demand would still be the best even though it's more expensive.
sleepingpirates@reddit (OP)
Good callout, thanks! I'll work on adding this. Also working on adding disk space and net speed
KallistiTMP@reddit
Also infiniband/RoCE.
Lyuseefur@reddit
Yeah - the spot thing ... ugh. I've tried it a few times and if I can't have it for an hour (or some defined length) it doesn't work for me. As they say, evicted without notice and it's happened to me.
carl_peterson1@reddit
Glad we made the list, thank you for including Thunder Compute
paulahjort@reddit
There are abstraction layers for people who want to provision... https://pypi.org/project/terradev-cli/
DarkPrinceOfLight@reddit
Well done! Thanks!
Narwal77@reddit
I’ve been running workloads on GPUhub.com and the experience has been smooth. Fast deployment, solid availability, and straightforward container-based setup. Great option if you’re looking for cost-effective GPU infrastructure.
snapo84@reddit
thanks for putting this together... never heard of verda. strange thing about verda is they don't provide any pricing for traffic occurring on their servers (assume i download kimi k2 from huggingface, which is nearly 2TB) ... is that free on verda? because i couldn't find anything related to traffic pricing...
AutomaticAbility2008@reddit
Hi, I work at Verda and can confirm that downloads are totally free and same thing with uploads
snapo84@reddit
Hi, thanks for the reply.... do you have direct peering to huggingface? Because I intend to convert models and those use a lot of traffic (1 download, multiple uploads) ... do I really just pay the hourly price and nothing else?
AutomaticAbility2008@reddit
https://docs.verda.com/containers/tutorials/deploy-with-tgi-indepth
hope this helps
sleepingpirates@reddit (OP)
I’ll take a look into this but adding the amount of traffic/traffic cost is also on the road map
ClearML@reddit
This pricing spread is exactly why GPU cost optimization is becoming a control problem, not a hardware problem. Worth calling out though, tools like ClearML aren’t competing with those $/hr numbers directly since they don’t sell GPUs. They sit above providers like VERDA, Vast, RunPod, etc., and help you actually use the cheap GPUs consistently instead of accidentally burning expensive ones.
In practice, teams lose way more money to idle GPUs, people picking the wrong tier "just in case", and no visibility into who used what and why than they ever spend on orchestration. One misused H100 for a month (even at the cheap end) already outweighs the cost of most control-plane tooling.
The irony is that as GPUs get cheaper and more fragmented across providers, orchestration and policy become more valuable, not less. Without guardrails, people will keep paying $11/hr when $0.80/hr would’ve worked just fine. Just saying.
Major_Border149@reddit
The pricing fragmentation is real, but the operational cost of using cheap GPUs consistently is what actually burns teams. Failed starts, retries, people over-provisioning “just in case”: that's where the money leaks.
Once GPUs are fragmented, placement and guardrails become a control problem, not a hardware one. Totally agree.
robberviet@reddit
As usual: min and max are useless. Find median.
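On the median point: on an illustrative price list (made-up numbers), `statistics.median` ignores the outlier listings that dominate the min/max headline figures:

```python
# Median is more robust than min/max to outlier listings.
# Prices below are illustrative, not scraped data.
from statistics import median

h100_prices = [0.80, 1.99, 2.49, 2.85, 3.29, 11.10]
print(min(h100_prices))     # 0.8  (headline cheapest)
print(median(h100_prices))  # 2.67 (typical price)
print(max(h100_prices))     # 11.1 (headline priciest)
```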
FullstackSensei@reddit
How many of them have lower prices but no available capacity? For example: Lambda prices generally look decent, but I've yet to see any capacity available.
There are also other things, like how much system RAM, how much storage, networking fabric you get with the GPU. The big hyperscalers can generally give you large clusters, well into the thousands of GPUs, while smaller providers don't have the same level of networking and storage fabric.
The price you actually pay to the big players if you're a business is almost never the advertised price. All the companies I've worked at in the past 7 or 8 years have at least a 30% discount vs the advertised price, without any commitment requirements. If the business is willing to make long-term commitments (the minimum I've seen was 6 months), they get further discounts. With a 3-year commitment, I've seen prices go to 30% of the advertised price. If you factor in the storage and networking you have access to, it's not as big of a difference as it initially seems.
Of course, if you're an individual or small team/start-up looking for short term rentals to fine tune a model or train a small custom model, it makes no sense to consider a hyperscaler.
Pingmeep@reddit
Is that 30% in addition to the no-contract 30%, or the same 30% off advertised prices?
FullstackSensei@reddit
30% for having a contract (usually to spend a minimum of X annually) and another 30% for reserving resources long term. So 60% below advertised price.
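Worth noting the two 30% cuts can stack in two different ways, and the difference matters; a quick check of both interpretations (the list price is illustrative):

```python
# Two interpretations of "30% plus another 30%" off an advertised price.
list_price = 3.00  # illustrative advertised $/hr, not a real quote

# Additive: both discounts apply to the list price -> 60% off total.
additive = list_price * (1 - 0.30 - 0.30)
# Multiplicative: 30% off, then 30% off the remainder -> 51% off total.
multiplicative = list_price * (1 - 0.30) * (1 - 0.30)

print(round(additive, 2))        # 1.2
print(round(multiplicative, 2))  # 1.47
```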
Fentrax@reddit
Suggestion: include tracking of the actual cost of each GPU for direct purchase. It would be useful to see the crossover point where "You should have just bought one" begins.
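A rough version of that crossover math, assuming a ~$1,800 retail RTX 4090 (an illustrative figure, not from the post) against the $0.33/hr rental price above:

```python
# Rough break-even: hours of rental that equal the purchase price.
# Ignores electricity, depreciation, and resale value.
def breakeven_hours(purchase_usd: float, rate_usd_per_hr: float) -> float:
    return purchase_usd / rate_usd_per_hr

# Illustrative: a ~$1,800 RTX 4090 vs the $0.33/hr rental figure.
hours = breakeven_hours(1800, 0.33)
print(round(hours))           # 5455 hours
print(round(hours / 720, 1))  # 7.6 months of 24/7 use
```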
Ztoxed@reddit
I am new, why would a person use online GPUs?
I would think the ISP limit would be too great to benefit.
But I know nothing about it, seriously.
Emotional-Baker-490@reddit
You are in locallama, you have context clues.
Ztoxed@reddit
That doesn't help. Reference please.
Ztoxed@reddit
Gees down voting a question, thanks for the update and responses.
Available_Canary_517@reddit
These GPUs are very expensive and can't run on home electricity in many places. Cloud GPUs allow us to use them for our work (like running LLMs, which take a lot of GPU power), and we only pay the cloud provider for the time we used. So let's say a GPU is priced at 50k dollars but I only need it for a two-month project: I can rent it from the cloud per hour for those two months and save my money. The ISP limit is not a major issue in these workflows because data processing takes time, but the data moved from your phone or laptop to the GPU server (and vice versa) is not much, so the ISP is not a limiting factor.
harrro@reddit
You're not just using a random GPU directly with your computer - you have remote access to a full server that happens to have an expensive GPU attached to it.
So the bandwidth requirements aren't high - it's just whatever is input/output from the models you're running on them (the remote server downloads and runs all the software, and you just download the output, whether it's text / image / video)
nunodonato@reddit
to train models or do inference. Or do you have ways to pay for high end GPUs?
Intrepid_Card8950@reddit
ML Training - what else ;d
MelodicRecognition7@reddit
business idea: build a proxy website to resell the cheapest offers for an average price.
doomdayx@reddit
I think that already exists. Also, some sites quantize the model, so it isn't always apples to apples.