First-time GPU buyer. Got an RTX 5000 Pro. Was it a bad decision compared to two 3090s?
Posted by Valuable-Run2129@reddit | LocalLLaMA | View on Reddit | 97 comments
I’ve run models exclusively on apple silicon up until now, but wanted to up my inference game.
I bought a slightly used RTX 5000 Pro Blackwell for a bit more than twice as much as two 3090s.
I’ve read people saying that the 5000 doesn’t provide a big performance improvement over the 3090s, which is making me doubt my choice. But it’s also true that electricity where I live costs 0.40 euros per kWh. A 5000 Pro would probably burn a third of the electricity of a dual-3090 build. Right?
Also, if you have a 5000 Pro, what type of speeds do you get in PP and TG with qwen3.6 models?
segmond@reddit
Doesn't matter, you already bought it. You're also not telling us how much you paid, so it doesn't help. Two 3090s will cost $2000 here; I'd pick a 48GB RTX 5000 over them. However, RTX 5000s go for $5000. I'd buy four 3090s (96GB) over the 48GB card and use the extra $1k for PSU, cables, etc.
Vaguswarrior@reddit
Where the heck do you guys get a used 3090 for 1000? I've literally been looking for months in Canada and I can't find anything, even after currency conversion, under 1500
cleversmoke@reddit
I got my Strix RTX 3090 OC 24G for $840 USD in Vietnam right before the Qwen3.6 and Gemma-4 drop, luckily. Now, due to those two models, 3090s have jumped to $1000 for the Strix, TUF, and Suprim X, while the EVGA, Aorus, and Zotac can be found for $900 with enough patience. Barely anyone is selling them now though.
kevin_1994@reddit
eBay is like a 30-50% markup compared to Marketplace/Kijiji. I got all my 3090s by patiently waiting for local sellers to realise they aren't getting $1500 for their used 3090.
I got a 3090 for 600 CAD on FB Marketplace once. It was a super sketchy transaction lol. He wouldn't verify it works. Wouldn't even send a pic. But he claimed he was a doctor at the university and got me to pick it up from his office in the hospital. And it turned out he was an actual doctor who had apparently invested like 30k into crypto mining in 2022. He was selling like 8 of these. If only I had the money then lmao. OK sorry for going off topic.
In short, eBay sucks. Risk it IRL. It's more fun that way.
Numerous-Aerie-5265@reddit
lol, the sketchy marketplace deals are the best. I drove 40 mins to buy a 3090 for $400, it belonged to the dude’s brother who was in jail
samandiriel@reddit
Oddly enough I just picked one up for US $1K last week on eBay. Was from Edmonton AB. EVGA 3090 rtx ftw ultra. Runs beautifully, great benchmarks
Vaguswarrior@reddit
Hilarious as I'm in Edmonton
samandiriel@reddit
Hey there comrade! I originally hailed from Bonnie Doon/Whyte ave myself, before getting sidelined by a job offer in the US.
FWIW this is the ad I got mine from, and the guy said he sells them off and on - you might follow him and see if he has any more listed in the future. The one I got was grade A.
https://www.ebay.com/itm/127790141286
Boricuakris@reddit
I found a 3090 on eBay for $950
Vaguswarrior@reddit
This is what I see when I look.
Boricuakris@reddit
I put an offer in on the ones that say “or best offer”. That’s how I got it for $950
Fantastic_Tell_6787@reddit
From one of the many zero feedback accounts on there? 🤣
NotYourMothersDildo@reddit
Marketplace in Vancouver has them going for 1300+ cad.
Valuable-Run2129@reddit (OP)
Paid $4700
1 kW running 24 hours a day costs $4300 a year here. It's a factor I have to account for.
LTJC@reddit
There are advantages to going with the newer architecture: power usage, resale value, driver shelf life. Try to use models that fit in your VRAM and you'll be ahead of multi-3090 systems that aren't using NVLink.
Prudent-Ad4509@reddit
Well, almost. If you use smaller models, then multi-3090 can work fine in parallel without TP. 4x3090 under massive parallel use would be roughly comparable to one 96GB Blackwell, except for the power requirements, especially if you don't need prefix cache.
I've skipped a lot of caveats. We all know them. But small models are a special use case anyway.
SteppenAxolotl@reddit
FP8 & NVFP4
LTJC@reddit
Yup. I don't have a Blackwell setup yet, but I have both Ampere and Ada so far. I'm working on a second 4090 48GB and a 48GB Blackwell so I have three pathways for customers.
darktotheknight@reddit
Driver shelf life will be good. Nvidia is reportedly bringing back the 3060 for gamers this year (https://www.pcgameshardware.de/Geforce-RTX-3060-12GB-Grafikkarte-279277/News/Wiederauferstehung-verzoegert-sich-Juli-1530014/). I don't think they'll cut driver support for RTX 3000 anytime soon.
LTJC@reddit
Nice, 'cause I just put a P40 or some other 8GB card in my Plex server. Lol
Maybe they won't kill it soon.
rpkarma@reddit
Yeah preach. Blackwell beats Ampere pretty handily.
a_beautiful_rhind@reddit
Are you going to inference 24/7? The cards idle around 30W after they've been disturbed. Your Blackwell is probably the same.
epicskyes@reddit
Blackwell idle is 15w
a_beautiful_rhind@reddit
Even after you load/unload models? On linux my idle starts around 15w too, then it grows.
epicskyes@reddit
Depends on what your BIOS PCIe settings and dynamic link speed settings are set to.
a_beautiful_rhind@reddit
In my case I have to use the P2P driver, which maps a lot of memory to GPU memory. I turned on everything I could. The non-ReBAR 2080 Ti in the system idles much better.
phido3000@reddit
A reasonable price for a capable card.
3090s aren't getting any younger or faster, and 24GB is limiting. For a single-GPU setup it's way more capable. On value for money it's not as good, but it's in a different category. It will likely have decent resale value should you want to pass it on.
4-way GPU is always expensive. Does your current setup have a 2400W PSU and 4x triple-slot x16 slots? No? Well, that costs money. A bigger case, or some sort of janky bitcoin-miner setup with wires, extenders, splitters, etc.
People always say they would buy 4x3090s, but then what? Spend $1000 converting them to two-slot coolers? A 2400W PSU costs how much? Cabling? An EPYC motherboard and CPU and 512GB of RAM? How much is that? $3000?
Even then, 48GB of contiguous memory beats 2x24GB.
Also, there is nothing stopping you from having 1x RTX 5000 and 3x 3090s. Nothing at all. Or an RTX 5000 Pro and a 5090. Or a 5070 Ti. So the equation becomes more complicated.
For coding and running smaller models, or creative work, the 5000 is better, for sure. You will get higher throughput.
tomByrer@reddit
I own one RTX 3090 (& an RTX 3080), and would get the 5000 over 2x 3090s if I had the cash for it. You have access to NVFP4 quants I don't.
You can undervolt GPUs to lower power, reduce heat & wear & tear.
mister2d@reddit
You can limit the power of the GPUs to something manageable. I would. I prioritize VRAM over t/s.
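If it helps, here's a minimal read-only sketch using the NVML Python bindings (pynvml) to see each card's current limit, allowed range, and live draw. Actually setting the limit needs root (e.g. `sudo nvidia-smi -pl 250`; the 250 W there is just an illustrative number, not a recommendation for any particular card):

```python
# pip install nvidia-ml-py  (provides the pynvml module)
import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    name = pynvml.nvmlDeviceGetName(handle)
    # NVML reports all power values in milliwatts
    limit = pynvml.nvmlDeviceGetPowerManagementLimit(handle)
    lo, hi = pynvml.nvmlDeviceGetPowerManagementLimitConstraints(handle)
    draw = pynvml.nvmlDeviceGetPowerUsage(handle)
    print(f"GPU {i} {name}: limit {limit/1000:.0f} W "
          f"(allowed {lo/1000:.0f}-{hi/1000:.0f} W), drawing {draw/1000:.0f} W")
pynvml.nvmlShutdown()
```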
LargelyInnocuous@reddit
Yup, anything over 20 tk/s is usable, 50+ is great, and 100+ is amazing. My target is 50 tk/s with good models like qwen3.6 27B at Q8+. That being said, I have an RTX 6000 Pro on loan and it's amazing.
king_of_jupyter@reddit
I crunched the numbers recently and if you compare int8 on 3090 vs nvfp4 on Blackwell, it is more or less equal
oxygen_addiction@reddit
There are few "proper" nvfp4 models in that parameter size range.
king_of_jupyter@reddit
Eh, fp4 is the future, and the future is now.
Everyone is running frankenquants anyway.
Current_Ferret_4981@reddit
Even fp4 isn't nearly the speed of nvfp4 on true (server) Blackwell
randoomkiller@reddit
you never run it 24/7; the actual usage probably costs less, something like $500-800 a year
Client_Hello@reddit
Bad math there. Dual 3090s may draw 300W more than the 5000 Pro, you will not run them 24/7/365, and your nighttime rates should be lower than daytime. The difference is closer to $430 per year.
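Rough version of that math, assuming a 300 W delta, about 10 hours of real load per day, and OP's 0.40/kWh rate (all three numbers are assumptions to swap for your own):

```python
rate_per_kwh = 0.40       # OP's quoted electricity price
extra_watts = 300         # assumed extra draw of dual 3090s vs a 5000 Pro under load
load_hours_per_day = 10   # assumed hours of actual inference per day

extra_kwh_per_year = extra_watts / 1000 * load_hours_per_day * 365
print(f"{extra_kwh_per_year * rate_per_kwh:.0f} per year")  # ~438, close to the $430 above
```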
MarcusAurelius68@reddit
Are you pegging the GPU 24 hours a day though? When idle it should consume a LOT less.
Antoniethebandit@reddit
I picked up two EVGA RTX 3090 Ti Hybrid (water-cooled) cards, used but still with one year of warranty, for 1800 USD, but it was pure luck indeed.
starkruzr@reddit
llama.cpp is probably better at doing TP for one model and a single requestor across 4 cards than vLLM, right? (just asking bc of your flair)
wu3000@reddit
I don't have exact numbers, but the GPU is really fast. 72 or 96 GB would be nicer, but that comes with a price tag. vLLM is much better than llama.cpp in my coding use case with Blackwell. Qwen 3.6 35b is incredibly fast (220 tg/s); the 27b in FP8 needs some parameter fiddling (spec decoding with n=3, 80 tg/s). The 27b is my daily driver now, and after 600 million tokens generated, I am still happy with the purchase.
Valuable-Run2129@reddit (OP)
When it arrives I’ll definitely pm you to ask for advice if you’re ok with it!
codehamr@reddit
Obviously more bandwidth and lower VRAM, so if your LLM fits, great deal!
alrojo@reddit
What Apple silicon are you using? The new M5 Max with 192GB unified memory is quite potent.
Valuable-Run2129@reddit (OP)
M1 ultra 128gb, but prompt processing was atrocious
alrojo@reddit
128gb is great! There has been significant development in both CPU and GPU between M1 and M5. You might also have to dabble a little in MLX for the speedups. However, with the new CEO of Apple I'm confident it will be a good investment of time.
More-Curious816@reddit
Yes, he is a hardware guy, and they're already building a beast of a server-grade AI accelerator chip, Baltra. Hopefully that knowledge trickles down to the consumer hardware, or at least the workstation hardware (their Ultra edition).
I'm totally fine with the Studio version getting double the size and double the power draw in exchange for more raw compute. It's supposed to be a desktop, not a portable computer a person takes to Starbucks.
unjustifiably_angry@reddit
The 3090 (and 4090, and 5090) owners are coping; you made a good purchase that puts you in the category of being able to competently run many of the best mid-sized models, beyond which model sizes get much, much larger with rapidly diminishing returns in output quality. That's a fantastic card for Qwen3.6-27B or 35B.
There are many ways a split VRAM pool is limited where a single large VRAM pool isn't. For many tasks aside from LLMs, you need all your VRAM in a single card. Image generation, video generation, etc.
The situations in which two cards are equivalent (or better) are often on paper only, such as greater concurrency, which serves little or no practical value. Virtually all LLM tasks are single-worker. This will eventually change, but probably not anytime soon, at least not in an especially useful way.
Each additional card you add decreases output speed versus a single card with the same performance but more VRAM; the two GPUs don't directly share the workload, they hand off the job when it's half done.
If AI is important to you, then if at all possible I would suggest not using the 5000 Pro as your full-time display GPU. Keep its VRAM completely clear so you know down to the megabyte exactly how much space you have to work with. You spent a lot of money to get 48GB of VRAM, don't waste any of it rendering your desktop, browser, etc.
sleepy_roger@reddit
No one has asked but which 5000 pro? Assuming the 48gb, but the 72gb also exists.
For $5k for the 48GB one, honestly I'd opt for 2 5090s or 4 3090s.
For $7k for the 72GB version... I'd probably still opt for 5090s.
Valuable-Run2129@reddit (OP)
The 48GB for $4700. It's just $1100 more than a single 5090 atm.
sleepy_roger@reddit
Ah ok, yeah, I'm spoiled and got my 5090s for MSRP (2k), but now the cheapest is like 3,200. You did well.
I still prefer the 3090s, but you can't go wrong either way. The only thing to keep in mind is that if you go to buy another, you might as well just sell the one you have and go for an RTX 6000 Pro, since you're getting more performance along with more VRAM.
relmny@reddit
3200? I can't find a 2-slot one for less than 3900...
sleepy_roger@reddit
$3200 is the cheapest I have available at Microcenter. The only 2-slot 5090, to my knowledge, is the 5090 FE, but those don't really exist at MSRP unfortunately.
Organic-Thought8662@reddit
*Raises hand* I'm a silly-billy that bought an RTX PRO 5000 48GB new.
Do I regret it? No.
I have it paired with a 3090 in an AM4 system.
For models that fit entirely in 24GB, the PRO 5K wipes the floor with the 3090 in PP and TG (though with TG on something like a Q8_0 quant, it's a little closer).
Below are direct benchmarks done with the latest build of koboldcpp as of 3 May 2026 — [3090 and PRO 5000 benchmark screenshots].
The painful part when doing agentic coding is mainly the PP speed, and at nearly double the throughput it's a very nice upgrade, considering it's only a 300W card vs 350W for the 3090.
michaelsoft__binbows@reddit
i feel absolutely spoiled over here with 3x 3090s acquired at $600 each and 5090FE for $2k from best buy (with a warranty and everything)
abnormal_human@reddit
I would much rather have a 5000 Pro than 2 3090s.
michaelsoft__binbows@reddit
would rather have one than three 3090's...
Bootes-sphere@reddit
The 5000 Pro is genuinely a solid choice for local inference. Better memory bandwidth, tensor performance, and it'll handle larger models more efficiently than dual 3090s despite lower raw FLOPS on paper. That said, real-world gains depend heavily on your workloads (batch size, model size, precision). For pure single-inference speed on smaller models, you might see the 3090s competitive, but the 5000's architecture wins on scaling. Have you benchmarked it yet on your typical models? That'll give you the clearest answer on whether the investment paid off for your use case.
Eyelbee@reddit
The RTX 5000 Pro is basically a scam at current prices. You could get four R9700s. I understand 3090s got too expensive, but I honestly still can't justify it.
And no, they would not burn 3x the electricity; you could set a power limit of 250W per card at no loss, which would be like 500W total compared to 300W.
I would simply just buy one r9700 unless you really need the extra 12GB for your specific workflow, you could pretty much do the same stuff with it.
DeepOrangeSky@reddit
Does this mean that the 3090s can be power limited more severely than the 5k can? Like, can he also just power limit the 5k to a similar degree as what you suggested about the 3090s, for a similarly minimal performance drop in equivalency? Or are the 3090s able to be power limited a lot more, ratio-wise per amount of performance drop compared to the 5k?
Eyelbee@reddit
Exactly. The 5000 Pro is engineered to be power efficient and already sits at the absolute sweet spot; if you drop the power limit, it'll hit performance a lot. On the 3090, you get the best t/s at around 270W to begin with; at 250W you're sacrificing like 1-2%, and you can probably go lower too.
relmny@reddit
I think it was a good decision (although I'm partial because I'm trying to decide between the pro 5000 and a 5090)
Power consumption, cooling, newer architecture, being able to run bigger Diffusion models, etc make it a good decision...
Valuable-Run2129@reddit (OP)
Thanks for taking the time to write this comment. It's the type of information I needed, and it's comforting.
I think it was the right decision in the end.
Organic-Thought8662@reddit
As a PRO 5000 owner, I was hesitating for weeks before I finally bit the bullet, mainly because there seem to be very few reviews or benchmarks of the card online. Hope my post from earlier helped 😄
PassengerPigeon343@reddit
I’d go with the RTX 5000 Blackwell any day over my 2x3090s. It’s on my watch list to hopefully pick up one some day. Newer architecture, higher memory bandwidth, more efficient. It’s excellent across the board. A good buy on your part.
Clear-Ad-9312@reddit
Yeah, but for double the price you could have gotten an RTX Pro 6000, which is why the 3090 is still a great buy: decent performance at half the cost of the newer GPUs. The consumer "Blackwell" GPUs use an instruction set similar to the sm89 one the RTX 4090 has; the performance gains that make Blackwell powerful are only in the B200 and B300.
I guess you already bought it. You will stick with it for a really long time, which is at least a plus, but that nagging feeling about the price and waiting for something better will always be in the back of your mind.
cicoles@reddit
The power saving and cooling are real. The RTX 5000 Pro is good. I also sold my dual 3090s (with SLI) for an RTX 6000 and everything runs a lot cooler.
MentalStatusCode410@reddit
It was a very fortunate and wise decision - you have native FP4 acceleration.
It will be approx 7x faster when running an optimised NVFP4 model.
Thrumpwart@reddit
Sweet GPU. The twin 3090's narrative is being driven by people trying to unload their 3090's on Ebay I suspect.
MexInAbu@reddit
Unfair. It's still the cheapest 48GB CUDA option available.
Thrumpwart@reddit
CUDA ain't the moat it used to be though. The 7900XTX is a killer GPU.
MexInAbu@reddit
Perhaps for LLM inference. But if someone wants to delve into AI and ML beyond that they are swimming against the current.
a_beautiful_rhind@reddit
I wish.. the prices keep going up. P40s/P100s crashed I think.
munkiemagik@reddit
The RTX 5000 Pro is objectively better than 2x3090 for my use, but is it +$3000 better? I definitely wouldn't see a worthwhile return for that extra spend. But then not everybody is me.
Hot_Turnip_3309@reddit
You're GOOD! Because of the cost of electricity, it'll work in your favor. You can even undervolt it a lot, get similar performance, and run it more guilt-free. Good job!
Long_comment_san@reddit
No, it was a correct decision. You get a lot of VRAM with NATIVE 4 bit support. And it's a lot less hot and loud.
4 bit is a big deal nowadays. 6 months ago I did recommend 3090s myself and said that "hey you may want to consider Blackwell, 4 bit is gonna be a big deal..."
DeepOrangeSky@reddit
Well, you're definitely sitting pretty. Even right now, I think it is significantly better than the alternative setup. But as local vid gen (and whatever other non-LLM stuff of a similar nature that can't be split so nicely) keeps getting stronger, I think the decision will become even more and more correct, rather than less correct, over time.
Btw, how does it work when you buy a used card in regards to warranty, if it breaks, if it is still within the timeframe? Do you still get to send it in, or if you are the 2nd owner, are you just totally screwed, even if it happens while it is under warranty? I've never bought used hardware before, since I'm more paranoid than a schizo OD-ing on Datura and bath salts combined, if such a thing is possible, but I've been thinking about maybe doing some meditation and saying "ommmmm" and all that zen shit until I calm the fuck down enough to maybe buy some used cards and try building a rig and not just permanently be a mac n00b forever, but I don't really know anything about used hardware yet.
I'll probably spend so much time looking for tiny hidden microphones, or hidden malware piggyback chips or whatever that by the time I snap out of it and look up and am like "wait, wtf was I even doing? Oh yea, trying to set up an AI rig or something" it'll already be post-AGI and I'll be like, some elderly guy by then just floating around in a blue orb like Dr. Manhattan wondering wtf happened while I was busy under my rock with my magnifying glass that whole time.
Sorry, I'm a bit sleep deprived and I forget where I was going with any of this, but something about used warranties or some shit, lol
f5alcon@reddit
Warranty vs no warranty, and you could add a second 5000 Pro later. Your choice is good.
I-cant_even@reddit
5000 Pro always the better choice if the price is equal.
LA_rent_Aficionado@reddit
Not always; running model- or tensor-parallel across 2x 3090s for smaller models would likely be faster, both single-stream and with concurrency.
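As a rough illustration of the tensor-parallel route, here's a minimal vLLM sketch that shards one model across two cards; the model name is just an example, not what anyone in this thread is running:

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-14B-Instruct",  # example model that fits comfortably across 2x 24 GB
    tensor_parallel_size=2,             # split the layers across both 3090s
)
params = SamplingParams(max_tokens=256, temperature=0.7)
out = llm.generate(["Explain tensor parallelism in one paragraph."], params)
print(out[0].outputs[0].text)
```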
traveddit@reddit
The RTX 5000 has more bandwidth, so does tensor parallel with NVLink overcome that difference despite the overhead?
LA_rent_Aficionado@reddit
For training the PCIe bandwidth may pose problems, but not for inference, I would suspect.
I-cant_even@reddit
That's fair; I'm assuming OP wants to run as large a model as possible.
EbbNorth7735@reddit
2x 3090s is the max expansion in many systems. Now you have the ability to add another 5000, or a 3090, when you catch the LLM bug.
kaliku@reddit
If you're not making money with either setup, the spend is only hobbyist spend. So whether it's 'worth' the extra money or not is a stupid question, because even the 2x3090 isn't worth it in a utilitarian sense.
But as you got it for a hobby, the 5000 is more powerful and flexible than the two 3090s, and it gives you a nicer upgrade path.
So you did well, I'd say. Provided it's not your last money. Good for you, enjoy it.
hurdurdur7@reddit
I agree. If he feels happy with the 5000 - by all means, go for it.
dinerburgeryum@reddit
From my chair, the 5000 Pro is the far, far better choice.
Equivalent_Job_2257@reddit
There are pros and cons. You are not stupid. Some of the things I (a multi-RTX-3090 owner) cannot do: put all of the cache onto a single GPU, or save for another one and have 96GB of VRAM in two slots.
CreamPitiful4295@reddit
You made the right choice on all counts. Faster. Single memory pool. Less expensive to operate. Relax.
Double_Cause4609@reddit
Hm...
A) A single RTX 5000 Pro has about ~30% more memory bandwidth than a single 3090. If you're running in the LCPP ecosystem (LlamaCPP, Ollama, LM Studio, etc.) you generally don't really get a speed improvement from multiple GPUs (you just pray not to lose speed from sharding the model), so you'd expect the single GPU to be a bit faster, particularly for single-user use.
B) The 5000 Pro is more efficient in electricity, full stop.
C) The 5000 Pro supports better quantization schemes. If you ever want to branch into vLLM (totally viable for this GPU; you can run 32B coding agents at 8-bit quantizations like FP8), you get pretty large effective speedups (and juicy NVFP4 support); see the sketch below.
D) The RTX 5000 Pro has a better compute model; it'll scale to highly compute-bound scenarios a lot better than the 3090, I believe, and two 3090s don't perfectly compose to offset this.
Overall, the 5000 Pro has a lot of advantages and while the RTX 3090 route can work, it has its own disadvantages. It doesn't really matter which route you went; you'd have your own advantages and disadvantages on each (the grass is always greener), but I'd say from my perspective you picked a good option and I may honestly pick up one myself fairly soon here.
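For point C, a minimal vLLM sketch of the single-GPU FP8 route; the model name and the memory settings are assumptions for a 48GB card, not anyone's exact benchmarked setup:

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-Coder-32B-Instruct",  # example 32B coding model
    quantization="fp8",                        # on-the-fly FP8 quantization
    max_model_len=16384,                       # assumed context budget to keep the KV cache within 48 GB
    gpu_memory_utilization=0.90,
)
params = SamplingParams(max_tokens=512, temperature=0.2)
out = llm.generate(["Write a Python function that merges two sorted lists."], params)
print(out[0].outputs[0].text)
```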
VoiceApprehensive893@reddit
yes its a bad decision now give me it
__JockY__@reddit
Great card. It’ll win against a pair of 3090s more than it’ll lose, it’s 300W vs 750W, it’s quieter, cooler, and fits in a smaller space. You made the right choice.
sputnik13net@reddit
I like the simplicity of a single card when it's viable. The detail people leave out about multi-GPU setups is that there's a setup and configuration cost.
henk717@reddit
I have dual 3090s; for Qwen3.5-27B I get 30 t/s generation speed and 1082 t/s prompt processing speed.
My system doesn't lend itself well to llama.cpp's tensor parallelism mode though, so this is the single-GPU performance.
Signal_Ad657@reddit
Raw performance one versus the other? You made the right call. If the economics don’t bother you the performance won’t.
Herr_Drosselmeyer@reddit
Overall, the 5000 PRO will come out ahead in the majority of scenarios.
tecneeq@reddit
You are better off with the 5000 Blackwell.
Icy-Pay7479@reddit
3090 is a meme but that Blackwell card isn’t 5 years old.