RTX 5090 rumored to have 1.8 TB/s memory bandwidth
Posted by TechNerd10191@reddit | LocalLLaMA | View on Reddit | 231 comments
As per this article, the 5090 is rumored to have 1.8 TB/s of memory bandwidth and a 512-bit memory bus - which would make it faster than any professional card except the A100/H100, which use HBM2e/HBM3 with ~2 TB/s of bandwidth on a 5120-bit bus.
Even though the VRAM is limited to 32GB (GDDR7), it could be the fastest card for running any LLM under 30B at Q6.
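For what it's worth, the rumored figure lines up with 28 Gbps GDDR7 modules on a 512-bit bus (the per-pin data rate is the assumption here):

```python
# bandwidth = bus width * per-pin data rate; 28 Gbps is the rumored GDDR7 speed
bus_bits = 512
gbps_per_pin = 28
print(bus_bits / 8 * gbps_per_pin)  # 1792 GB/s, i.e. ~1.8 TB/s
```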
CountPacula@reddit
Wake me up when they actually put a useful amount of memory on the card. Two whole generations and only eight GB more memory? I don't care how fast the memory is, that's not enough of it, especially when you can get multiple 3090s for the same price as a single 5090.
ReasonablePossum_@reddit
They will be forced to put more VRAM in there once most games start intensively using LLMs and other AI as part of their regular functioning.
Sadly, game devs are moving slowly here.
evia89@reddit
I don't see AI in games (DLSS is not AI) for at least the next 10 years. AI will be used to help create games, but never to run on user hardware.
ReasonablePossum_@reddit
Never is a very strong word there. Nvidia themselves are creating the tech that will push this.
10 years, sure. Probably a lot less though - the new devs who got their education in the age of AI will be implementing the tech in the next 2-3 years. But again, stuff is slow :(
cobbleplox@reddit
Games are just very tricky in that regard. The GPU the player has is already fully in use by the regular game. And I would not trust even some 30B to be reliable in whatever little feature I can construct around it. Response times are also completely awful for the instant reactions required in games. And and and and.
TheTerrasque@reddit
It would have to be a tiny model, like 3B maybe. Perhaps if you fine-tuned one a lot and put some guard rails on it, it wouldn't be completely batshit cuckoo.
cobbleplox@reddit
Again, this is getting very close to "what's the point". Because first you have to make a proper finetune and the related dataset, and then you are confronted with a rather stupid LLM that will mostly parrot the training data anyway - and that is probably even the point. It just sounds so pointless to begin with. The day of these things will come, but it's when you can actually give them the agency you would integrate them for in the first place. And that day is not here. At least that's my opinion. It's also why, to my knowledge, you can't even put something on Steam where a user can freely interact with an AI.
TheTerrasque@reddit
I've been toying with the idea, and yeah... what's the point at that point? A 30B model - who could even run that game? "Oh, you need two graphics cards in your PC, and one needs to be an x090" doesn't seem like a good market.
To make something that could actually be played by a decent chunk of gamers, you'd have to have a tiny model. With finetune and some guard rails you might get something that's a bit beyond dialogue trees, but frankly not much beyond. Might be enough to add a bit of flavor to the conversations.
cobbleplox@reddit
Guard: "Sure, I can do that! The guard shouts "Who is there?" - Let me know if you need anything else."
TheTerrasque@reddit
I did some quick experiments with some small models and a simple prompt, and gemma2 2b has some nice flavor, but doesn't follow instructions well. Nemotron-mini followed instructions much better, but was very dry.
Here's gemma2: [prompt and example conversation omitted]
It's not terrible, but not great either. A small model gets easily confused, and this one seems to forget the facts. Such small models also get confused easily if the prompt is too big. Still, I don't think it's impossible. This model takes about 2.6GB of GPU RAM, and still performs kinda meh - not really usable.
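For anyone curious, a minimal sketch of the kind of setup I mean, assuming a local Ollama server with gemma2:2b pulled - the endpoint and payload follow Ollama's /api/chat convention, and the guard prompt itself is purely illustrative:

```python
import requests

SYSTEM = (
    "You are a city guard at the north gate of a medieval town. Stay in "
    "character and keep replies to one or two short sentences. You only know "
    "what is in this prompt: there is a curfew at sundown, and a thief stole "
    "the baker's cart yesterday."
)

def guard_reply(history):
    # history: a list of {"role": "user"/"assistant", "content": ...} turns
    r = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "gemma2:2b",
            "messages": [{"role": "system", "content": SYSTEM}] + history,
            "stream": False,
        },
        timeout=60,
    )
    return r.json()["message"]["content"]

print(guard_reply([{"role": "user", "content": "Good evening. May I pass?"}]))
```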
GetOutOfMyFeedNow@reddit
Ironic and funny that the Guard literally became the AI chatbot at the end 😂
ReasonablePossum_@reddit
Story progression decisions and changes depending on user input and actions can be placed on a small LLM; optimization can come from the new real-time upscaling; future VR rendering can be managed via GPU-dependent real-time apps; and graphics like particle/physics simulations can be managed there as well.
I mean, lots of stuff that already exists on paper or at the prototype-testing level will use VRAM that, as you mentioned, is currently fully devoted to running the game itself, and it will be gradually implemented out there.
cobbleplox@reddit
See, one starts to lock it into rigid structures so it can't fuck up. Isn't it interesting how that immediately sounds like something you could do without any LLM?
I guess you could do a more modern version of these adventures where you have to enter text like "take soap", leading to perfectly predefined stuff, sure. But that's not what anyone expects from the groundbreaking AI-games, I would say.
ReasonablePossum_@reddit
It would be a lot cheaper on the dev side to have it done by an LLM rather than programming everything by hand. Plus this would leave a lot more creative space instead of limiting it to the man-hours that were budgeted for it.
cobbleplox@reddit
Integrating an LLM into a proper game - where you already had to lock it down in a rigid system to only do stuff you planned for - is a lot of things, but certainly not a time saver. And if you consider the missing 1000 lines of ifs to be so much work, and you think so highly of LLMs, you might as well let an LLM write them.
ReasonablePossum_@reddit
I'm not talking about 1000 ifs (that's a simple, boring, classic tree). That's my point: 10-20x that amount. And I'm not talking about the devs integrating the LLM, but about one living on the GPU, used by all games equally through a standard framework.
cobbleplox@reddit
The 1000 ifs were sort of a metaphor. And the LLM gets onto that GPU by the devs integrating it into the game.
ReasonablePossum_@reddit
It's a waste of resources that way though. It's a lot more effective to have a native LLM on the GPU that's optimized and ready to accept certain types of problems.
Standard-Anybody@reddit
I know for certain that AI is being used in games right now that you almost certainly play. Just for not what you expect. There are models running in games that do things like show recommended items to buy, etc.
evia89@reddit
Like in LoL and Dota? I think that runs on the server.
satireplusplus@reddit
Blockbuster games consistently push the state of the art. People bought new RTX cards to play the first raytracing games. They will buy nvidias 6090 plus or whatever to play the next blockbuster game with AI characters in it. Realistically, games using LLMs will also have a "run it in the cloud" option until large VRAM cards will be the norm.
Desm0nt@reddit
It's good that game devs are moving so slowly. I don't want to pay $2k+ every year just to play new games. If top-tier cards became 1-2 year cards at such a high price, it would be an even worse situation than we have now (and it would make all low- and mid-tier cards almost useless trash even for gamers, which is probably untenable).
Persistent_Dry_Cough@reddit
The functionality would improve. You could still play games at the current level of functionality with the older card.
IntelVEVO@reddit
Nvidia wants you to fork out 10k for the RTX pro 6000 96gb
CubicleHermit@reddit
This is a gaming card. Later in the year, you can safely expect there will be a much pricier workstation version, probably with 64GB.
Rude_Assignment_5653@reddit
Well we are in the darkest timeline lmao
strawboard@reddit
I think VRAM is the most scarce resource in the universe which is why they can’t put more in video cards.
It’s extra funny because there is huge potential to use AI in video games for all sorts of things, like NPC dialog and dynamic texture and model creation.
Nvidia being stingy even with their highest end card is infuriating.
Amoral_Abe@reddit
Nvidia is likely being stingy because of AI. They are the primary company selling super high end AI cards for 10s of thousands of dollars each. They have to be careful not to cut into that market since that's where their big money is made.
b0tbuilder@reddit
You would think this would push AMD to change the game by making more VRAM available - it would be far more practical in RPGs for character dialog and more immersion. Unfortunately they too are chasing the datacenter margin dragon. The only way this changes is if enough bleeding-edge capacity becomes available at the foundries to split production: datacenter parts on a more advanced process node, consumer GPUs on an older node that is considerably less power efficient. That's a simple thing to suggest and a far more difficult thing to execute on, however.
adrenalinnrush@reddit
Honestly, it's good that they're clearly separating the market. Otherwise it would be like the crypto mining shortages all over again. Do you really want that?
strawboard@reddit
Yes, we know. Everybody knows, and we need to call them out on it as much as possible.
b0tbuilder@reddit
You are very much correct. The thing that makes VRAM as rare as unicorn blood is the fact that adding VRAM creates competition with datacenter hardware. It is super annoying because in many cases more VRAM could be added without massive cost. We are all beholden to datacenter product margins.
entmike@reddit
Not to defend Nvidia, but the consumer/prosumer line of GPUs will never satisfy all AI use cases we wish. I for one am happy to have a 32GB ceiling on a single GPU over the 24GB ceiling, assuming it doesn't cost as much or more than the A6000 I've been looking at the last 2 years lol
lolzinventor@reddit
For some applications like Stable Diffusion it's not as easy to split a model over multiple GPUs. That makes the extra 8GB worthwhile imo.
OutOfBandDev@reddit
Just making a larger card will not fix the issue. Forcing people to use fewer resources is where we get real improvements. Why do you think DeepSeek-R1 is revolutionary?
SocialDinamo@reddit
I completely feel where you’re coming from, but it wasn’t until I started playing with P40s that I realized it doesn’t matter how big the VRAM is if you are running those bigger models super slowly.
Persistent_Dry_Cough@reddit
In fact, bigger VRAM is the ONLY way you will improve inference speed on the big models. Widening the bus to the maximum possible should be a given - bus width depends more on the configuration of the chips on the board than on a costly-to-manufacture memory controller upgrade. The most important issue with a GPU is that if you have to swap part of the model into system RAM, or god forbid NVMe, you're screwed on inference speed.
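The rule of thumb behind that, as a sketch (single-user decoding; numbers illustrative): every generated token has to stream the whole model through the memory bus once, so bandwidth sets the ceiling on tokens per second:

```python
def max_tokens_per_sec(bandwidth_gb_s, model_size_gb):
    # Each decoded token reads every weight once, so bandwidth is the ceiling;
    # real throughput lands below this.
    return bandwidth_gb_s / model_size_gb

print(max_tokens_per_sec(1792, 20))  # rumored 5090, ~30B model at ~5 bpw: ~90 t/s
print(max_tokens_per_sec(100, 20))   # same model in dual-channel DDR5: ~5 t/s
```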
katiecharm@reddit
I know, this is really damn sad. There should have been 48GB minimum in that card, and they could have even surprised us with 64GB.
barefoot_twig@reddit
Nvidia specifically wants to make three different types of people pay three different types of money. They want the AI people to pay the most, Crypto to pay the middle and gaming to pay the least. The problem is that they must create products that, for example, cater to gamers, while also not eating into their AI cash cow. If a 5090 could handle AI at scale, rest assured Meta would stop paying $40k for H100s (or whatever).
The low memory is intentionally freezing us out for exactly this reason.
satireplusplus@reddit
Until a competitor comes along and offers more VRAM for less money. AMD and/or Intel need to get their shit together, now's the time to eat into NVDA's moat.
Massive-Question-550@reddit
True, you can see the weirdly matched behaviors of these companies, which makes you think they either have some sort of agreement going on, or they all share the exact same mentality, or one company acts as the trend leader and everyone just follows.
robogame_dev@reddit
From a raw memory perspective, $6k will get you a Mac M2 Ultra with 192GB of unified memory - it’ll be slower than a 50XX setup for models that the 50XX can fit in its VRAM, but a lot faster for models that the 50XX cannot.
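A back-of-envelope comparison using the bandwidth-bound rule of thumb (all numbers approximate: ~800 GB/s for the M2 Ultra, ~100 GB/s for dual-channel DDR5 system RAM):

```python
model_gb = 100                 # a model far too big for 32GB of VRAM
print(800 / model_gb)          # M2 Ultra, all in unified memory: ~8 t/s

in_vram_gb = 28                # what fits on the 5090; the rest spills to DDR5
t = in_vram_gb / 1792 + (model_gb - in_vram_gb) / 100
print(1 / t)                   # 5090 + spillover: ~1.4 t/s
```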
Massive-Question-550@reddit
Same thing with CPUs. Why the hell are we stuck with dual-channel memory? Finally laptops and all-in-one workstations are increasing it with 8-channel memory, so the days of overpriced Threadrippers will be gone, and maybe we can get quad-channel memory as a standard.
Billy462@reddit
I'm not sure anymore. The datacentre stuff by this point is so differentiated that if you're seriously doing model training you need all the additional stuff like interconnect. More VRAM on a 5090 wouldn't change any of that.
For inference a bunch of other stuff is coming online and that might just move away from GPUs completely quite soon (at scale).
Igoory@reddit
Yeah, the data center stuff also has a much lower energy consumption than the gaming GPUs.
fullouterjoin@reddit
Power matters not because they're running them 24/7 - it's that they're power-limited in their DCs and can't exceed the HVAC limits. If they could simply trade money (heat/power) for more compute, they would.
dantonthegreatdanton@reddit
This is the answer: HPC clusters, DCs, and private clouds buy infrastructure by the BTU and by energy requirements.
I just bought 16x h200 for my private cloud and use 4090s for my personal rigs. The lack of interconnect and proven solutions for a larger cluster from consumer cards is a hard pass.
The tdp difference is real and breaks the wallet at scale.
1 megawatt of capacity gets you roughly 201TB of VRAM in an H200 cluster, and about 53TB in a 4090 or 5090 cluster at a fraction of the performance.
That's not including any other equipment needed to make it work, in which case the VRAM number for the 4090/5090 drops much faster than for the H200.
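The arithmetic, roughly, using card TDPs and VRAM alone (ignores CPUs, networking, and cooling overhead):

```python
def vram_tb_per_megawatt(tdp_w, vram_gb):
    return 1_000_000 / tdp_w * vram_gb / 1000

print(vram_tb_per_megawatt(700, 141))  # H200 SXM: ~201 TB
print(vram_tb_per_megawatt(450, 24))   # 4090:     ~53 TB
print(vram_tb_per_megawatt(575, 32))   # 5090:     ~56 TB
```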
smflx@reddit
Did you buy H200 SXM or NVL? Could you let me know the prices?
I waited for the B100, but it seems it's not available in PCIe card form. So I'm considering the H200 too.
evia89@reddit
Undervolting, limiting TDP, and overclocking VRAM takes a few minutes...
stylist-trend@reddit
For the least energy consumption, you could also just completely unplug the GPU.
Otherwise, I assume Igoory wants energy efficiency, and you're not going to get anywhere near H100 efficiency by just downvolting and overclocking.
great_waldini@reddit
Curious if you could share some examples?
ozzie123@reddit
This. Even if the 5090 were 48GB, these AI companies would not suddenly buy 5090s in bulk, because they need multiple GPUs and the inter-GPU bandwidth is not there.
Desm0nt@reddit
If Nvidia made something like a 4090 Super Ti / 3090 Super Ti with 48GB but old GDDR6 (not even 6X), companies would not be interested in it due to the slow memory (compared to HBM2) and high power consumption. But as a single card in a gamer PC for home use, it would be reasonably good. And it could be produced in limited quantities on the old hardware remaining in stock and sold at an increased price (it's limited!) but still cheaper than new.
At current 5090 prices, in 1-2 years it will be more interesting and profitable to pick up an old used A100 80GB with HBM2e for the same money - more (and still fast) memory with reasonable power consumption and heat dissipation.
barefoot_twig@reddit
You're right that Meta wouldn't, but some companies might. Take that number to 64GB and it's even more likely. Up to 80GB... even more.
OnedaythatIbecomeyou@reddit
I dunno honestly. I agree somewhere around 64GB onwards, but I'm not convinced at all that 48GB would be a business-model-breaking decision. Surely the only businesses that would do this would be 60-year-old transport company directors who hand the 5090 to their web developer and think he's gonna produce road autonomy haha.
jimmystar889@reddit
48GB would certainly pull buyers from the A6000 down to the 5090.
OnedaythatIbecomeyou@reddit
But by how much, are you speculating?
Is there any AI-specific performance nuance that balances out the GDDR6 vs GDDR7 difference between the two?
Wouldn't there be massive issues sourcing them, driving scarcity even further up, with scalper prices eroding the bang-for-buck argument?
All three are questions, not arguments. I have only consumer AI experience and ollama installed lol.
Also it's a 4, going on 5 year old card though, it's reasonable lol.
jimmystar889@reddit
I meant the A6000 Ada. 3 years old, but still.
killver@reddit
You can definitely train more cost-efficiently on those 5090s, even at large scale. The only thing holding some back would be the ToS, which disallows this to some degree, but many won't really care.
CubicleHermit@reddit
The comparable one is the workstation RTX 6000 Ada, with 48GB today; they run about $7500.
I'd expect the RTX 6000 Blackwell to be the same 2:1 memory, so 64GB. Probably not available for a while except preinstalled in a workstation machine, but they'll hit the retail market eventually.
Bougie_Mane@reddit
"For inference a bunch of other stuff is coming online and that might just move away from GPUs completely quite soon (at scale)."
I'm interested in reading up about this. Can you point me in the right direction? Is it ASICs like crypto mining?
barefoot_twig@reddit
Training isn't the only thing they do
TheProgrammer-231@reddit
I think nVidia doesn’t allow consumer cards in data centers though.
martinerous@reddit
Right, Nvidia could easily find other (legal, etc.) ways to deny AI data centers from using consumer hardware.
satireplusplus@reddit
Tell me how that's going to be enforceable. Spyware in the GPU? lol...
CautiousAd4407@reddit
They simply won't sell to you, forcing you to give up whatever bulk discounts exist at those scales.
emprahsFury@reddit
how about mandatory 3 slot cooling
satireplusplus@reddit
nobody that trains models uses windows
martinerous@reddit
Not in "underground" (or China) data centers, of course. The "normal" data centers won't risk going illegal.
satireplusplus@reddit
They can put anything they like into their EULAs; that doesn't make it enforceable and doesn't make it legally sound. At least in the EU, if you buy a GPU you can do whatever the hell you want with it.
emprahsFury@reddit
It does become enforceable when they find out what's happening. "We used 10k Nvidia GPUs to train the monster." Ken, later that day: "I don't remember selling 10k H-series to Meta."
AmericanNewt8@reddit
Funny because that's how all the Chinese players are basically working now lol.
C_Madison@reddit
As satireplusplus wrote: Not in the EU. Zuck can train their things as long as they want in the EU with nVidia consumer cards, no matter what nVidia says.
emprahsFury@reddit
As Dr. Evil in America said, "You're just not getting it, are you?"
No_Afternoon_4260@reddit
Yeap
barefoot_twig@reddit
Nvidia has all sorts of restrictions, but where there's billions of dollars a year in savings on the line, there are also clever people to solve that problem
Physical-King-5432@reddit
Nvidia cards are not that useful for crypto anymore. The best miners use ASICS these days
Background_Army8618@reddit
Correct, mostly - ASICs replaced GPUs for Bitcoin over 10 years ago. The "recent" crypto mining shortage ended 2-3 years ago because Ethereum switched to staking; it was not replaced by ASICs.
The comment you responded to is outdated at best, but incorrect regardless. Crypto mining is not a factor for Nvidia's market or consumer base in any way at this point.
zakkord@reddit
Ethereum only provided about $2.3 million in daily issuance for miners. The altcoin market is several times bigger today, but most of it gets eaten up by ASICs (not Bitcoin ASICs - altcoin ASICs), and the new ones are coming out priced in the $10,000 range. His statement is 100% correct.
TheThoccnessMonster@reddit
No one is using GPUs to mine coins really though was his point and that’s correct.
barefoot_twig@reddit
I'm no expert, but I've heard that altcoins, electricity costs, and the strength of the market all play a role in whether GPU mining rigs are profitable at any given time. And some people just view them as "cheap heaters" in the winter.
kurtcop101@reddit
There's no serious money there; anyone with genuine money isn't running GPUs for pennies an hour gambling on crap coins. And as a heater, any GPU big enough that Nvidia cares about your market segment costs you more than just running the furnace.
Generally you're looking at gamers who already have a GPU for gaming, are mostly uneducated about the crypto market, usually 18-25, bought the GPU with early savings because they live or lived with their parents, and are trying to make a few extra bucks - especially if they don't pay the power bill and don't get to control their thermostat.
Or some variation of that.
The moment you pay your own power bill.. different story, even in my area where it's cheap.
great_waldini@reddit
I don’t think it’s so contrived. HBM is a component in very high demand, and the world's production capacity for it is finite. Hence they sell it sparingly and charge accordingly.
bplturner@reddit
Gaming - Professional - AI
Ummite69@reddit
I completely agree. That’s why I keep wondering: why doesn’t AMD go all in with something like 64GB or 128GB of GDDR6? If I’m gaming, I’d choose the 5090, but for AI, AMD would be my go-to. Sure, some argue that AMD’s drivers aren’t as polished, but if they offer powerful, AI-capable cards to the general public, open-source innovations would likely emerge quickly.
pc_g33k@reddit
I hope game developers push higher-resolution textures and other things that require larger VRAM in their next-generation games, so that NVIDIA has no choice but to add more VRAM to its gaming graphics cards. 😂
larrytheevilbunnie@reddit
I'm just happy the 5090 has more vram than the 4090 (I know this is an abusive situation)
trailsman@reddit
Besides that, I think they are handicapping the cards, given no real competition, so they're not future-proof for more than one generation - guaranteeing themselves upgrade cycles and recurring revenue.
aprx4@reddit
They don't do that because they know 48 or 64GB with that memory bandwidth would cannibalize a lot of their own professional cards.
Desm0nt@reddit
They can reduce memory bandwidth on a high-VRAM card =) And limit the PCIe lanes. They already do this with the 4060 Ti, for example. Nothing stops them from making two card series - a fast one (for gamers) and a big one (for home ML) - and grabbing money from both separately.
katiecharm@reddit
So the problem is monopoly. Got it
hackeristi@reddit
Yeah, why do you think Nvidia employees are all millionaires... giving us 48/64GB cards is not going to make them richer. Come on, do not be selfish. Sorry Jensen. He did not mean that.
Specific-Local6073@reddit
Multiple cards draw multiples of the power.
infiniteContrast@reddit
For local use power is not a problem.
Enough-Meringue4745@reddit
Hmm? What do you mean? North America only has 1200w outlets 😂
RyiahTelenna@reddit
North America has 240V outlets. They're not that expensive to run.
Enough-Meringue4745@reddit
Not standard they don’t. Most households only have one or two outlets that run at 240v.
RyiahTelenna@reddit
Correct. They're not standard but they're very affordable to have run.
Enough-Meringue4745@reddit
I did one for my EV but in a condo it’s much more difficult lol
thedudear@reddit
1800w.
Enough-Meringue4745@reddit
Max 12 amps from a single standard 120v15a outlet.
thedudear@reddit
Which would be 1440W. And if the outlet is 120V/15A, then it's 1800W.
Nothing you've said agrees with any other part.
CubicleHermit@reddit
You should only continuously load a circuit to 80%. Nominal 120V can be as low as 110V, so 1300W is a pretty typical nominal limit for a regular domestic outlet.
I ran a dedicated 20A circuit for my home server. 1500W is the largest PSU I've seen from mainstream brands, but there are 1800W units from minor ones.
Enough-Meringue4745@reddit
Appliances should never draw a constant 1800w. Look it up.
PikaPikaDude@reddit
You'll then have to plug your 6090 in on a different circuit from the rest of your pc and peripherals.
7090 will come with a custom made circuit combiner from NVidia that burns your house down.
No_Afternoon_4260@reddit
At some point they'll ship with a small portable nuclear reactor
Specific-Local6073@reddit
Of course it is a problem. Electricity bill is the problem.
satireplusplus@reddit
Of course it's a problem. More watts = more heat = a higher electricity bill.
TurpentineEnjoyer@reddit
The 3090 is a 350-watt card you can undervolt to 300W and retain 95% of performance; the 5090 is rumoured to be at least a 550W card, with no information available yet on undervolt performance.
It all depends on whether the speed/available VRAM is worth the watts.
satireplusplus@reddit
You can also make it use 240W at 90% perf. It ships consuming 350 watts so that it looks good in gaming benchmarks.
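For anyone who wants to try, a sketch of setting the power cap programmatically via NVML (assumes the nvidia-ml-py package and root; it's the same knob as `nvidia-smi -pl`, and the 300W value is illustrative):

```python
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
pynvml.nvmlDeviceSetPowerManagementLimit(handle, 300_000)  # milliwatts
print(pynvml.nvmlDeviceGetPowerManagementLimit(handle))    # confirm the new cap
pynvml.nvmlShutdown()
```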
Distinct_Ad3556@reddit
575w actually 😂
unlikely_ending@reddit
This
salec65@reddit
My guess is they are already looking at potential 5080 Ti and 5090 Ti SKUs at 24GB and 48GB respectively for next year, but at a higher cost and only if the market pushes them for it. Otherwise we'll likely see workstation GPUs with 48/64GB come out in their place, at 5x the cost.
devnullopinions@reddit
They won’t because if you want inferencing they want you to buy their way more expensive professional cards.
2str8_njag@reddit
Guys, they will not give us more than 32 gigs of VRAM for a couple gens in GeForce. It's a gaming card and you all should understand this. There's absolutely no reason for Nvidia to do this, like at all.
ItsAMeUsernamio@reddit
They're not giving enough VRAM to gamers either; all the mainstream cards are stuck at 8 and 12GB.
entmike@reddit
8-12GB? What do you mean? The 5080 has 16GB, and isn't that mainstream? That's not to mention the 5090 having 32GB and the 3090/4090 having 24GB.
noiserr@reddit
5080 will be over $1000. That's not really mainstream for a PC component.
ItsAMeUsernamio@reddit
The 80 cards were always considered enthusiast level, above which would be the Titan (renamed as 90 since the 30 series). The most popular cards are by far 50 and 60s according to Steam hardware surveys.
https://store.steampowered.com/hwsurvey/Steam-Hardware-Software-Survey-Welcome-to-Steam
n8mo@reddit
At the price the 5080 is projected to land at? No, that will absolutely not be mainstream. Hell, I haven’t seen the steam hardware survey for this year, but I’d bet most people are still running 30 series and below.
You can build an entire PC (AMD GPU included) for the price of a single high end NVIDIA gaming card.
old_leech@reddit
This is where I disagree.
There IS a reason -- and that reason is skill adoption.
As a regular Joe, I can justify an expensive card that serves multiple interests. If I were in school and wanting to learn to stand up architecture, I could justify the investment for a card that allows me to both learn and game. Having access to the tools ensures adoption and normalization of those tools.
But it becomes a problem when the tools are paywalled to such a degree that it bars entrance.
It's why Adobe turned a blind eye to piracy and Microsoft never really cared if Bobby installed a bootleg Windows... they knew that availability kept their tools at the foremost of the people that were interested.
Things get more difficult when top (consumer) dollar is going to leave you feeling restricted out of the gate -- and this is where deep tinkerers will justify their time over their wallets. Necessity is the mother of invention, and people need to eat.
We are one major announcement - ROCm, oneAPI, or something entirely new - away from a paradigm shift. I can't imagine a life where I wasn't bound to CUDA in one way or another... but man, I really, really want to.
Gearsper29@reddit
That's not possible. For the last few years we've been stuck with 2GB memory chips, so right now the only choice on a 512-bit bus is 32GB of VRAM. 64GB is technically possible if they sandwich the chips in clamshell mode, but that's the differentiating factor of the prosumer GPUs, so sadly they're not gonna do it for gaming GPUs.
BuildAQuad@reddit
I'm just hoping Intel ends up doing this; getting loads of hobbyists and open-source communities using their cards would help drive adoption, compatibility, etc.
Gearsper29@reddit
Sadly, the biggest Intel GPU will have a 256-bit memory bus, so it will be limited to 16GB, or 32GB clamshell, at best. The only hope for more is the 3GB memory chips coming out later this year. So two years from now we could have a 50% increase in VRAM in every GPU category.
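The constraint is mechanical: each GDDR chip hangs off a 32-bit channel, so bus width times chip capacity caps the VRAM. A sketch ("clamshell" meaning two chips sandwiched per channel):

```python
def max_vram_gb(bus_bits, chip_gb, clamshell=False):
    chips = bus_bits // 32                 # one chip per 32-bit channel
    return chips * chip_gb * (2 if clamshell else 1)

print(max_vram_gb(512, 2))        # 5090-style bus, 2GB chips: 32GB
print(max_vram_gb(512, 2, True))  # same bus, clamshell: 64GB
print(max_vram_gb(256, 3))        # 256-bit bus, 3GB chips: 24GB (+50% over 16GB)
```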
sassydodo@reddit
Gaming cards don't really need even 32GB of VRAM, so "consumer" needs is a bit of a stretch here.
Thrumpwart@reddit
The first GPU manufacturer to put 48GB Vram in a consumer GPU is going to make tens of billions.
moldyjellybean@reddit
Yeah man, it’s like Apple selling fools 8GB of soldered RAM and 256GB of storage for like 9 years, and charging $200 for 8GB extra or $1000 for 2TB of storage. I upgraded a ThinkPad to 64GB of RAM for $90 and 2TB for $100, then sold off the original 32GB and 512GB - so in all, I upgraded to 64GB of RAM and 2TB for less than $100.
Educational_Gap5867@reddit
Personally as a gamer and an LLM enthusiast I’d prefer if they make 2 separate cards. One for DL and one for gaming. This will keep the pricing down for gaming cards as well as allow for DL cards to come with a lot more memory although maybe older and far fewer shader cores.
sleepy_roger@reddit
🤣 Well sir this is your lucky day!
Ancalagon_TheWhite@reddit
They do. It's called the A100 pcie and RTX A6000
Educational_Gap5867@reddit
But those are even more expensive lol
Ancalagon_TheWhite@reddit
Exactly. The gaming series are already the budget GPUs.
ThenExtension9196@reddit
On the other hand, you can buy an RTX 6000 Ada workstation card for $8k. It has 48GB of memory, if you want it that badly.
Majinvegito123@reddit
Does anyone know what the performance gains are expected to be
haikusbot@reddit
Does anyone know
What the performance gains are
Expected to be
- Majinvegito123
I detect haikus. And sometimes, successfully. Learn more about me.
Opt out of replies: "haikusbot opt out" | Delete my comment: "haikusbot delete"
mrSidX@reddit
Prithee, good sirs and gentle dames,
Doth any soul here knoweth well
What measure of swiftness or of might
We may expect to gain anon?
Hear me, ye gathered minds of wit and wonder!
Doth any ‘mong thee ken or whisper know,
What boon of swiftness, what gain of might,
May spring from yonder toil and tinkering?
For I am plagued with doubt most foul —
Shall our labors yield but smoke and sighs,
Or shall they blaze with glory unforeseen?
Speak, I entreat thee! What gains of performance
Doth destiny herself decree we reap?
-Majinvegito123
AsliReddington@reddit
Hardware FP4/FP6 support is expected to enable lower precision while retaining higher accuracy across models, letting you run large models like 70B (with some offloading) or MoE models like Mixtral at very high speeds without INT4's quality loss.
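The VRAM math behind that claim, approximately (weights only, ignoring KV cache and activations):

```python
params = 70e9  # a 70B model
for bits in (16, 8, 6, 4):
    print(f"{bits}-bit: {params * bits / 8 / 1e9:.0f} GB")
# 16-bit: 140 GB ... 4-bit: 35 GB - just over a 5090's 32GB,
# hence "with some offloading".
```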
emprahsFury@reddit
Nvidia's CES keynote is tonight, so I guess we'll find out.
Caffdy@reddit
About three fiddy
TurpentineEnjoyer@reddit
And is also guaranteed to cost over $2000, and likely $2500
We're rapidly approaching the point where business GPUs are looking like a bargain compared to the "prosumer" rip off.
TechNerd10191@reddit (OP)
Correct, but even $2500 could be almost half the price of RTX 5000/6000 Ada, not to mention the L40 ones.
Of course, if you need >4 GPUs, the business GPUs are a no-brainer.
TurpentineEnjoyer@reddit
I just don't have any hype for the 5090 at all given what we know about it so far.
Based on UK pricing it's likely to be over £2000 meanwhile I can get used 3090s for £650, so I could get 3x 3090s for the same price. 72GB vs 32GB. Eh.
It might be potentially twice as fast as a 3090 for processing, but I feel like once you're into the realm of *NEEDING* more than 10 t/s inference on 72GB then we're outside the realms of casual user and quite possibly into the realms of business users anyway, who'll be thinking about long term pricing of wattage vs upfront cost and picking the business hardware.
The 5090 is too expensive for hobbyists, and too inefficient for professionals.
It might have been an attractive option if it came with 48GB of VRAM but as it stands, it's just not really a good card in the LLM space, and I think that's exactly as intended by nvidia.
Nrgte@reddit
It's a gaming card. Gamers don't need to run huge AI models on their PC. If you want to do serious AI stuff get a A6000 or a A100.
katiecharm@reddit
As someone who owns a 4090 and would love a reason to be hyped about the 5090…. I’m just not.
By the time the PS6 generation of consoles comes out, there will likely be a 6090 card that has 64GB of VRAM and actually feels like a real step up.
Desm0nt@reddit
Probably with a unified memory solution like Apple/AMD, rather than a dedicated card faster and bigger than a PC card.
Caffdy@reddit
That's just 2 years from now, no way in hell they jump from 32 to 64 in just one gen
katiecharm@reddit
That’s how technology usually goes, and has gone for decades. Nvidia has been purposefully gimping this cycle but I don’t think they can do so much longer.
When one generation has 256GB of hard drive space, the next generation doesn’t come out with 290.
Usually it gets doubled.
Caffdy@reddit
I'm sure the next RTX 6090 is gonna be released with 32GB again. By 2030, maybe we will get 48GB with the RTX 70 series
noiserr@reddit
Bold of you to assume Nvidia will give us 64GB of VRAM in 2026. They've been stuck on 8GB for lower end models since 2017.
mxforest@reddit
Inference-time compute is the future. Running a smaller model at higher t/s matters more in agentic workflows than a bigger model at lower t/s. 3x 3090 will not only be slower because of a larger, split-up model; they will also be slower because of communication overhead. On top of that, the 3090s are basically out of warranty with the potential to become deadweight, whereas a 5090 is a safe "investment" because it can be RMA'd.
TurpentineEnjoyer@reddit
Second-hand stores, at least here in the UK, are still obligated to give you a 1-year warranty - one store, CEX, offers a 5-year warranty that covers GPUs at no additional charge. I realise that will vary from location to location.
As far as inference speed being the future - in what context? What's the minimum acceptable tokens per second? Because I can hit 50 tps on an 8b model on a single 3090 - is 50 tps too slow?
Like I mentioned before it feels like once we're talking about the cost per minute of usage, we're outside the realms of consumer hardware.
mxforest@reddit
8B models are on the border of what is usable. They are not good, just OK. A true agent will have multimodality with vision and speech input. That will push the borderline to higher parameter counts.
Desm0nt@reddit
Well, 3x 3090 can run 3 small agents independently, which sounds better than 1 card with shared resources. And not everyone needs multiple agents for... whatever you use them for. Many people still use them as chatbot/RP/coding assistants that work as a single agent, where 1 clever big model is better than multiple stupid fast ones.
PawelSalsa@reddit
I don't get it. Why do you need 100 t/s instead of 10? Just for the sake of speed, without any significant benefit in terms of usefulness?
ortegaalfredo@reddit
> 3x3090 will not only be slow because of a larger split up model, they will also be slow because of communication overhead and a low 900 GBps bandwidth
Incorrect. With tensor parallelism you effectively get 3x the bandwidth, and it will indeed run faster than a 5090. In modern TP implementations like vLLM, speed scales almost linearly with GPU count.
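A minimal sketch of what that looks like in vLLM (model name illustrative; note vLLM needs the attention head count to divide evenly by the tensor-parallel size, so 2- and 4-way splits are the safe defaults):

```python
from vllm import LLM, SamplingParams

# Each GPU holds a slice of every layer, so the per-token weight traffic
# is spread across all the memory buses at once.
llm = LLM(model="meta-llama/Llama-3.1-70B-Instruct",
          tensor_parallel_size=2)

outputs = llm.generate(["Explain tensor parallelism in one sentence."],
                       SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)
```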
killver@reddit
Sharding sucks though for a lot of use cases and will always slow you down. This is why so many want and need larger memory on single cards, particularly for any training.
jd_3d@reddit
$1999
hyouko@reddit
...Are we? Like, if we're just going on the basis of cost per GB of VRAM: [comparison table omitted]
Obviously you get other benefits/capabilities from the H100 and friends (not so much the RTX Ada series, from what I've read...). But the consumer-focused GPUs are clearly the "bargains" here, insofar as there is any bargain to be had.
TurpentineEnjoyer@reddit
There's more to go off than just VRAM per dollar.
Supposedly they're built to a higher quality, designed to withstand constant heavy loads - although I don't have any ten-grand GPUs down the back of my sofa to put that to the test personally.
There's also dakka-per-watt, where pro cards draw less power for equivalent or superior performance, which adds up to long-term savings over the lifetime of the card, especially where electricity is expensive.
Then there are space constraints - how many PCIe lanes you have, and literal space too, with a single 2-slot card holding the same VRAM as two 3-slot cards.
It's also just a constantly climbing price because Nvidia has no real competition right now. There's no reason a $2500 card couldn't come with 64GB, other than that it would cannibalize their server card market. For a 64GB card at $2500 I'd be tempted, but as it stands I can get 3x 3090s for less money and have 72GB.
hyouko@reddit
Definitely there's more than just VRAM; if you're doing anything that requires interconnects then the pro-grade hardware is your only option. But your initial premise was that business GPUs are looking like a bargain, and then the example of a bargain you actually give is a trio of consumer cards from 2 generations ago. I think that's the better way to frame it: for many use cases, the value proposition isn't really improving from generation to generation in the way that it might if nVidia had actual competition.
(I don't think we will see a 64GB card, but it's just possible we could see a 48GB 5090 Ti at some point, with those 3GB GDDR7 chips that are supposedly used in the laptop 5090?)
TurpentineEnjoyer@reddit
My reason for believing business cards are starting to look like a bargain comes from:
https://www.reddit.com/r/LocalLLaMA/comments/1hv1efu/comment/m5pk8j3/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button
Admittedly a different comment than the one you're replying to, but it justifies why I point out that 3x 3090 is better than 1x 5090.
tl;dr: the 3090 is plenty fast for hobbyist users. If you need more than 10 t/s on a model that runs in 70GB of RAM, then you're probably a business user anyway.
The 5090 can't justify its price for hobbyists, but pro users will pass it up for pro cards too.
emprahsFury@reddit
What's honestly crazy is that AMD and Intel also sell their GPUs at that cost, and they're literally worse in every way.
Having said that, you can get an eBay "new" W7900 (a 48GB 7900 XTX) for $2500, which does alter the math.
Natural_Home_769@reddit
I still remember building an entire system with a GTX 1070-ish for under $1000.
Now the GPU itself costs over $3000, at least in my Curryland. I can't afford it even as a 9-5 IT worker. $3000 is like $12,000 PPP-adjusted.
An RTX 3090 still costs around $2800 here.
katiecharm@reddit
Don’t feel bad eastern bro. Us over here in Burgerland can’t afford it either.
Natural_Home_769@reddit
Thanks mate for the generous word I feel really bad :(
Affectionate-Cap-600@reddit
same here... pizzaland
Vast-Breakfast-1201@reddit
My 1080ti system was about 1500 all in. Half that was the GPU.
Inflation at 7 years would put that around 1900 bucks. Instead the GPU costs 2000-2500 or whatever and it isn't even Ti or super, so it's a level down on a per-generation basis. Plus whatever else to build the thing. And not counting any tariffs that will get laid down.
Tbh we need an economy crash pretty quick here. We have gone too long without, so nobody has any idea what the actual price of anything is.
ThenExtension9196@reddit
I’ll pay no problem for a 5090. Nothing else like it. It’s the workstation cards that hurt.
AppearanceHeavy6724@reddit
3x 5060 could be a bit more interesting of a proposition.
BuildAQuad@reddit
Probably not costing the same as used 3090s tbh
AppearanceHeavy6724@reddit
Why? 2x 3090 is $1500; 3x 5060 is probably $1200 or so.
Xyzzymoon@reddit
2x 4060 Ti 16GB already cost more than $1200, so why would 3x 5060 in any variant be cheaper, if it comes with anywhere near 16GB of VRAM?
AppearanceHeavy6724@reddit
Because otherwise no one would buy an x060 card for more than $400. Pay attention: not the Ti - the vanilla 4060 is $300. So I expect the 16GB 5060 to be $400. The Ti is more expensive not because of the memory, but due to the faster chip.
yamosin@reddit
Not great news for LLMs, because for LLMs VRAM size is far more important than bandwidth.
But it's good news for Stable Diffusion users - the it/s improvement should be dramatic.
TastesLikeOwlbear@reddit
I will not be at all surprised if the 5090's 32GiB is partitioned in some way designed to limit nontrivial AI usage.
I.e., they'll say that it's got 24GiB for graphics to match the 3090/4090 and then an extra "AI dedicated" 8GiB "to support having AI models in games without impacting graphical fidelity!"
And then they'll do something dastardly in the hardware to hamstring using the full 32GiB for AI so it doesn't cut into their lucrative workstation and datacenter business.
I look forward to being wrong about this.
segmond@reddit
From my calculation it's going to be about 2.5x-3x faster than the 3090 for inference. So the question is: would you prefer 3-4 3090s (72-96GB of VRAM) or 32GB? For huge models, more 3090s will beat a partial fit in the 5090's 32GB. Models have gotten insanely large: for Llama you need at least 70B, Mistral 100B, DeepSeek 600B+, Grok 300B+, etc.
Hialgo@reddit
I don't caaaaaaare I just want the VRAM
CystralSkye@reddit
The 5090 isn't made for AI it's made for gamers.
synn89@reddit
At this point I couldn't care less about raw speed. I really just want something at 3090 level with way more VRAM, moderate power draw, and a reasonable price.
CystralSkye@reddit
You do understand that you are asking a gaming card to be something it isn't?
There is huge demand for a card like that, and it exists; you aren't going to find a way to get it cheaper. The cheaper version is the 3090.
You need to wait a couple of years until demand runs low.
SandboChang@reddit
I really think this is a gap that AMD or Intel can try to fill up. Not sure how big a market local LLM really has at the moment though, might just be too small to bother.
barefoot_twig@reddit
AMD and Intel would be much more interested in eating into the enterprise AI chip market. If they can, then the prices could become somewhat reasonable for the LocalLLM crowd, but don't hold your breath.
Hasuto@reddit
Both Intel and AMD are making enterprise AI chips similar to Nvidias offerings.
They are slightly cheaper (but still not purchasable by normal people) and usually slightly worse than the Nvidia offerings.
I find it doubtful that either of them will be more interested in offering consumer grade chips that outclass their enterprise offerings than Nvidia is.
emprahsFury@reddit
that's an amd w7900, which is a 7900xtx w/ 48gb
barefoot_twig@reddit
If that were an option, everyone would want it as well. What you're describing is an A100 80GB, and they're still >$10k used for a reason.
psychicsword@reddit
We need those cards to start to go end of life so they start to appear on the open market like the Tesla K80 cards
iamthewhatt@reddit
lol, as big as AI is right now, EOL cards will still be ~$10k. The 4090 has been out for 2 and a half years and its price has never dropped below MSRP anywhere, for this very reason.
Wrong-Historian@reddit
2x 3090 in tensor parallel will give you ~1.6x the inference speed of a single 3090 (i.e. ~"1.5TB/s" effective), cost ~$1400, and have 48GB.
3090s remain king.
mxforest@reddit
I really think Nvidia has a trick up their sleeve which nobody knows about yet. Per leaks, they have a 3-way split power delivery system, while the 4090 has 2. They have put in some other AI components that require separate power. If it's relevant for inference-time compute, then this changes the efficiency numbers.
Affectionate-Cap-600@reddit
Interesting... thank you, I missed that!
TurpentineEnjoyer@reddit
What are you using for tensor parallelism? I've heard of vLLM and one other I've forgotten.
bick_nyers@reddit
EXL2 quants run pretty fast split along 2+ cards. Available via Oobabooga or TabbyAPI.
TheGuardianInTheBall@reddit
I love how every piece of software related to AI sound so silly.
Oobabooga, TabbyAPI, HuggingFace, ComfyUi, Ollama.
Peak Camelot.
iamthegemfinder@reddit
I feel like the prevalence of these sillier, more abstract names must partly come from the fact that naming your ML project/company any more seriously, and still standing out, is a challenge - there are only so many permutations of "chat", "web", "ui", "agent", and the like; the pool of relevant terminology is just lacking 😭
Or there's the secret third option: naming your thing some Sleek Compound Word like, idk, Fireflow or Readychain. Only do this if you are interested in shipping an overstimulating landing page overdosed with parallax scroll effects and fading into obscurity immediately.
I may be reading too much into this.. :P
ArsNeph@reddit
Speaking of silly, you forgot SillyTavern XD
TheGuardianInTheBall@reddit
I wasn't aware of it, but looks like fun, might spin it up.
TurpentineEnjoyer@reddit
It kind of advertises itself as a roleplay-focused web UI, but honestly the features it provides are pretty powerful for a general-purpose web UI too.
You can add context blocks that can be turned on and off, have multiple characters with different personalities interact in a single chat, write your own plugins that utilize function calling, etc.
TheGuardianInTheBall@reddit
Funnily enough, I've been thinking of doing something like this myself for a little tabletop RPG I'm developing.
TurpentineEnjoyer@reddit
ooba is what I'm using right now, but as far as I understand it isn't true tensor parallelism.
I'm loath to leave ooba since I love how easy it is to switch between models and get set up, but a 60% speed boost is tempting.
bick_nyers@reddit
In the EXL2 loaders there is an enable_tp option.
TurpentineEnjoyer@reddit
Ah interesting, I'll give it a try.
I've been using GGUF for so long that I forgot ooba can actually do other things too
Wrong-Historian@reddit
mlc-llm!
TurpentineEnjoyer@reddit
That's the one! Thanks
koalfied-coder@reddit
Historian here has the facts!
Healthy-Nebula-3603@reddit
Splitting an LLM onto 2 cards does not increase the model's speed... it runs at the same speed as on 1 card, just with extra VRAM.
Pretend_Regret8237@reddit
Stop with the unfounded rumors; we have a few hours left until the reveal. This is clickbait.
teh_mICON@reddit
I'm fucking over the "rumors" for months now
custodiam99@reddit
I feel like GPUs are becoming irrelevant in 2025. We will need a powerful CPU and 512GB of DDR5 memory instead.
__some__guy@reddit
No good mainboards for Epyc 9005, and it starts looking like 12-channel may still be too slow.
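The 12-channel arithmetic, assuming DDR5-6000 (Epyc 9005 officially supports up to DDR5-6400):

```python
channels, bytes_per_beat, mt_s = 12, 8, 6000e6  # 64-bit channels
print(channels * bytes_per_beat * mt_s / 1e9)   # ~576 GB/s peak
# Roughly a third of the rumored 5090, and real throughput lands well below peak.
```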
Thomas-Lore@reddit
Might be, if there are more models like DeepSeek V3.
ForsookComparison@reddit
7900xt blowers go brrr for $699
iamthewhatt@reddit
man if AMD actually had a CUDA competitor, they would be rolling in the dough... Out of all their billions in profits, you would think they would put together a team to tackle that.
virtualmnemonic@reddit
The problem is that today's software - with billions poured into it - is written for CUDA, so even an equivalent CUDA competitor isn't enough. They've tried to address this with solutions like ZLUDA, but there's a performance penalty and/or bugs. NVIDIA has been investing in CUDA for years and it's really paid off.
ForsookComparison@reddit
They don't even need a competitor. They just need Lisa Su on stage in front of a roadmap for ROCm, or some adoption story that would make them viable for training.
But they're not doing that, and ROCm is growing at a relative snail's pace. I don't get it. Is their strategy really to be the inference company?
iamthewhatt@reddit
I am not that learned in the alternative technology spaces, but the reason why I say "competitor to CUDA" is mostly because of the popular software that uses CUDA, which could also be using ROCm equally were AMD to support it. Or some other technology that leverages existing infrastructure to mimic CUDA (like ZLUDA but less legally ambiguous). Here's to hoping they announce something today 🤞
estebansaa@reddit
Not interested unless I can run DeepSeek at a decent level.
creo_one@reddit
Behold, even more useless data being sent to GPU for no reason
Echo9Zulu-@reddit
I mean what consumer hardware could possibly support utilizing that memory bandwidth?
un_passant@reddit
Can p2p be unlocked with a custom driver like for the 4090 ?
If NVIDIA fixed that hole, I'll pass.
CockBrother@reddit
If there's an A6000 Blackwell with 64GB of RAM, then we're really starting something interesting. It'd probably cost about $10k though.
The 24-to-32GB jump in consumer space is "nice" but doesn't change anything dramatically. You can probably get two consumer cards in a regular PC without fuss. Anything more than that and you're looking at replacing the ridiculously thick heat sinks (dangerous) or PCIe extenders and related goofiness.
I don't need a five slot thick GPU that needs an additional slot of clearance for airflow to make me feel good about how heavy and bulky it is.
pointer_to_null@reddit
That's a reasonable guess. Judging by the predecessors' naming, I predict it'll officially be called "RTX 6000 Blackwell Generation". The "RTX A6000" vs "RTX 6000 Ada" brand confusion was comical, and felt intentional though.
But I agree with the other predictions, with one caveat: Nvidia will list an MSRP below $8000 on their website, but those will go OoS immediately while the remainder sell at ~$10k from partners like PNY.
Biggest_Cans@reddit
PHENOMENAL COSMIC POWER
itty bitty living space
garbo77@reddit
For the RTX 3090 budget I would probably go for the RTX A2000 Ada 20GB at 70W: am I wrong?
TechNerd10191@reddit (OP)
The A2000 has 12GB or 16GB. An A4500 (20GB on a 320-bit memory bus) would be a better alternative.
garbo77@reddit
My mistake, I meant A4000
TechNerd10191@reddit (OP)
A4500 > A4000 (check the Nvidia datasheets for each for proof)
joninco@reddit
That'd be a bad decision. The 3090 has 2x the memory bandwidth and 3x the tensor and CUDA cores.
Terminator857@reddit
I like how the box says "low noise". The 3090 gets loud under load.
treksis@reddit
time to upgrade
Calcidiol@reddit
One thing I don't fully understand is the technical and software differences/limitations between flagship consumer gaming GPUs like the 4090 and 5090 and their pro GPU counterparts.
Specifically with respect to memory: does this class of GPUs (3090, 4090, and presumably 5090) have the same level of memory integrity/reliability, in terms of ECC and EDAC, as the comparable pro GPUs?
I vaguely recall that some (all?) consumer GPUs may lack the same ECC architecture (maybe they literally don't have the extra bits). I've also seen comments about some kind of in-band EDAC/ECC scheme that functioned on some NV pro GPUs but might simply be disabled from effectively working on the consumer ones.
Again, I don't know the status quo or the history, which is why I'm asking; I've seen enough confusing things that I could believe either way whether there is some parity (pun intentional) of memory integrity between these classes of cards.
And as a PS for those who have been running 3090s and 4090s hard doing GPGPU experiments or whatever: what's your impression of the overall reliability of these things at all levels?
In the worst case one hears about melting power connectors (et al.) and missing-MLCC instability fiascos, and one has to wonder how much to trust these. Going over $1-2k with a mediocre 3-year warranty seems well out of "disposable toy" range to me and more like "long-term infrastructure" I'd want to trust on 10-year time scales - and there's a good lot of the board vendors I'd hesitate to buy anything from in general (customer service, SW support, ...).
shokuninstudio@reddit
Who can do price and specs estimates for the RTX 6000?