AMD preparing RDNA4 Radeon PRO series with 32GB memory on board

[-]

gfy_expert@reddit

Radeon Pro 7000 48GB owners, is the old model any good?


[-]

SmellsLikeAPig@reddit

Those are FP16; this one can do FP8, and it seems a lot faster for AI as well.


[-]

gfy_expert@reddit

Yeah, but it's about getting an idea before the new models hit shelves: how good ROCm is, whether it's possible to run on Win11 at decent speeds, etc.


[-]

b0tbuilder@reddit

Run on Linux. Why would you run on Windows?


[-]

SmellsLikeAPig@reddit

I wouldn't buy fp16 cards at this point


[-]

I'm just trying to run a digital waifu: GGUF files, image generation, TTS, and trying out talk-llama-fast. A 4060 Ti can do all of this, but not all of it at once. KoboldAI + SillyTavern for roleplay and Stability Matrix/ComfyUI for image generation with models from Civitai. For video generation 16 GB of VRAM is enough on FramePack, but I don't have 64-128 GB of DDR4/5.


[-]

CarefulGarage3902@reddit

But it can’t even do FP4? The RTX 5000 series can do FP4. Maybe they’re not even trying to sell this card to us AI enthusiasts and are just targeting gamers/video editing etc.


[-]

SmellsLikeAPig@reddit

I don't know how useful FP4 really is. Aren't models quantised to 4 bits too lobotomized?


[-]

CarefulGarage3902@reddit

I think the idea with having FP8 and FP4 support is that the GPU has to do fewer calculations to go from FP16 to 4-bit for a given layer. I’m really impressed by the dynamic quants like GPTQ that keep some layers at higher bit widths and put other layers at lower ones like 4, since those layers affect the performance/accuracy less. Instead of quantizing a whole model to 4-bit we may have some layers at 4-bit, others at 8, others at 16, and so on, and end up with really good performance for the amount of compute. I imagine FP4 support would mean better performance/less compute on the 4-bit layers, but I’m not too knowledgeable on the subject yet.
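A rough sketch of that mixed-precision idea, just to show the arithmetic. The layer groups, parameter counts and bit widths below are made up for illustration and do not describe any real model or quantization scheme.

```python
# Hypothetical layer groups: (name, parameter count, bits per weight).
# All numbers are illustrative assumptions, not measurements.
LAYERS = [
    ("embeddings", 500_000_000, 16),
    ("attention", 2_500_000_000, 8),
    ("mlp", 5_000_000_000, 4),
]


def average_bits_per_weight(layers):
    """Parameter-weighted mean bit width across all layer groups."""
    total_params = sum(count for _, count, _ in layers)
    total_bits = sum(count * bits for _, count, bits in layers)
    return total_bits / total_params


def size_gib(layers):
    """Weight storage in GiB, ignoring scales, zero points and other overhead."""
    total_bits = sum(count * bits for _, count, bits in layers)
    return total_bits / 8 / 2**30


# Same parameter counts, but everything forced to 4 bits for comparison.
uniform_4bit = [(name, count, 4) for name, count, _ in LAYERS]

print(f"mixed:   {average_bits_per_weight(LAYERS):.2f} bits/weight, {size_gib(LAYERS):.2f} GiB")
print(f"uniform: {average_bits_per_weight(uniform_4bit):.2f} bits/weight, {size_gib(uniform_4bit):.2f} GiB")
```

The mixed layout costs more memory than uniform 4-bit, which is the trade being described: spend extra bits only on the layers assumed to matter most.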


[-]

ResponsibleTruck4717@reddit

Can someone explain to me why Intel/AMD aren't making some mid/high-range card with an absurd amount of VRAM, like 128GB, just to flood the market?


[-]

EugenePopcorn@reddit

Because these firms are all run by business goons obsessed with market segmentation.


[-]

ResponsibleTruck4717@reddit

Correct me if I'm wrong, but currently Nvidia is the one controlling the market, right? Wouldn't it be better for AMD/Intel to get a foothold so more tools will work with their cards?


[-]

EugenePopcorn@reddit

That would be a way to deliver massive value for customers, but the business goons have their hearts set on delivering massive value to shareholders by selling data center GPUs instead.


[-]

crantob@reddit

And who created that surge in demand? The people in DC and their friends with the printed money that you and I do not have access to.


[-]

Bandit-level-200@reddit

32gb following nvidia as always


[-]

Medium_Chemist_4032@reddit

I swear AMD feels like NVidia's controlled opposition


[-]

grady_vuckovic@reddit

No need to compete when there's only two choices in the market and you can simply match your competitor rather than undercutting them on price aggressively.


[-]

crantob@reddit

But that IS competition. This isn't a charity. You're always compromising per-unit profit versus total profit in your pricing. And you're always trying to get the best selling price you can. Right now there's a flood of institutional, corporate and government money (which flows into institutional and corporate) buying away resources from we, the people. That's a real problem that takes some learning to understand.


[-]

emprahsFury@reddit

It's crazy how far behind AMD is. Nvidia is releasing 96 gb cards to the consumer (and the $/GB is the same as a 5090).


[-]

KontoOficjalneMR@reddit

> Nvidia is releasing 96 gb cards to the consumer (and the $/GB is the same as a 5090).

Huh? What card is that?


[-]

frankchn@reddit

RTX Pro 6000


[-]

KontoOficjalneMR@reddit

I was going to say that's bullshit but then I checked how much 5090 costs now.


[-]

frankchn@reddit

Yeah it is 33% more per GB based off MSRP pricing, but I am not sure how available the $2000 5090 FE is — realistically if you want a RTX 5090 today you are going to spend $3000+. Meanwhile, previous generations of RTX workstation cards are generally available at MSRP.
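For reference, a quick sketch of the $/GB comparison. The prices are only the figures quoted around this thread ($2000 FE MSRP and roughly $3000 street price for the 5090, $8000 to $8500 rumored for the RTX Pro 6000), so treat them as assumptions rather than confirmed pricing.

```python
def dollars_per_gb(price_usd, vram_gb):
    """Price divided by VRAM capacity."""
    return price_usd / vram_gb


def premium_pct(price_a, gb_a, price_b, gb_b):
    """How much more card A costs per GB than card B, in percent."""
    return (dollars_per_gb(price_a, gb_a) / dollars_per_gb(price_b, gb_b) - 1) * 100


# Assumed prices taken from the discussion, not from official price lists.
RTX_5090_MSRP, RTX_5090_STREET, RTX_5090_GB = 2000, 3000, 32
PRO_6000_LOW, PRO_6000_HIGH, PRO_6000_GB = 8000, 8500, 96

for label, pro_price, base_price in [
    ("$8000 vs 5090 MSRP", PRO_6000_LOW, RTX_5090_MSRP),
    ("$8500 vs 5090 MSRP", PRO_6000_HIGH, RTX_5090_MSRP),
    ("$8000 vs 5090 street", PRO_6000_LOW, RTX_5090_STREET),
]:
    pct = premium_pct(pro_price, PRO_6000_GB, base_price, RTX_5090_GB)
    print(f"{label}: {pct:+.1f}% per GB")
```

Under these assumptions the 33% figure corresponds to the $8000 price against the $2000 MSRP, and the comparison flips once the 5090 is priced at $3000.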


[-]

KontoOficjalneMR@reddit

I checked and it's available for 2k on the Best Buy USA website. I found several others around $2200 as well. So I think if you try you can get it for MSRP. And 8500 is still a speculated/leaked price AFAIK, not MSRP.


[-]

frankchn@reddit

I just checked the Best Buy website and there is a product listing for the Founder’s Edition at $2000, but it is “Sold Out” and, apart from occasional stock drops, has been that way since launch. If you search on Newegg for stock available to ship, it is all priced beyond $3000.


[-]

KontoOficjalneMR@reddit

Electronics prices are a general shitshow thanks to Trump's tariffs. Like I said, we'll see what the price of the RTX Pro 6000 will be once it's actually available to order.


[-]

frankchn@reddit

Yeah, no disagreement there.


[-]

Hunting-Succcubus@reddit

Full disagreement here, tariffs only apply at the US border. Other countries should have the original pricing.


[-]

avinash240@reddit

Got a link to this 2k Best Buy 5090? I'll buy it right now.


[-]

Hunting-Succcubus@reddit

Why aren't the CUDA cores multiplied by 3 too? Does VRAM cost that much? That's silly. I need 60000 CUDA cores for $6k. And 3x VRAM.


[-]

emprahsFury@reddit

The cheapest 5090 on Newegg is 2500. 3 of them is 7500. That means there is an extra 1000 premium for the VRAM on an RTX Pro 6000, which is an extra $10/GB. So sorry for the egregious lie. I'm sorry the price of a fast food meal is too big a lie for you to countenance.


[-]

KontoOficjalneMR@reddit

No. That's the price of 96 fast food meals. And 30% difference in price. So quite the bullshit. You were wrong - own it, instead of shifting goalposts.


[-]

thrownawaymane@reddit

Lol, what consumer? The 96GB card is 1000% enterprise.


[-]

emprahsFury@reddit

if you can buy it from consumer channels it's available to consumers. You can order it the same way you can order a 5090.


[-]

kb4000@reddit

I don't see any consumer facing listings anywhere in the US from an official retail partner.


[-]

Bandit-level-200@reddit

> Nvidia is releasing 96 gb cards to the consumer

Enterprise. And don't mistake it for goodwill: extra VRAM does not make it worth its 8k price tag. Memory modules don't cost 1k apiece like Nvidia seems to be trying to tell us.


[-]

emprahsFury@reddit

no one said it was based on goodwill.


[-]

frankchn@reddit

It is not worth it to us consumers, but that’s not their target market. It is for companies who won’t blink at spending $30k a computer for their ML engineer. After all, what’s $30k if you are already paying the engineer half a million a year, especially if they are more efficient.


[-]

My_Unbiased_Opinion@reddit

This thing is DOA at anything above 1500. At some point, people would rather just buy a 5090.


[-]

HugoCortell@reddit

If they make it 1000-1200, it'll be great. Otherwise, stacking old 3090tis will still be king.


[-]

Xyzzymoon@reddit

How?


[-]

custodiam99@reddit

Well the price is the most important factor.


[-]

FastDecode1@reddit

Not for enterprise users. "Pro" means it's a professional card for people who use it to make money, so even if it costs thousands (which it does), the card pays itself back in no time. The last Radeon Pro card with 32GB VRAM (W7800) had an MSRP of $2,500.


[-]

BusRevolutionary9893@reddit

He obviously meant for us.


[-]

FastDecode1@reddit

"Us" referring to whom exactly? The only obvious thing here is that this is an expensive card aimed at the professional market, not the home/hobbyist user. I'm sure there's plenty of enterprise/pro folks here who want to run models locally for the same reasons that home users do. Being able to better guarantee data privacy and security because you're not sending it over the internet (potentially to another country) to be processed on someone else's computer is very valuable in the professional space, not just for home users. The most important for the target audience of this card is availability and the quality of support, not the price.


[-]

HugoCortell@reddit

Us refers to we, comrade. The people demand bread and graphics cards.


[-]

CarefulGarage3902@reddit

There’s an nvidia verified gamer/creator program now for getting to buy an nvidia 5080/5090 on the nvidia marketplace at msrp. If they think I would pay $500 more for a card with the same specs and no CUDA then they some dumb dumbs. Maybe the exception here would be if someone was wanting to buy multiple for making a multi gpu rig, but even then I imagine CUDA with some 4090’s or 3090’s would be better. I suppose there’s the possibility that they’re going to surprise us with some CUDA like new software that justifies the msrp, but I doubt it. Given the lack of CUDA, what is the most yall would pay for this gpu? Comment below


[-]

Ninja_Weedle@reddit

800$


[-]

nostriluu@reddit

$801


[-]

bblankuser@reddit

So we're paying more for the same amount of vram?


[-]

Such_Advantage_6949@reddit

If this card has a higher MSRP than the 5090, it could be quite dead on arrival, especially if it has the same bandwidth and the same VRAM.


[-]

PorchettaM@reddit

Enterprise cares about all the certifications and support you don't get with consumer cards. Nvidia is still selling 32 and 24 GB Pro cards even though the 5090 exists.


[-]

b3081a@reddit

Probably $1000-$1200 at most.


[-]

BusRevolutionary9893@reddit

The RDNA 4-based card with 32GB is likely to be a successor or comparable to the W7800, given the similar memory capacity and professional focus. The W7800’s $2,499 price sets a baseline.


[-]

Such_Advantage_6949@reddit

Yeah, knowing them, that is what they will do. Then they can wonder why the card isn't selling.


[-]

resnet152@reddit

Well that and CUDA


[-]

custodiam99@reddit

We are talking about inference.


[-]

Rustybot@reddit

Sadly, it will be Market Price.


[-]

gfy_expert@reddit

As well as availability.


[-]

512bitinstruction@reddit

I would actually prefer if they added ROCm support to their uma iGPUs.


[-]

Healthy-Nebula-3603@reddit

Why only 32 GB !


[-]

b3081a@reddit

It's already the max that is possible for a 256bit GDDR6 bus. If they opted for GDDR7 then they could go 48GB and eventually 64GB.


[-]

Conscious_Cut_6144@reddit

Can you not double up on RAM like you do with DRAM, like 2/3 sticks per channel? No bandwidth increase, just additional RAM.


[-]

b3081a@reddit

Desktop/server DDR can do this because it has chip-select pins, so it can support multiple ranks per channel. GDDR doesn't have them, so all it can do is clamshell rather than add ranks. 32GB per 256-bit GDDR6 bus already uses the highest-capacity GDDR chip available, combined with clamshell, so there's no further chance of doubling the capacity.
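The capacity limit described above can be sketched as simple arithmetic. This assumes one memory chip per 32 bits of bus width, 2 GB as the largest GDDR6 chip (as claimed in this thread), and clamshell mode doubling the chip count; the 3 GB and 4 GB GDDR7 chip sizes are illustrative assumptions.

```python
def max_vram_gb(bus_width_bits, chip_gb, clamshell=True, chip_interface_bits=32):
    """Upper bound on VRAM for a given bus width and per-chip capacity.

    Assumes one chip per 32-bit slice of the bus. Clamshell mounts a second
    chip on the back of the board for each slice, which doubles capacity
    without adding bandwidth.
    """
    chips = bus_width_bits // chip_interface_bits
    if clamshell:
        chips *= 2
    return chips * chip_gb


print(max_vram_gb(256, 2, clamshell=False))  # 256-bit, 2 GB chips, single-sided
print(max_vram_gb(256, 2))                   # same bus with clamshell
print(max_vram_gb(256, 3))                   # hypothetical 3 GB chips
print(max_vram_gb(256, 4))                   # hypothetical 4 GB chips
print(max_vram_gb(384, 2))                   # a wider 384-bit bus with 2 GB chips
```

With these assumptions, the only ways past 32 GB on a 256-bit bus are denser chips or a wider bus, which matches the 48 GB and 64 GB figures mentioned for GDDR7.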


[-]

Conscious_Cut_6144@reddit

Someone figured it out... [https://www.reddit.com/r/LocalLLaMA/comments/1j6i1ma/comment/mgp30xg/](https://www.reddit.com/r/LocalLLaMA/comments/1j6i1ma/comment/mgp30xg/)


[-]

b3081a@reddit

That's obviously faked. It's over a month since then but we haven't seen any availability.


[-]

Healthy-Nebula-3603@reddit

I don't understand why producers do not make multilayer VRAM memory like HBM or FLASH.


[-]

Hunting-Succcubus@reddit

They do. It’s called HBM and it’s expensive.


[-]

Alphasite@reddit

Isn’t that literally HBM??? AMD actually helped invent it and shipped a few consumer cards with it. It’s just more expensive than regular GDDR VRAM.


[-]

KontoOficjalneMR@reddit

It starts with Mo and ends with ney


[-]

Healthy-Nebula-3603@reddit

Lol......ehhhhh I hope they finally start building multilayer VRAM as we finally have a reason for it now.


[-]

AmazinglyObliviouse@reddit

That's my favorite impressionist painter


[-]

relmny@reddit

Isn't the "upgraded" RTX 4090 48GB GDDR6? How come some people can make a 48GB card with GDDR6 and AMD can't?


[-]

eding42@reddit

You would need a fatter memory bus. This is the max possible under 256 bit assuming you’re not using 3 GB modules


[-]

relmny@reddit

Still, why do they limit themselves? It's AMD, not some random very small business with a handful of people who take some "old" 24GB GPUs and turn them into 48GB ones... Yet those very small businesses manage to do it and AMD doesn't. Some are even sold for about $3000.


[-]

Txt8aker@reddit

Blame the system. See, high demand = high cost. That means high cost for us and high cost for the manufacturer. Memory chips are used everywhere and the particular ones used on GPUs are a very special kind. It's also not about whether they can; they choose to do this for business reasons (gotta milk the consumers to make as much profit as they can).


[-]

Allseeing_Argos@reddit

It's because AMD execs all have Nvidia stocks. so if they release a product that is too good they will personally lose money. They're gimping themselves on purpose.


[-]

eding42@reddit

They limit themselves to the smaller memory bus for cost/yield reasons: memory controllers are more sensitive to defects and they don’t scale as well with smaller nodes. AMD 100% could make a 512-bit version of the 9070 XT die LOL, but that would cost a LOT of money per chip (in addition to the fixed cost of the tape-out, which is usually in the tens of millions of dollars). The 24 GB to 48 GB conversion is probably possible because whatever GPU that was has a bigger memory bus.


[-]

asssuber@reddit

AMD makes the 48GB W7800 with a $2500 MSRP. Partners used to be able to put more VRAM in GPUs in the past, but they are forbidden now by AMD and Nvidia, and I guess Intel too. The reason is to not cannibalize that professional market where they charge absurd premiums for the extra VRAM.


[-]

Nexter92@reddit

What will they run? ROCm? LOL. The only way to make them usable is to sell them for $380/400 MAX; that would be a good card for LLMs, not with ROCm but with Vulkan.


[-]

custodiam99@reddit

I have an RX 7900XTX and I'm running ROCm on Windows 11 with LM Studio. Its speed is 92% of Vulkan but with better DDR5 memory management. I have no complaints. What am I missing?


[-]

Nexter92@reddit

Linux ROCm here. Almost every image or video generation tool is compatible with CUDA but not with ROCm, or has problems with ROCm due to shitty code. For LLM text generation on Linux, Vulkan does not require anything, no LTS version of Ubuntu or whatsoever. ROCm requires an LTS version, which is a problem on Linux. Vulkan works without installing anything. Vulkan is faster than ROCm. Vulkan is not LTS-locked. Vulkan is supported on 99% of Linux distributions.


[-]

MikeLPU@reddit

I use fedora, no LTS shit.


[-]

Nexter92@reddit

Fedora is not in the official compatible list of distro, one update > goodbye your working distro :) [https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-distributions](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-distributions)


[-]

rusty_fans@reddit

Official support isn't necessarily better if the community keeps up with updates.


[-]

Nexter92@reddit

L-O-L. even if that was true, performance is still shit : [https://github.com/ollama/ollama/pull/5059#issuecomment-2816882002](https://github.com/ollama/ollama/pull/5059#issuecomment-2816882002) CUDA or Vulkan, other stuff is currently shit. I love my AMD GPU, but for AI... Amd really need to wake up.


[-]

InsideYork@reddit

I’ve blocked your useless posts


[-]

MikeLPU@reddit

I just added a rhel9 rocm repo and everything works fine. It's officially supported.


[-]

AppearanceHeavy6724@reddit

Vulkan has issues with flash attention.


[-]

giant3@reddit

From some posts on llama.cpp, flash attention is only available on GPUs with the **coopmat2** extension. It has nothing to do with Vulkan itself AFAIK. On other GPUs, if you enable flash attention, it swaps data to RAM and uses the CPU, which makes performance go down as there is constant swapping between RAM and VRAM.


[-]

AppearanceHeavy6724@reddit

Flash attention works fine on 3060 CUDA but not with Vulkan.


[-]

giant3@reddit

Can you check with `vulkaninfo | grep coop`?
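If you would rather script that check, here is a minimal sketch that scans `vulkaninfo` output for cooperative-matrix extensions. It assumes the extensions show up by name in the output (for example `VK_KHR_cooperative_matrix` or `VK_NV_cooperative_matrix2`); the sample text below is made up and the helper names are hypothetical.

```python
import re
import shutil
import subprocess

# Matches names such as VK_KHR_cooperative_matrix or VK_NV_cooperative_matrix2.
COOPMAT_PATTERN = re.compile(r"VK_[A-Z]+_cooperative_matrix\d*")


def find_coopmat_extensions(vulkaninfo_text):
    """Return the sorted, de-duplicated cooperative-matrix extension names."""
    return sorted(set(COOPMAT_PATTERN.findall(vulkaninfo_text)))


def query_local_gpu():
    """Run vulkaninfo if it is installed; return None when it is not on PATH."""
    if shutil.which("vulkaninfo") is None:
        return None
    result = subprocess.run(
        ["vulkaninfo"], capture_output=True, text=True, check=False
    )
    return find_coopmat_extensions(result.stdout)


# Made-up sample that only resembles vulkaninfo extension lines.
SAMPLE = """
VK_KHR_cooperative_matrix : extension revision 2
VK_NV_cooperative_matrix2 : extension revision 1
VK_KHR_swapchain          : extension revision 70
"""

print(find_coopmat_extensions(SAMPLE))
```

Calling `query_local_gpu()` on your own machine returns an empty list when no such extension is reported, which would be consistent with the slow fallback path described above.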


[-]

Nexter92@reddit

Lol, i use flash attention everyday, no issue at all (llamacpp, gemma 3 12/27B, Q4\_K\_M).


[-]

AppearanceHeavy6724@reddit

On Nvidia with Vulkan prompt processing massively slows down (compared to CUDA), esp at Q8 quantised cache, 1/2 to 1/4 of cuda PP.


[-]

Nexter92@reddit

CUDA is well written, ROCm is not, and AMD cards have very, very good support with Vulkan on Windows or Linux 😉


[-]

AppearanceHeavy6724@reddit

What is your prompt processing speed on, say, Llama 3.1 8B with Q8 cache on AMD?


[-]

plankalkul-z1@reddit

> ROCm require LTS version, it's a problem on linux.

So do many CUDA[-based] libraries, and yet they do run fine on my Kubuntu 24.10. I agree that Vulkan seems to be a better solution than ROCm, at the moment.

As a side note, I'm yet to see a hardware company, any HW company, that is good at software. The UI always looks like it was designed by their marketing alone... Thankfully, we no longer have NVIDIA-styled green bitmapped buttons that stuck out like sore thumbs, but it still leaves a lot to be desired.


[-]

custodiam99@reddit

In Windows 11 it worked after I refreshed LM Studio and installed HIP. It was like 5 minutes. No problems yet.


[-]

WolpertingerRumo@reddit

NVIDIA superiority complex. Right now NVIDIA **is** superior in software support, by far, CUDA enjoying default status, ROCm is an addon. But I have a feeling this will change, and then it will be good to already have looked into alternatives.


[-]

custodiam99@reddit

Sure, I bought my GPU recently because only in 2025 was I sure that ROCm would be painless for me. AND it works now. I hope it will get better.


[-]

DrBearJ3w@reddit

W7900 was 48GB. RDNA doesn't have GDDR7 chips. Yes, the architecture is better, but it's not that good. If those cards have HBM3e, then it's another story. Because I don't really care about CUDA.


[-]

HistorianPotential48@reddit

haha no these guys pulled a funny against zluda


[-]

Ok_Top9254@reddit

32GB is literally nothing for a workstation gpu... Nvidia starts at that capacity and currently goes up to 96GB lol.


[-]

Freonr2@reddit

32GB for a workstation-class GPU when NV is delivering up to 96GB on Blackwell Pro is fairly weak. I'd hope to see 48/64/96GB cards to be competitive.

48GB Blackwell is ~$4600. In theory the 5090 32GB is $1999 (admittedly, good luck on that). Pricing has to make sense in that context, along with some discount to make up for the software stack and the variance in actual card availability going forward. They could try for $1999-$2499 if they actually deliver and if 5090s remain elusive, maybe, but even that is a bit of a stretch.

If they offered some sort of NVLink-like interface between cards, that could add value, since NVLink disappeared from everything outside datacenter class.

A bit underwhelmed. AMD could really capture market share by offering better $/GB even if all other specs are a bit behind. GDDR6 already means bandwidth is likely going to be a bit lame unless they've got some space magic, like a huge SRAM cache and prayers that the software can utilize it effectively.
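To put rough numbers on the bandwidth concern: peak memory bandwidth is bus width in bytes times the per-pin data rate. The sketch below assumes 20 Gbps GDDR6 on a 256-bit bus and 28 Gbps GDDR7 on a 512-bit bus purely for illustration, and uses the common rule of thumb that a dense model reads all of its weights once per generated token, so it gives an upper bound and not a benchmark.

```python
def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bytes per transfer times per-pin rate."""
    return bus_width_bits / 8 * data_rate_gbps


def max_tokens_per_s(bandwidth, model_size_gb):
    """Upper bound on decode speed if every token reads all weights once."""
    return bandwidth / model_size_gb


# Illustrative configurations; the data rates are assumptions, not specs
# confirmed for the card discussed in this thread.
gddr6_256 = bandwidth_gb_s(256, 20)
gddr7_512 = bandwidth_gb_s(512, 28)

MODEL_GB = 20  # hypothetical quantized model that fits in 32 GB

for name, bw in [("256-bit GDDR6", gddr6_256), ("512-bit GDDR7", gddr7_512)]:
    print(f"{name}: {bw:.0f} GB/s, at most {max_tokens_per_s(bw, MODEL_GB):.1f} tok/s")
```

A large on-die cache would only change this picture for data that actually fits in the cache, which is why the software has to be able to exploit it.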


[-]

mindwip@reddit

Odd way to write 48 or 64 or 96!


[-]

Sicarius_The_First@reddit

2 little, 2 late


[-]

beedunc@reddit

Sounds expensive. I don’t know how that helps us.



107 Comments