TheaterFire

AMD preparing RDNA4 Radeon PRO series with 32GB memory on board

Posted by noblex33@reddit | LocalLLaMA | View on Reddit | 107 comments

gfy_expert@reddit

Radeon Pro 7000 48GB owners, is the old model any good?
View on Reddit #54237681

SmellsLikeAPig@reddit

Those are fp16 only; this one can do fp8, and it seems a lot faster for AI as well.
View on Reddit #54238693

gfy_expert@reddit

Yeah, but it's about getting an idea before the new models hit shelves: how good ROCm is, whether it's possible to run on Win11 at decent speeds, etc.
View on Reddit #54239844

b0tbuilder@reddit

Run on Linux. Why would you run on Windows?
View on Reddit #59535119

SmellsLikeAPig@reddit

I wouldn't buy fp16 cards at this point
View on Reddit #54251890

gfy_expert@reddit

I'm just trying to run a digital waifu: a GGUF file, image generation, TTS, and talk-llama-fast. A 4060 Ti can do all of this, but not all at once. KoboldAI + SillyTavern for roleplay, and Stability Matrix/ComfyUI for image generation with models from Civitai. For video generation, 16 GB of VRAM is enough with FramePack, but I don't have 64-128 GB of DDR4/5.
View on Reddit #54253276

CarefulGarage3902@reddit

But it can’t even do fp4? the rtx 5000 series can do fp4. Maybe they’re like not even trying to sell us ai enthusiasts this card and are just targeting gamers/video editing etc.
View on Reddit #54249929

SmellsLikeAPig@reddit

I don't know how useful fp4 really is. Aren't models quantised to 4 bits too lobotomized?
View on Reddit #54251816

CarefulGarage3902@reddit

I think the idea with having fp8 and fp4 support is that the gpu will have to do less calculations to go from fp16 to 4 bit for some layer. I’m real impressed by the dynamic quants like gptq that keep some layers at higher bits and then put other layers at lower bits like 4 since those layers affect the performance/accuracy less. Instead of quantizing a whole model to 4 bit we may have some layers at 4 bit, others at 8, others at 16, and so on and end up with real good performance for the amount of compute. I imagine fp4 support would mean better performance/less compute on the 4 bit layers, but I’m not too knowledgeable on the subject yet.
View on Reddit #54264714
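
The mixed-precision idea in the comment above (some layers at 4 bits, others at 8 or 16) can be put into numbers with a minimal sketch. The layer groups, parameter counts, and bit widths below are hypothetical and only illustrate how the average bits per weight and the memory footprint fall out; they do not describe any real model or quantization scheme.

```python
# Hypothetical layer groups: (name, parameter count, bits per weight).
# These numbers are illustrative assumptions, not taken from a real model.
layers = [
    ("attention", 2_000_000_000, 8),
    ("feed_forward", 5_000_000_000, 4),
    ("embeddings_and_head", 1_000_000_000, 16),
]


def average_bits(layer_groups):
    """Parameter-weighted average number of bits per weight."""
    total_params = sum(count for _, count, _ in layer_groups)
    total_bits = sum(count * bits for _, count, bits in layer_groups)
    return total_bits / total_params


def size_gib(layer_groups):
    """Weight storage in GiB, ignoring scales, zero points, and KV cache."""
    total_bits = sum(count * bits for _, count, bits in layer_groups)
    return total_bits / 8 / 2**30


print(f"average bits per weight: {average_bits(layers):.2f}")
print(f"approximate weight size: {size_gib(layers):.2f} GiB")
```

With these made-up numbers the model averages 6.5 bits per weight, well under a uniform 8-bit quant while the most sensitive group stays at 16 bits.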

ResponsibleTruck4717@reddit

Can someone explain to me why Intel / AMD aren't making some mid / high range card with an absurd amount of VRAM, like 128GB, just to flood the market?
View on Reddit #54298509

EugenePopcorn@reddit

Because these firms are all run by business goons obsessed with market segmentation.
View on Reddit #54309026

ResponsibleTruck4717@reddit

Correct me if I'm wrong, but currently Nvidia is the one controlling the market, right? Wouldn't it be better for AMD / Intel to get a foothold so more tools will work with their cards?
View on Reddit #54309607

EugenePopcorn@reddit

That would be a way to deliver massive value for customers, but the business goons have their hearts set on delivering massive value to shareholders by selling data center GPUs instead. 
View on Reddit #54356147

crantob@reddit

And who created that surge in demand? The people in DC and their friends with the printed money that you and I do not have access to.
View on Reddit #55994419

Bandit-level-200@reddit

32gb following nvidia as always
View on Reddit #54235055

Medium_Chemist_4032@reddit

I swear AMD feels like NVidia's controlled opposition
View on Reddit #54242579

grady_vuckovic@reddit

No need to compete when there's only two choices in the market and you can simply match your competitor rather than undercutting them on price aggressively.
View on Reddit #54298569

crantob@reddit

But that IS competition. This isn't a charity. You're always compromising per-unit profit versus total profit in your pricing. And you're always trying to get the best selling price you can. Right now there's a flood of institutional, corporate and government money (which flows into institutional and corporate) buying away resources from we, the people. That's a real problem that takes some learning to understand.
View on Reddit #55994168

emprahsFury@reddit

its crazy how far behind AMD is. Nvidia is releasing 96 gb cards to the consumer (and the $/GB is the same as a 5090).
View on Reddit #54242692

KontoOficjalneMR@reddit

> Nvidia is releasing 96 gb cards to the consumer (and the $/GB is the same as a 5090).

Huh? What card is that?
View on Reddit #54243183

frankchn@reddit

RTX Pro 6000
View on Reddit #54248504

KontoOficjalneMR@reddit

I was going to say that's bullshit but then I checked how much 5090 costs now.
View on Reddit #54249518

frankchn@reddit

Yeah it is 33% more per GB based off MSRP pricing, but I am not sure how available the $2000 5090 FE is — realistically if you want a RTX 5090 today you are going to spend $3000+. Meanwhile, previous generations of RTX workstation cards are generally available at MSRP.
View on Reddit #54250083

KontoOficjalneMR@reddit

I checked and it's available for $2k on the Best Buy USA website. I found several others around $2,200 as well. So I think if you try you can get it for MSRP. And $8,500 is still a speculated/leaked price AFAIK, not MSRP.
View on Reddit #54252056

frankchn@reddit

I just checked the Best Buy website and there is a product listing for the Founder’s Edition at $2000, but it is “Sold Out” and, apart from occasional stock drops, has been that way since launch. If you search on Newegg for stock available to ship, it is all priced beyond $3000.
View on Reddit #54252355

KontoOficjalneMR@reddit

Electronics prices are a general shitshow thanks to Trump's tariffs. Like I said, we'll see what the price of the RTX Pro 6000 will be once it's actually available to order.
View on Reddit #54252591

frankchn@reddit

Yeah, no disagreement there.
View on Reddit #54252634

Hunting-Succcubus@reddit

Full disagreement here, tariffs only apply at the US border. Other countries should have the original pricing.
View on Reddit #54290046

avinash240@reddit

Got a link to this 2k Best Buy 5090? I'll buy it right now.
View on Reddit #54282861

Hunting-Succcubus@reddit

Why isn't the CUDA core count multiplied by 3? Does VRAM cost that much? That's silly. I need 60,000 CUDA cores for $6k. And 3x the VRAM.
View on Reddit #54289753

emprahsFury@reddit

The cheapest 5090 on Newegg is $2,500. Three of them is $7,500. That means there is an extra $1,000 premium for the VRAM on an RTX Pro 6000, which is an extra $10/GB. So sorry for the egregious lie. I'm sorry the price of a fast food meal is too big a lie for you to countenance.
View on Reddit #54259680

KontoOficjalneMR@reddit

No. That's the price of 96 fast food meals. And 30% difference in price. So quite the bullshit. You were wrong - own it, instead of shifting goalposts.
View on Reddit #54260855
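
To make the pricing argument above easier to follow, here is a minimal sketch of the $/GB arithmetic. It only uses prices quoted in this thread ($2,000 MSRP and $2,500 Newegg for the 32 GB 5090, and the $8,000 and leaked $8,500 figures for the 96 GB RTX Pro 6000); none of them are confirmed prices, and the function names are made up for illustration.

```python
def dollars_per_gb(price_usd, vram_gb):
    """Price divided by VRAM capacity."""
    return price_usd / vram_gb


def premium_percent(card_price, card_gb, base_price, base_gb):
    """How much more the card costs per GB than the baseline, in percent."""
    ratio = dollars_per_gb(card_price, card_gb) / dollars_per_gb(base_price, base_gb)
    return (ratio - 1) * 100


# (Pro 6000 price, Pro 6000 GB, 5090 price, 5090 GB), all quoted in the thread.
scenarios = {
    "$8000 Pro 6000 vs $2000 MSRP 5090": (8000, 96, 2000, 32),
    "$8500 Pro 6000 vs $2000 MSRP 5090": (8500, 96, 2000, 32),
    "$8500 Pro 6000 vs $2500 Newegg 5090": (8500, 96, 2500, 32),
}

for label, (pro_price, pro_gb, base_price, base_gb) in scenarios.items():
    extra = dollars_per_gb(pro_price, pro_gb) - dollars_per_gb(base_price, base_gb)
    pct = premium_percent(pro_price, pro_gb, base_price, base_gb)
    print(f"{label}: +${extra:.2f}/GB ({pct:.1f}% more per GB)")
```

The first scenario reproduces the "33% more per GB" figure, and the last one reproduces the "extra $10/GB" figure, so the disagreement is mostly about which 5090 price counts as the baseline.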

thrownawaymane@reddit

Lol, what consumer? The 96GB card is 1000% enterprise.
View on Reddit #54244538

emprahsFury@reddit

if you can buy it from consumer channels it's available to consumers. You can order it the same way you can order a 5090.
View on Reddit #54259811

kb4000@reddit

I don't see any consumer facing listings anywhere in the US from an official retail partner.
View on Reddit #54264962

Bandit-level-200@reddit

> Nvidia is releasing 96 gb cards to the consumer

enterprise

And don't mistake it for goodwill: extra VRAM does not make it worth its 8k price tag. Memory modules don't cost 1k apiece like Nvidia seems to be trying to tell us.
View on Reddit #54244733

emprahsFury@reddit

no one said it was based on goodwill.
View on Reddit #54259747

frankchn@reddit

It is not worth it to us consumers, but that’s not their target market. It is for companies who won’t blink at spending $30k a computer for their ML engineer. After all, what’s $30k if you are already paying the engineer half a million a year, especially if they are more efficient.
View on Reddit #54248776

My_Unbiased_Opinion@reddit

This thing is DOA at anything above 1500. At some point, people would rather just buy a 5090. 
View on Reddit #54258566

HugoCortell@reddit

If they make it 1000-1200, it'll be great. Otherwise, stacking old 3090tis will still be king.
View on Reddit #54621098

Xyzzymoon@reddit

How?
View on Reddit #54297975

custodiam99@reddit

Well the price is the most important factor.
View on Reddit #54233946

FastDecode1@reddit

Not for enterprise users. "Pro" means it's a professional card for people who use it to make money, so even if it costs thousands (which it does), the card pays itself back in no time. The last Radeon Pro card with 32GB VRAM (W7800) had an MSRP of $2,500.
View on Reddit #54237468

BusRevolutionary9893@reddit

He obviously meant for us. 
View on Reddit #54237830

FastDecode1@reddit

"Us" referring to whom exactly? The only obvious thing here is that this is an expensive card aimed at the professional market, not the home/hobbyist user. I'm sure there's plenty of enterprise/pro folks here who want to run models locally for the same reasons that home users do. Being able to better guarantee data privacy and security because you're not sending it over the internet (potentially to another country) to be processed on someone else's computer is very valuable in the professional space, not just for home users. The most important for the target audience of this card is availability and the quality of support, not the price.
View on Reddit #54253963

HugoCortell@reddit

Us refers to we, comrade. The people demand bread and graphics cards.
View on Reddit #54620801

CarefulGarage3902@reddit

There’s an nvidia verified gamer/creator program now for getting to buy an nvidia 5080/5090 on the nvidia marketplace at msrp. If they think I would pay $500 more for a card with the same specs and no CUDA then they some dumb dumbs. Maybe the exception here would be if someone was wanting to buy multiple for making a multi gpu rig, but even then I imagine CUDA with some 4090’s or 3090’s would be better. I suppose there’s the possibility that they’re going to surprise us with some CUDA like new software that justifies the msrp, but I doubt it. Given the lack of CUDA, what is the most yall would pay for this gpu? Comment below
View on Reddit #54249393

Ninja_Weedle@reddit

800$
View on Reddit #54263392

nostriluu@reddit

$801
View on Reddit #54270648

bblankuser@reddit

So we're paying more for the same amount of vram?
View on Reddit #54263887

Such_Advantage_6949@reddit

If this card has a higher MSRP than the 5090, it could be quite dead on arrival, especially if it has the same bandwidth and the same VRAM.
View on Reddit #54239602

PorchettaM@reddit

Enterprise cares about all the certifications and support you don't get with consumer cards. Nvidia is still selling 32 and 24 GB Pro cards even though the 5090 exists.
View on Reddit #54247849

b3081a@reddit

Probably $1000-$1200 at most.
View on Reddit #54241851

BusRevolutionary9893@reddit

The RDNA 4-based card with 32GB is likely to be a successor or comparable to the W7800, given the similar memory capacity and professional focus. The W7800’s $2,499 price sets a baseline.
View on Reddit #54244134

Such_Advantage_6949@reddit

Yeah, knowing them, that is what they will do. Then they can wonder why the card is not selling.
View on Reddit #54245631

resnet152@reddit

Well that and CUDA
View on Reddit #54245857

custodiam99@reddit

We are talking about inference.
View on Reddit #54249839

Rustybot@reddit

Sadly, it will be Market Price.
View on Reddit #54245654

gfy_expert@reddit

As well as availability.
View on Reddit #54237222

512bitinstruction@reddit

I would actually prefer if they added ROCm support to their uma iGPUs.
View on Reddit #54307284

Healthy-Nebula-3603@reddit

Why only 32 GB !
View on Reddit #54239772

b3081a@reddit

It's already the max that is possible for a 256bit GDDR6 bus. If they opted for GDDR7 then they could go 48GB and eventually 64GB.
View on Reddit #54241913

Conscious_Cut_6144@reddit

Can you not double up on ram like you do with dram, like 2/3 sticks per Channel? No bandwidth increase just additional ram
View on Reddit #54278391

b3081a@reddit

Desktop/server DDR can do this because it has chip-select pins, so it can support multiple ranks per channel. GDDR doesn't have them, so all it can do is clamshell rather than increase ranks. 32GB per 256-bit GDDR6 bus is already using the highest-capacity GDDR6 chip available and combining it with clamshell, so there's no further chance of doubling the capacity.
View on Reddit #54290697
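
The capacity limit described above can be sketched as simple arithmetic: each GDDR chip sits on a 32-bit slice of the bus, and clamshell mode puts two chips on each slice. The helper name below is hypothetical; the 2 GB GDDR6 and 3 GB GDDR7 chip sizes follow the comments in this thread, and the 4 GB case is the hoped-for future density that would give 64 GB.

```python
def max_vram_gb(bus_width_bits, chip_gb, clamshell=True, chip_interface_bits=32):
    """Maximum VRAM for a bus width, per-chip capacity, and clamshell on/off."""
    chips_per_side = bus_width_bits // chip_interface_bits
    chips = chips_per_side * (2 if clamshell else 1)
    return chips * chip_gb


# 256-bit bus with 2 GB GDDR6 chips in clamshell: the 32 GB ceiling from the thread.
print(max_vram_gb(256, 2))   # 32
# Same bus with 3 GB chips (GDDR7) in clamshell.
print(max_vram_gb(256, 3))   # 48
# Same bus with a hypothetical 4 GB chip in clamshell.
print(max_vram_gb(256, 4))   # 64
# A wider 384-bit bus with 2 GB chips in clamshell.
print(max_vram_gb(384, 2))   # 48
```

The last line shows why a wider memory bus is enough to reach 48 GB with the same 2 GB chips.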

Conscious_Cut_6144@reddit

Someone figured it out... [https://www.reddit.com/r/LocalLLaMA/comments/1j6i1ma/comment/mgp30xg/](https://www.reddit.com/r/LocalLLaMA/comments/1j6i1ma/comment/mgp30xg/)
View on Reddit #54297654

b3081a@reddit

That's obviously faked. It's over a month since then but we haven't seen any availability.
View on Reddit #54303542

Healthy-Nebula-3603@reddit

I don't understand why producers do not make multilayer VRAM memory like HBM or FLASH.
View on Reddit #54244791

Hunting-Succcubus@reddit

They make, it’s called HBM and it’s expensive.
View on Reddit #54294456

Alphasite@reddit

Isn’t that literally HBM??? AMD actually helped invent it and shipped a few consumer cards with it. It’s just more expensive than VRAM. 
View on Reddit #54292747

KontoOficjalneMR@reddit

It starts with Mo and ends with ney
View on Reddit #54252967

Healthy-Nebula-3603@reddit

Lol......ehhhhh I hope they finally start building multilayer VRAM as we finally have a reason for it now.
View on Reddit #54255447

AmazinglyObliviouse@reddit

That's my favorite impressionist painter
View on Reddit #54254407

relmny@reddit

Isn't the "upgraded" RTX 4090 48GB GDDR6? How come some people can make a 48GB card with GDDR6 and AMD can't?
View on Reddit #54254478

eding42@reddit

You would need a fatter memory bus. This is the max possible under 256 bit assuming you’re not using 3 GB modules
View on Reddit #54258542

relmny@reddit

Still, why do they limit themselves? It's AMD, not some random very small business with a handful of people that take some "old" 24GB GPUs and turn them into 48GB... Yet those very small businesses manage to do it and AMD doesn't. Some are even sold for about $3000.
View on Reddit #54266109

Txt8aker@reddit

Blame the system. See, high demand = high cost. That means high cost for us and high cost for the manufacturer. Memory chips are used everywhere, and the particular ones used on GPUs are a very special kind. It's also not about whether they can; they decide not to for business reasons (gotta milk the consumers to make as much profit as they can).
View on Reddit #54286018

Allseeing_Argos@reddit

It's because AMD execs all have Nvidia stocks. so if they release a product that is too good they will personally lose money. They're gimping themselves on purpose.
View on Reddit #54275316

eding42@reddit

They limit themselves to the smaller memory bus for cost / yield reasons, memory controllers are more sensitive to defects + they don’t scale as well with smaller nodes. AMD 100% could make a 512 bit version of the 9070 XT die LOL but that would cost a LOT of money per chip (in addition to the fixed cost of the tape out, which is usually in the tens of millions of dollars) The 24 GB to 48 GB conversion is possible probably bc whatever GPU that was has a bigger memory bus.
View on Reddit #54266351

asssuber@reddit

AMD makes the 48GB W7800 with a $2500 MSRP. Partners used to be able to put more VRAM in GPUs in the past, but they are forbidden now by AMD and Nvidia, and I guess Intel too. The reason is to not cannibalize the professional market, where they charge absurd premiums for the extra VRAM.
View on Reddit #54278974

Nexter92@reddit

They will run what? ROCm? LOL. The only way to make them usable is to sell them for $380-400 MAX. That would be a good card for LLMs, not with ROCm but with Vulkan.
View on Reddit #54234364

custodiam99@reddit

I have an RX 7900XTX and I'm running ROCm on Windows 11 with LM Studio. Its speed is 92% of Vulkan but with better DDR5 memory management. I have no complaints. What am I missing?
View on Reddit #54234693

Nexter92@reddit

Linux ROCm here. Almost every image or video generation tool is compatible with CUDA, not ROCm, or has problems with ROCm due to shitty code. For LLM text generation on Linux, Vulkan does not require anything, no LTS version of Ubuntu or whatsoever. ROCm requires an LTS version, which is a problem on Linux. Vulkan works without installing anything. Vulkan is faster than ROCm. Vulkan is not LTS-locked. Vulkan is supported on 99% of Linux distributions.
View on Reddit #54235029

MikeLPU@reddit

I use fedora, no LTS shit.
View on Reddit #54235180

Nexter92@reddit

Fedora is not in the official compatible list of distro, one update > goodbye your working distro :) [https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-distributions](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-distributions)
View on Reddit #54235550

rusty_fans@reddit

Official support isn't necessarily better if the community keeps up with updates.
View on Reddit #54236268

Nexter92@reddit

L-O-L. even if that was true, performance is still shit : [https://github.com/ollama/ollama/pull/5059#issuecomment-2816882002](https://github.com/ollama/ollama/pull/5059#issuecomment-2816882002) CUDA or Vulkan, other stuff is currently shit. I love my AMD GPU, but for AI... Amd really need to wake up.
View on Reddit #54236576

InsideYork@reddit

I’ve blocked your useless posts
View on Reddit #54290480

MikeLPU@reddit

I just added a rhel9 rocm repo and everything works fine. It's officially supported.
View on Reddit #54263840

AppearanceHeavy6724@reddit

Vulkan has issues with flash attention.
View on Reddit #54237063

giant3@reddit

From some posts on llama.cpp, flash attention is only available on GPUs with the **coopmat2** extension. It has nothing to do with Vulkan AFAIK. On other GPUs, if you enable flash attention, it swaps data to RAM and uses the CPU, which makes the performance go down as there is constant swapping from RAM to VRAM.
View on Reddit #54248594

AppearanceHeavy6724@reddit

Flash attention works fine on 3060 CUDA but not with Vulkan.
View on Reddit #54250560

giant3@reddit

Can you check with `vulkaninfo | grep coop`?
View on Reddit #54252329
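
The `vulkaninfo | grep coop` check suggested above can also be scripted. The sketch below is a rough equivalent that scans `vulkaninfo` text for cooperative-matrix extension names. The helper names are hypothetical, and the sample text is an illustrative assumption about the output layout, not captured from a real GPU.

```python
import re
import shutil
import subprocess

# Matches names such as VK_KHR_cooperative_matrix or VK_NV_cooperative_matrix2.
COOPMAT = re.compile(r"VK_[A-Z]+_cooperative_matrix\d*")


def cooperative_matrix_extensions(vulkaninfo_text):
    """Return the sorted, de-duplicated cooperative-matrix extension names."""
    return sorted(set(COOPMAT.findall(vulkaninfo_text)))


def query_local_gpu():
    """Run vulkaninfo if it is installed; return None when it is not on PATH."""
    if shutil.which("vulkaninfo") is None:
        return None
    result = subprocess.run(
        ["vulkaninfo"], capture_output=True, text=True, check=False
    )
    return cooperative_matrix_extensions(result.stdout)


# Illustrative sample only (assumed layout, not real output).
sample = """
    VK_KHR_cooperative_matrix   : extension revision 2
    VK_NV_cooperative_matrix2   : extension revision 1
    VK_KHR_swapchain            : extension revision 70
"""
print(cooperative_matrix_extensions(sample))
```

Calling `query_local_gpu()` on a machine with the Vulkan tools installed returns the extensions the driver actually reports; an empty list means none were found.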

Nexter92@reddit

Lol, i use flash attention everyday, no issue at all (llamacpp, gemma 3 12/27B, Q4\_K\_M).
View on Reddit #54237977

AppearanceHeavy6724@reddit

On Nvidia with Vulkan prompt processing massively slows down (compared to CUDA), esp at Q8 quantised cache, 1/2 to 1/4 of cuda PP.
View on Reddit #54238567

Nexter92@reddit

CUDA is well written, ROCm is not, and AMD cards have really great support with Vulkan on Windows or Linux 😉
View on Reddit #54238728

AppearanceHeavy6724@reddit

What is your prompt processing speed on, say, Llama 3.1 8B at Q8 cache on AMD?
View on Reddit #54243960

plankalkul-z1@reddit

> ROCm require LTS version, it's a problem on linux.

So do many CUDA[-based] libraries, and yet they do run fine on my Kubuntu 24.10. I agree that Vulkan seems to be a better solution than ROCm -- at the moment.

As a side note, I'm yet to see a hardware company, any HW company, that is good at software. UI always looks like it was designed by their marketing alone... Thankfully, we no longer have NVIDIA-styled green bitmapped buttons that stuck out like sore thumbs, but it still leaves a lot to be desired.
View on Reddit #54237818

custodiam99@reddit

In Windows 11 it worked after I refreshed LM Studio and installed HIP. It was like 5 minutes. No problems yet.
View on Reddit #54235158

WolpertingerRumo@reddit

NVIDIA superiority complex. Right now NVIDIA **is** superior in software support, by far, CUDA enjoying default status, ROCm is an addon. But I have a feeling this will change, and then it will be good to already have looked into alternatives.
View on Reddit #54236647

custodiam99@reddit

Sure, I bought my GPU recently because only in 2025 was I sure that ROCm would be painless for me. AND it works now. I hope it will get better.
View on Reddit #54237038

DrBearJ3w@reddit

W7900 was 48GB. RDNA doesn't have GDDR7 chips. Yes, the architecture is better, but it's not that good. If those cards have HBM3e, then it's another story. Because I don't really care about CUDA.
View on Reddit #54288031

HistorianPotential48@reddit

haha no these guys pulled a funny against zluda
View on Reddit #54287616

Ok_Top9254@reddit

32GB is literally nothing for a workstation gpu... Nvidia starts at that capacity and currently goes up to 96GB lol.
View on Reddit #54264644

Freonr2@reddit

32GB for workstation class GPU when NV is delivering up to 96GB on Blackwell Pro is fairly weak. I'd hope to see 48/64/96GB cards to be competitive. 48GB Blackwell is ~$4600. In theory the 5090 32gb is $1999 (admittedly, good luck on that). Pricing has to make sense in that context along with some discount to make up for the software stack and variant on actual availability on cards moving forward. They could try for $1999-$2499 if they actually deliver and if 5090s remain elusive maybe, but even that is a bit of a stretch. If they offered some sort of NVLink-like interface between cards that could add value since NVLink disappeared from everything outside datacenter class. A bit underwhelmed. AMD could really capture market by offering better $/GB even if all other specs are a bit behind. GDDR6 already means bandwidth is likely going to be a bit lame unless they've got some space magic, like a huge SRAM cache and prayers the software can utilize it effectively.
View on Reddit #54248976
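
The bandwidth concern above can be estimated with a minimal sketch: peak memory bandwidth is the bus width in bytes times the per-pin data rate. The 20 Gbps GDDR6 data rate is an assumption (the thread gives no memory speed for this card), and the tokens-per-second figure is only a rough upper bound that assumes every weight is read once per generated token.

```python
def peak_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Theoretical peak memory bandwidth in GB/s."""
    return bus_width_bits * data_rate_gbps / 8


def decode_tokens_per_s_upper_bound(bandwidth_gbs, model_size_gb):
    """Rough ceiling for bandwidth-bound decoding: one full weight read per token."""
    return bandwidth_gbs / model_size_gb


# Assumed: 256-bit GDDR6 at 20 Gbps per pin (not confirmed for this card).
gddr6 = peak_bandwidth_gbs(256, 20)
# For contrast: a 512-bit GDDR7 bus at 28 Gbps per pin.
gddr7 = peak_bandwidth_gbs(512, 28)

print(f"256-bit GDDR6 @ 20 Gbps: {gddr6:.0f} GB/s")
print(f"512-bit GDDR7 @ 28 Gbps: {gddr7:.0f} GB/s")
# Hypothetical 16 GB of quantized weights held entirely in VRAM.
print(f"ceiling for a 16 GB model: {decode_tokens_per_s_upper_bound(gddr6, 16):.0f} tokens/s")
```

Real throughput lands below this ceiling, and a large on-die cache could change the picture, which is the "space magic" the comment alludes to.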

mindwip@reddit

Odd way to write 48 or 64 or 96!
View on Reddit #54242592

Sicarius_The_First@reddit

2 little, 2 late
View on Reddit #54241095

beedunc@reddit

Sounds expensive. I don’t know how that helps us.
View on Reddit #54238799