AMD Ryzen AI Max+ PRO 495 APU could arrive with 192GB of unified memory — leaked PassMark benchmarks suggest modest update over Strix Halo
Posted by narwi@reddit | hardware | 101 comments
996forever@reddit
Reinforcing the rumours that “Medusa Halo” or whatever is not happening for well over another year, likely two.
EmptyVolition242@reddit
And even if it does come out, will it be available much? The Strix Halo chip was only available in a tablet for months.
WJMazepas@reddit
Even in laptops it's not that widely available, and only in lower-spec versions.
I would 100% buy the tablet, but the price is way too high.
torpedospurs@reddit
The tablet is limited to an 80W TDP though, leaving a lot of performance on the table.
YeshYyyK@reddit
why do people say this? returns diminish rapidly after 50W or so on the GPU, maybe 60W on the CPU, and combined won't be remotely 110W because uncore/idle power is rather high
and most gaming laptops in the realm of portability will never do combined CPU+GPU anyways
torpedospurs@reddit
Of course there are diminishing returns. But the graph in the video you linked tests it on the Z13 itself, where 80W means maxed-out fans and hitting maximum temps because the cooling solution is at its limit. The same 80W on a platform with beefier cooling will probably get you more performance because the chip runs at a lower temperature. With that beefier cooling, push it to 120W and you'll see scores more like Notebookcheck's median of 1823. The max for the chip is 140W, and perhaps that's the wattage where you get Notebookcheck's max of 1912.
And that's just CPU! If you need to work intensely both CPU and GPU then 120W has to be valuable. Most gaming laptops want to feed their GPUs with as much power as possible for this reason. I definitely would like to see how well the TUF A14 or A16 will perform with the Max 395 since they have more room for beefier cooling.
YeshYyyK@reddit
...the GPU literally flatlines at 45W
...the CPU gets 7.5% performance from 60 to 80W
instead of providing a source, you're gaslighting me with what I provided. idk who is trolling anymore
torpedospurs@reddit
If Strix Halo could give full GPU performance at 45W it would have wiped the floor with the 5060s and Panther Lakes of the world. But it can't and it hasn't.
Let's see it on a TUF A16 where it can really stretch its legs like it does on the mini PCs.
WJMazepas@reddit
More than enough for me
Also, it would be a great tablet to hook up to an eGPU and use all that CPU
NerdProcrastinating@reddit
I would expect it to be a delayed OEM release, since LPDDR6 would require a new board design.
996forever@reddit
It will be yet another extremely niche product (one of AMD’s MANY) that exists in 3 weird form factor “laptops” and a dozen no-name-brand mini PCs that ship to 10 regions.
The entire tech internet will wank over it but it will be poor value for money with wonky software support and it will have zero real life presence until the next gen comes up. Rinse and repeat.
Kryohi@reddit
Same as the infamous Nvidia N1 then.
996forever@reddit
Probably yes. Although that one does exist in a thinkstation.
Cheerful_Champion@reddit
Strix halo got swallowed by mini pcs marketed as home llm servers.
IORelay@reddit
But the mini PC market is not big, AMD didn't produce that many, and Strix Halo's performance is not that stellar compared to other laptop chips; it just has large RAM/VRAM.
Cheerful_Champion@reddit
Which is exactly what you need if you want to run a decent model locally
IORelay@reddit
The bandwidth is actually a bit too low even for that. It can run MoEs okay, but it won't be good with any large dense model.
It's easier and cheaper than sourcing a bunch of 3090s, I'll give it that.
Cheerful_Champion@reddit
I thought that Medusa Halo was expected in late 2027 from the start, wasn't it? Given the Zen6 + RDNA5 combo it should arrive 1 year after Zen5 (as usual).
996forever@reddit
It’s looking more like CES 2028. When it comes to AMD mobile products, a “refresh” leaked in April points to a release at the following year’s CES.
Hour_Firefighter_707@reddit
More memory is always welcome, of course, but is the 8060S even remotely powerful enough to get usable token generation speed on a model that needs that much RAM?
NerdProcrastinating@reddit
It would help a lot with workstation usage + 120B models.
On my Framework desktop 128GB, I find that I end up using so much RAM for browser, slack, desktop apps, local voice typing model, and a few heavy dev environments for parallel agent work (a few docker containers + vite, vs code, eslint + ts server, bloated Claude Code) that I can't afford to leave a 120B model loaded.
As other comments mentioned, it would enable running models like MiniMax at Q4 at slow, but still usable speeds so that's a big win (assuming you don't have a bloated dev setup like I do).
Evilsushione@reddit
I had the same problem, then I switched my dev machine to Linux and now I have plenty of RAM. I also stopped using VS Code for the majority of my development; I mostly go with a custom headless IDE for agent development. This saves a ton of memory overhead.
NerdProcrastinating@reddit
This is running Fedora. My main problem is really just specific to the app dev environment I'm working on as it is replicating full cloud infrastructure and is a very large project. Then a few copies of the app running with everything loaded.. oh well.
VenditatioDelendaEst@reddit
Fedora's out-of-box configuration uses Zram and no swap. Maybe try disabling that and adding a swap file/partition + zswap? Zswap seems to be getting more non-Android kernel dev attention these days, and with disk swap idle browser tabs can be completely gotten out of the way. Zram theoretically has writeback, but without Android's massive userspace infrastructure you're not benefitting from it.
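A minimal sketch of that switch on a stock Fedora install, as a set of illustrative commands (the swap file size and the ext4 assumption are mine; on btrfs you'd use `btrfs filesystem mkswapfile` instead of `dd`, so check your setup before running any of this):

```shell
# Remove Fedora's default zram swap (shipped via zram-generator-defaults)
sudo dnf remove zram-generator-defaults
sudo swapoff /dev/zram0 2>/dev/null || true

# Create and enable a 16 GB disk swap file (ext4 assumed; btrfs needs
# 'btrfs filesystem mkswapfile' instead of dd)
sudo dd if=/dev/zero of=/swapfile bs=1M count=16384 status=progress
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap defaults 0 0' | sudo tee -a /etc/fstab

# Enable zswap at boot: a compressed RAM cache in front of the disk swap,
# so cold pages (idle browser tabs) can eventually be written out fully
sudo grubby --update-kernel=ALL --args="zswap.enabled=1"

# Verify after reboot
cat /sys/module/zswap/parameters/enabled   # expect: Y
```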
Cheerful_Champion@reddit
Memory bandwidth will be the main bottleneck anyway
NerdProcrastinating@reddit
Perhaps they will give it a small bandwidth boost above the current 8000 MT/s.
Hopefully 8533 MT/s at a minimum. AMD seem to suck at achieving high memory speeds so I doubt they'll get to 9600.
NerdProcrastinating@reddit
heh someone offended calling out AMD memory speeds to down vote. Reddit is weird.
torpedospurs@reddit
Strix Halo was initially billed to support 8533 MT/s and the pre-release products were all billed as such. But when the release came the number went down to 8000. I still wonder what happened.
ChocomelP@reddit
The main bottleneck depends on your use case and what model you use.
Cheerful_Champion@reddit
If you are getting a mini PC with relatively low compute power and 192GB of unified memory, it's pretty telling what models and use cases you are aiming for, and then bandwidth is going to be the bottleneck in pretty much all of them (I guess RAG is an exception).
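The bandwidth-bound intuition is easy to put in numbers. During decode, every active weight byte is streamed from RAM once per generated token, so bandwidth divided by active bytes gives an upper bound on tokens/s. A rough sketch (the 256 GB/s figure assumes Strix Halo's 256-bit bus at 8000 MT/s; model sizes are illustrative):

```python
def max_tokens_per_sec(bandwidth_gb_s: float, active_weights_gb: float) -> float:
    """Upper bound on decode speed for a memory-bandwidth-bound LLM:
    every active weight is streamed from RAM once per generated token."""
    return bandwidth_gb_s / active_weights_gb

BW = 256.0  # GB/s: 8000 MT/s * 256-bit bus / 8 bits per byte

# 70B dense model at Q4 (~40 GB of weights, all of them active every token)
print(max_tokens_per_sec(BW, 40.0))  # -> 6.4 tok/s at best

# MoE with only ~3 GB of experts active per token fares much better
print(max_tokens_per_sec(BW, 3.0))
```

Real numbers come in below these bounds (compute, prompt processing, and KV-cache traffic all eat into them), but the ratio explains why MoE models stay usable while large dense ones don't.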
mckirkus@reddit
Epyc can do 12 channels with 16GB DIMMs at 192GB. I wonder if they could pull off something similar without the ridiculous power requirements Epyc has (typing this on an Epyc that is idling at over 100 W).
Cheerful_Champion@reddit
They could. They are designing this stuff so there's nothing stopping them from creating APU with 12 channels, small CPU and big GPU. It wouldn't fit miniPC format, but it's doable. The main problem is this would be expensive product for a very niche use case. Any potential sales wouldn't cover development costs.
Evilsushione@reddit
Aren’t these things designed as multi packaged chiplets? If so the development costs aren’t that much since it’s part of a family of products. Also don’t discount the effectiveness of Halo products that set reputations for entire product lines.
Cheerful_Champion@reddit
The thing is, this wouldn't be based on existing product. This would require more custom work
mckirkus@reddit
Yeah, desktop starts to run into stability problems at 4 channels. And it's $500 for 16GB of registered DDR5 right now.
IORelay@reddit
The memory bandwidth is okay for MoE models, but Strix Halo already chokes on a 70B dense model, which is only ~40GB at Q4.
waitmarks@reddit
Not unless the memory bandwidth also increases. That is its main bottleneck currently.
Cheerful_Champion@reddit
Per rumors, it should support 8533 MT/s RAM vs Strix Halo's 8000 MT/s.
Reactor-Licker@reddit
Then it would be on par with the memory bandwidth of DGX Spark / Nvidia GB10.
waitmarks@reddit
That should be a nice minor bump, but I don't think that will majorly change what kind of models it can run.
lazyhustlermusic@reddit
Would be cool to throw a couple more channels in there
996forever@reddit
Apparently Medusa Halo will get 384bit bus
sussy_ball@reddit
That's a given if it uses LPDDR6. LPDDR6 uses 24 bit channels instead of 16 bit channels found in LPDDR5.
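The arithmetic behind that claim, as a sketch (the channel counts and the LPDDR6 data rate here are my assumptions for illustration, not confirmed specs):

```python
def bus_width_bits(channels: int, bits_per_channel: int) -> int:
    """Total bus width is just channel count times per-channel width."""
    return channels * bits_per_channel

def peak_bandwidth_gb_s(bus_bits: int, mt_s: int) -> float:
    """Peak bandwidth = transfers/s * bytes per transfer."""
    return mt_s * bus_bits / 8 / 1000

# Strix Halo today: 16 x 16-bit LPDDR5X channels at 8000 MT/s
print(bus_width_bits(16, 16))           # 256-bit
print(peak_bandwidth_gb_s(256, 8000))   # 256.0 GB/s

# The same 16 channels on LPDDR6's 24-bit channels: 1.5x wider for free
print(bus_width_bits(16, 24))           # 384-bit
print(peak_bandwidth_gb_s(384, 10667))  # ~512 GB/s at a hypothetical data rate
```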
Cheerful_Champion@reddit
I think so too. If someone wants true change then only option is to wait for Medusa Halo (that's, according to rumors, supposed to double or even triple bandwidth) or get mini pc from Apple.
BloodyLlama@reddit
Once models start getting that big, prompt processing becomes a bigger bottleneck than token generation. And if my Strix Halo is any indication, the answer is "sort of", in that anything better costs significantly more.
wywywywy@reddit
It's actually an 8065S, but it doesn't make much of a difference anyway
Tai9ch@reddit
It won't be fast, but that'll enable running slightly larger MoE models like MiniMax at Q4, as well as keeping multiple models loaded for stuff like switching between two models that are good at different things (e.g. one is multimodal).
RedTuesdayMusic@reddit
Wake me up when the iGPU is RDNA4+
InflammableAccount@reddit
More RDNA4 configurations aren't in their roadmap. It was a stop-gap for desktop and entry level workstation cards.
UDNA, the replacement uArch, is supposedly getting the full treatment. Desktop, mobile SOC integration, Datacenter.
Given that's been the plan for a while, it makes little sense for them to backtrack and start the work on porting RDNA4 to SOC integration when it wasn't designed for it.
Vince789@reddit
But UDNA isn't arriving until at least 2027, so RDNA4 will get about 2.5 years, which is essentially a typical GPU cycle
RDNA4 also brought some of the largest architectural changes, so it isn't a short stop-gap solution
Also Samsung's Exynos 2600 has a RDNA4 based iGPU, there's no technical reason preventing AMD from bringing RDNA4 to their laptops if they wanted
InflammableAccount@reddit
Because UDNA was delayed.
It was supposed to be coming... in a month or two. Clearly the delays put it back another 6 months.
Take a look at the design of RDNA 4's layout. It's clearly not designed for scaling down.
Vince789@reddit
If we're counting delays, it's still about 2 years for RDNA4 since it was also delayed about 6 months
Explain. Any decent architecture should easily be able to scale from desktops to laptops (great architectures should be capable of scaling down to phones)
And explain how Samsung put RDNA4 into the Exynos 2600 smartphone chip
InflammableAccount@reddit
Oh, excuse me, I screwed up my comment. I was pretty tired. I meant to link this rumor news from 2025: https://www.techpowerup.com/339101/amds-upcoming-udna-rdna-5-gpu-could-feature-96-cus-and-384-bit-memory-bus
PMARC14@reddit
It is just laziness; they don't want to put in the cost of designing a new iGPU segment when UDNA was supposed to sweep away everything. Considering UDNA and LPDDR6 ended up delayed a bit, they will just coast until they can get it out. It helps that every bit of silicon is going to datacenter with crumbs for consumers; they will just hold back consumer innovations and shinies until enough capacity arrives to grab everyone holding out. Applies to basically everyone in silicon.
MrMPFR@reddit
RDNA 5 isn't even coming to iGPUs, only the premium options with the AT3 and AT4 chiplets. GFX11.7 (RDNA4m/RDNA3.5+) mobile is extending into Zen 7 mobile at least.
Shared R&D pipeline gonna help. Alternating between CDNA and RDNA gonna be a gamechanger for AMD. No more keeping CDNA on a divergent path starting with CDNA 5.
I can't wait to hear more soon. By late June at ISC 2026 AMD will release CDNA 5 whitepaper and that'll provide many clues for RDNA 5 considering it borrows many things from that architecture.
996forever@reddit
Every other Radeon generation is a stopgap. Stopgaps that last as long as a “serious generation” at that.
Kryohi@reddit
Arguably most gens that don't end up in consoles aren't "serious" ones
996forever@reddit
Aka all but once or twice per decade
Evilsushione@reddit
I’ve been saying this for years, but they need to make a unified APU that has both their top-of-the-line CPU and GPU with massive amounts of RAM in a unified package. It would be ridiculous in price and wouldn’t be upgradeable, but it would offer insane performance and would differentiate them amongst their peers.
Good-Hand-8140@reddit
Isn't this it?
Evilsushione@reddit
Closer but it’s not their best in class CPU and GPU
Kryohi@reddit
You're gonna wait until Q4 2027 at a minimum. Medusa Halo with RDNA5 and lpddr6 isn't coming anytime soon, and this refresh of Strix Halo basically confirms it.
MrMPFR@reddit
Agreed. Realistically I put an actual launch no earlier than Computex 2028. Market is a mess and AMD is not in a hurry rn.
foxfox021@reddit
and when the price gets reasonable
996forever@reddit
You will sleep for eternity
mediandude@reddit
Exynos includes RDNA as well, doesn't it?
996forever@reddit
I was told on this sub the xclipse gpus are custom Samsung solutions only loosely based on rdna, and therefore their speculation that rdna4 doesn’t scale down to mobile stands, and therefore despite benevolent corporate AMD’s best efforts RDNA4 mobile and APUs couldn’t be made. Was I lied to?
Vince789@reddit
I mean there was never any architectural/technical reason to believe RDNA4 wouldn't work in laptops anyways
The Exynos 2600's GPU is only ~31.4mm2, easily small enough for laptops even on less dense nodes and with additional blocks Samsung may have cut out
For reference:
AMD Zen5 Strix Point's 8WGP RDNA3.5 was about 42.5mm2
Intel's Panther lake-H's Xe3 12C is about 55mm2
nisaaru@reddit
If they want these products to compete they have to.
996forever@reddit
What actions of theirs so far made you think they wanted to?
nisaaru@reddit
I didn't really get the impression that AMD tried to compete with the current Strix Halo. That looked more like testing out a prototype-like product with a new CPU/GPU die connection, which also came later than previously planned.
To me it looked like a small-scale, overpriced launch. They need Zen6 for a lower-power APU to be more competitive vs. Apple and Intel's new APUs.
foxfox021@reddit
true that
Malygos_Spellweaver@reddit
RDNA 3.5 lmao, ok sure, even for AI this is lacking in the AI cores...
waiting_for_zban@reddit
This is what bugged me about Strix Halo. Running optimized FP8 is totally out of the question with vLLM. Luckily, thanks to the community, llama.cpp is really great now.
Loose_Skill6641@reddit
yeah it's kinda pointless chip
Mysterious-Duty2101@reddit
unified memory = 🤮🤮🤮🤮
Just put LPCAMM2 or two of them to get quad channel.
Kryohi@reddit
The two concepts, unified memory and replaceable memory modules, are completely uncorrelated.
Mysterious-Duty2101@reddit
Nah, companies are gonna push unified memory because it's good for their profits. The big advantage of x86 is modularity, if you're gonna lose that, then you might as well just buy a MacBook.
Cory123125@reddit
You are just fundamentally misunderstanding what unified memory is.
You could technically have modular unified memory, and if anything unified memory saves you money: the same memory can be used for multiple purposes and you don't have to copy things from pool to pool (though in practice that benefit is still only partially realized).
tamerlanOne@reddit
Slightly increasing overall CPU and GPU speeds without touching the RAM bandwidth isn't a big deal... OK, 200GB of unified memory, but heavy models will be penalized even more by the limited memory bandwidth. So what's the point of increasing the physical RAM if bandwidth is the real bottleneck of this hardware?
Frissu@reddit
Cooool, another ultra-niche product, only put inside a few devices for AMDillion dollars.
Also, AMD is having a Polaris moment with RDNA3, I see.
2030 - AMD is releasing AI Max 666 APU with RDNA 3.5.5.5+++ OC Black Edition
Sylanthra@reddit
Can't wait to see the next generation of 2.5lb "handhelds" with external water cooling setup powered by this thing.
Snapdragon_865@reddit
Anything is a handheld if you're jacked enough
narwi@reddit (OP)
As today is May 4th - "we used Lord Vader as the test user and he force-levitated the handheld just fine".
imaginary_num6er@reddit
Are these new PRO chips the reason why AMD is nerfing ECC RAM support on consumer chips? Just like what AMD did on AM4 if you wanted both integrated graphics and ECC.
narwi@reddit (OP)
I think not. I think that is market segmentation so there is more space for AM5 socketed EPYC.
Zombiliescu@reddit
Low-end CPUs need better iGPUs; they should just make them with 4-channel memory already. Who is buying $1000 APUs? 1% of the market?
PM_ME_UR_TOSTADAS@reddit
Putting 4 channel memory on APUs will make them cost $1000 anyway. IMO the solution is downscaling the CPU to match the weak GPU. Then the memory bandwidth bottleneck is less of an issue too.
Minced-Juice@reddit
These APUs would have made sense and seen wider adoption if AMD had absorbed the cost of the LPDDR5 modules and passed them on to OEMs at zero additional cost, like Intel does with Lunar Lake.
Of course, AMD is in no position to do that; so this will remain a niche for the foreseeable future.
Tai9ch@reddit
Supply and demand determine price.
AI stuff is still in high demand. Even at $3k, these Strix Halo PCs are still a pretty good deal as prosumer local LLM hosts.
It's really disappointing that supply isn't higher and prices aren't coming down faster, but being able to run medium size LLMs today is kind of like doing AAA 4k gaming was in like 2014. Enthusiasts can do it for $$$, but the hardware production capacity just doesn't exist yet to sell it at mainstream prices.
Personally I hope that prosumer demand stays high and datacenter demand drops a bit so we get more offerings in a few years as production catches up with demand. If we get the big demand crash that lots of PC gamers seem to be hoping for it'll give us a small short term price drop but it won't give us the new hardware releases that everyone wants in a few years.
But if demand can sustain long enough that new fab capacity actually gets built (which takes a couple years), then we should see new releases in all the market segments at reasonable prices and everyone will be happy. In like late 2028.
Minced-Juice@reddit
Speculative "hoarding" is driving memory prices. Not supply and demand.
Memory pricing has become like an auction, and the memory suppliers have no intention of increasing supply.
ProZoid_10@reddit
This is ai money printer
996forever@reddit
The lack of any system through Dell and Lenovo would tell you it's really just a niche low volume product.
SirActionhaHAA@reddit
These ain't consumer skus, the name already told ya so. Idk why people think that everything is aimed at gamers or whatever, nah these for ai.
Minced-Juice@reddit
AI Max Pro 395+ can be found in HP Zbooks. The 128 GB config goes for 5500 euros at geizhals.
Nothing in my comment remotely said anything about wanting these for gaming.
It is a simple expression of the fact that a big reason for the poor volume of Strix Halo, other than TSMC taking its sweet time to increase InFO capacity, is OEMs having to foot the bill for the memory cost.
LastChancellor@reddit
Fix the low wattage memory bug, then we'll talk
Buckwheat469@reddit
I was looking into the Framework laptops with the AI Max+ chips in them. They look fairly promising for small to medium local LLMs.
I was also trying to figure out if it's possible to unify my integrated graphics chip with a 7900 XTX but it's kind of clunky to do. I wish it were easier for computers to just see graphics cards like memory and use them seamlessly as one cohesive unit (like SLI), rather than having to assign tasks to different cards, or to be relegated to not using the integrated graphics at all.
saturnworship@reddit
So we're going to see amd m1 moment?
Various-Welder5544@reddit
They'll do anything but put RDNA 4 on their apus. No FSR 4 Lmaooo.
DehydratedButTired@reddit
Why buy a laptop with 5k of memory when it could cost 10k?