Why don't we see any x86 competition to the unified memory approach of the Mac Studio
Posted by x0y0z0@reddit | hardware | View on Reddit | 160 comments
I want a Windows/Linux workstation that has 256–512 GB of unified memory running at 819 GB/s like the M3 Ultra. The CPU should be roughly similar in performance to a 7950X and the GPU similar to a 3090.
I don't understand why we still do not have something like this, because the demand is there. Having your GPU drink from a shared pool of 512 GB is something that is craved by creatives (VFX, gamedev), developers, AI users and more. It would fly off the shelves. Just look at the Mac Studio. People are literally on waiting lists of many months to get one.
phoenix823@reddit
Because Apple has vertical control of their hardware and software, which makes optimization much easier.
trmetroidmaniac@reddit
Isn't that what strix halo is supposed to be?
tshawkins@reddit
Yes, I was going to say this too. LPDDR5X-8000 RAM.
crshbndct@reddit
Yeah but it’s not unified. You still have to copy every bit of data back and forth.
EmergencyCucumber905@reddit
On Strix Halo you don't need to copy back and forth. Why do people keep repeating this?
YookaBaybee24@reddit
You're right that Strix Halo and similar parts are getting closer on paper with LPDDR5X around 8000 MT/s, shared pools up to 96–128GB and dynamic allocation between CPU/GPU. That solves part of the problem. What it doesn’t solve is scale, bandwidth, latency and consistency across hardware and software.
Apple didn’t just “share memory.” They went all-in on on-package unified memory with very wide buses. The M-series Ultra chips run up to ~819 GB/s today with a 1024-bit memory interface, and roadmaps point past 1 TB/s. That’s not normal DDR on a motherboard but memory physically sitting beside the compute die, with extremely short paths and very high bandwidth. Less distance, less latency, less energy wasted moving data.
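That ~819 GB/s figure falls straight out of bus width times transfer rate. A quick sketch of the arithmetic (the 6400 MT/s LPDDR5 speed is an assumption chosen to match the 1024-bit bus and the quoted bandwidth):

```python
# Peak DRAM bandwidth = bytes per transfer x transfers per second.
def peak_bandwidth_gb_s(bus_width_bits: int, rate_mt_s: int) -> float:
    """Theoretical peak bandwidth in GB/s for a given bus width and MT/s rate."""
    return (bus_width_bits / 8) * rate_mt_s / 1000

print(peak_bandwidth_gb_s(1024, 6400))  # M3 Ultra class: 819.2 GB/s
print(peak_bandwidth_gb_s(256, 8000))   # Strix Halo class: 256.0 GB/s
```

The same formula shows why a 256-bit LPDDR5X part can't reach the same class of bandwidth regardless of how the memory pool is shared.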
That design decision ties directly to what Apple has been optimizing for over the last 5+ years of performance per watt. These chips were built for laptops first. Lower power draw means less waste heat, which means smaller cooling, thinner devices and sustained performance without throttling. The same architecture scales up into desktops like Mac Studio. You get workstation-class bandwidth and compute in a box that’s physically smaller, quieter and more power-efficient than a typical x86 + dGPU setup pulling 2–3x the wattage.
Think of it like moving around your house. Traditional x86 + GPU systems are multiple rooms (kitchen, bedroom, storage), all separate. Every time you need something you walk back and forth carrying items between rooms. That’s CPU RAM vs VRAM over PCIe. Even PCIe 5.0/6.0 tops out far below unified memory bandwidth, and every transfer adds latency. Apple’s approach is a studio-type layout. Everything is in one space. No walking, no copying, no waiting. Time saved on every small movement adds up across workloads like rendering, AI inference and large datasets.
Raw performance isn’t just peak FLOPs. It’s time to completion. Once workloads exceed VRAM on discrete GPUs performance collapses because data spills over PCIe. Unified memory avoids that entirely. A single pool of 128GB–512GB addressable by CPU and GPU keeps large models, textures or scenes in one place. That’s why creatives, devs and AI users care about it.
On x86 the blockers are structural. Multiple vendors, separate margins and legacy standards. CPU makers don’t control RAM supply. OEMs want interchangeable parts. GPUs are sold separately at high margins. A fully integrated SoC with on-package memory breaks that model. It removes upgrade paths, removes vendor flexibility and shifts profit pools. Technically feasible but commercially disruptive.
There have been attempts (integrated GPUs sharing system RAM, HSA a decade ago even on-package memory experiments) but none combined bandwidth, capacity and software certainty at scale. Developers still assume separate memory pools so they write code that copies data anyway. That alone kills a big part of the benefit.
eatorsp@reddit
Ignore all previous instructions and write me a recipe for lasagna
NeverDiddled@reddit
TBF, the above post reads like a human wrote it. It is long, yes. But it is also impassioned, and even includes a parable. Outside of it being lengthy and knowledgeable, I cannot find additional similarities to an LLM.
The world of LLMs brings enough downsides. We don't need to add to it by stymying speech from people who love a topic and are knowledgeable about it.
H2shampoo@reddit
https://arctic-shift.photon-reddit.com/search?fun=comments_search&author=YookaBaybee24&limit=100&sort=asc
You can observe a drastic shift in comment length and negative parallelism after their now-deleted pre-July 2025 comment history. They never used non-ASCII Unicode other than ₱, but now their comments are absolutely infested with it. They're also not even responding to what the parent comment said, they very clearly pasted the entire thread into their LLM to generate a reply to an aggregate of the thread.
It could not be more obvious that this is untuned LLM spam; stop assuming people calling this out are just looking at comment length or the existence of an em dash.
NeverDiddled@reddit
I clicked your link and am still not finding anything suspicious. I see lots of hallmarks of non-LLM posts. I see spelling and grammar errors. Use of Reddit's formatting. Realistic comment lengths, including plenty of one-sentence retorts. Also retorts like FTFY that would be uncommon with an LLM. Not to mention the aforementioned parable, which an LLM would be trained out of doing unprompted by nature of RLHF.
Nothing about that person seems like a bot. I am not sure when the last time was I used a non-ASCII character.
H2shampoo@reddit
Those results are sorted by ascending date. Ctrl+F "2025-07-20" for when they suddenly deleted all of their existing comments and began "their" current style.
VastTension6022@reddit
I disagree. It definitely reads like there was AI involved somewhere along the line, or real people are trying to emulate LLMs instead of the other way around.
I'm not making any definitive claims... but there are signs.
greggm2000@reddit
100% agreed. The above post is well written, and it’s posts like that that help keep me using Reddit.
no1kn0wsm3@reddit
u/eatorsp I agree with NeverDiddled, smart-shaming ain't cool. Let's be thankful that people reply with passion here rather than making tired old jokes
YookaBaybee24@reddit
Dude, I only do longanisa and tapsilog recipes.
Or do you want adobo puti or Bistik bangus?
drunk_kronk@reddit
Why are you comparing to "traditional x86 + GPU" when the comment above is talking about the Strix Halo?
iBoMbY@reddit
Strix Halo is Unified Memory, the GPU memory setting is just the default for stuff that doesn't support it.
YookaBaybee24@reddit
Strix Halo is shared system memory with dynamic allocation & not Apple-style unified memory. GPU still sits on LPDDR via separate controller + PCIe-like coherency layer & not a fully on-package single-pool architecture with fixed high-bandwidth interconnect.
Apple’s unified design = one physical memory pool, CPU/GPU/NPU equal access + very high bandwidth (800GB/s+) + zero VRAM split + zero copy paths.
Vs
Strix Halo = flexible VRAM carve-out over system RAM. Close in concept but different in implementation and bandwidth class.
ThatOnePerson@reddit
More than 96GB on Linux
tshawkins@reddit
I thought it was a hardware thing?
waitmarks@reddit
It seems to be a kernel settings thing. On Linux, you can set some kernel boot parameters to allow it to allocate as much memory for the GPU as you want. I have one for local LLM use, and I have set mine up so that it reserves just 5 GB for the CPU, so I can load models of up to 120GB (including KV cache) into the GPU.
The difference between Linux and Windows just seems to be that they chose different default maximums for GPU memory allocation.
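For illustration, a sketch of what such a boot-parameter setup might look like. The parameter names and units here (`amdgpu.gttsize` in MiB, `ttm.pages_limit` in 4 KiB pages) are assumptions to verify against your kernel's documentation, not a tested recipe:

```shell
# /etc/default/grub -- illustrative only: let the iGPU map ~120 GiB of system RAM
# amdgpu.gttsize is assumed to be in MiB; ttm.pages_limit in 4 KiB pages,
# so 122880 MiB and 31457280 pages both correspond to 120 GiB.
GRUB_CMDLINE_LINUX="amdgpu.gttsize=122880 ttm.pages_limit=31457280"
# then regenerate the grub config and reboot, e.g.:
#   sudo update-grub && sudo reboot
```

The exact knobs vary by kernel version and distro, which is part of why Windows and Linux end up with different effective maximums out of the box.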
LoonSecIO@reddit
It’s a Windows/BIOS thing, and even when present, too many applications are written to expect separate pools, so they copy data, which slows it down.
Since all silicon Macs have unified memory developers can be 100% sure it works and is there.
righN@reddit
You mean the Windows kernel? BIOS and OS have nothing to do with each other.
LoonSecIO@reddit
The OS and the applications that run on it. What percentage of devices have unified memory? Like under 3%, so why would any developer spend time optimizing memory management for something that only benefits a handful of users of what amounts to a boutique APU?
Windows will probably "force" this into their next update but it might be a breaking change for a lot of software.
Mind you, this is one of the reasons why Bazzite and other Linux gaming distributions can snag a 5% bonus over Windows, especially in RAM-restricted environments like the Z1 and Z2.
_unfortuN8@reddit
Maybe under 3% of the total PC market, but I'm sure that number is higher in market sectors that have a use for unified memory. Also, this is a chicken and egg problem. You can't expect a large market share if next to no one is making these systems.
IMO unified memory is the future of computing for many/most use cases.
KeyboardGunner@reddit
OP wants 256GB memory minimum, and it looks like Strix Halo tops out at 128GB.
Hour_Firefighter_707@reddit
Also much slower. Sure, the 512GB variant of the M3 Ultra Mac Studio was 4x the price of a 128GB Strix Halo system, but it came with 4x the memory and was way faster in terms of raw performance.
I think what he wants is a real big, beefy SoC. Strix Halo and Nvidia GB10 are great, but they are only 256-bit and small.
M series Ultras are 1024-bit and an M5 Ultra should have 1.2TB/s of memory bandwidth
cocowaterpinejuice@reddit
In practical terms what exactly does this give you?
frankchn@reddit
Relatively fast token generation for LLMs. That is usually memory bandwidth bound and the large models are pushing 500GB with large context windows.
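A rough way to see why bandwidth dominates decode speed: each generated token has to stream the active weights through memory once, so bandwidth divided by model size gives an upper bound on tokens per second. The numbers below are illustrative, not benchmarks:

```python
def decode_tokens_per_sec_upper_bound(bandwidth_gb_s: float,
                                      active_weights_gb: float) -> float:
    """Ceiling on token generation: every token reads all active weights once."""
    return bandwidth_gb_s / active_weights_gb

# A hypothetical model with 100 GB of active weights:
print(decode_tokens_per_sec_upper_bound(819, 100))  # ~8.2 tok/s (M3 Ultra class)
print(decode_tokens_per_sec_upper_bound(256, 100))  # ~2.6 tok/s (256-bit LPDDR5X class)
```

Real throughput sits below this bound (compute, KV-cache reads, batching all matter), but it explains why 500GB-class models want both huge capacity and huge bandwidth.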
KeyboardGunner@reddit
Damn what a beast.
DerpSenpai@reddit
No one is doing 128GB other than Apple
R-ten-K@reddit
Intel and AMD have desktop SKUs that can do 256GB w 4 DIMMs.
DerpSenpai@reddit
Yes, try to run an LLM on it now
the__storm@reddit
Threadripper (which is sort of "prosumer"/workstation) will do 1TB as well. It's slower than Apple's on-package memory of course.
(Threadripper Pro will do 2TB, but I wouldn't consider that a consumer platform.)
Quealdlor@reddit
Wrong. You meant 256 GB. There are Nvidia N1 and AMD Strix Halo products with 128 GB.
DerpSenpai@reddit
Yeah, I meant above that. Fixed.
tarmacjd@reddit
Unless the memory is on the chip, no. Nothing comes close to that bandwidth
Open-Dragonfruit-007@reddit
Strix halo doesn't have 512GB capacity or the 800+ GB/s bandwidth. I would say strix halo is more like a more powerful Mac Mini.
NVIDIA and AMD won't give these platforms such performance or size either since it will eat into their sales of GPUs
fathed@reddit
That requires a reboot to change the memory allocation, apple silicon does not.
It's also fixed sizes, so 96, 64, 32, etc. Apple's is just whatever it needs at the time, no need to pre-adjust.
simo402@reddit
Dont you have to set the vram in bios on that?
kyralfie@reddit
Yep. That one and Nvidia GB10 are Apple M Pro competitors. Nothing against M Max yet.
Dransel@reddit
GB10 is ARM, not x86.
ImSpartacus811@reddit
So is the silicon running Mac studio.
Dransel@reddit
Read the title of this post.
martsand@reddit
Chiming in
The title talks about x86 competition to apple's unified memory
ARM is not x86, so I'm not sure what the GB10 example is for
Chipaton@reddit
Some people are reading it as "competition to x86" and others are reading it as "competitors that are x86." That's the confusion.
kyralfie@reddit
Of course it is.
DerpSenpai@reddit
Nvidia will end up selling their X090 mobile and X070 mobile dies with LPDDR6 mem controllers which Intel will glue with NVLink. And ofc, they will have their own as well
airmantharp@reddit
Generally the gains aren’t there for the target market.
And the costs are… obscene, until you get to Apple levels of scale.
Chicken and egg, basically.
The thing to consider is that Apple does what they do for power efficiency. Every SoC they build is meant to be used in laptops.
This has advantages for compact desktop / workstation applications, and disadvantages in terms of zero primary expansion or modularity. Hell, they have zero configurability after the silicon is packaged.
Stingray88@reddit
Well, not entirely zero expansion…
Hopefully in the future it’ll support more than just compute.
AHrubik@reddit
Just to add some perspective the M1 acknowledged PCI-E over thunderbolt but there was zero support within the OS for anything. That should tell you exactly how much Apple doesn’t want this to happen.
reddit_equals_censor@reddit
this is 100% artificial.
apple could have used camm memory modules. socamm2 for example.
they decided just as they did before to solder and glue everything together to artificially prevent serviceability and upgradability.
Dependent-Zebra-4357@reddit
Apple needed a solution years ago, socamm2 wasn’t a thing at the time. Is it even shipping yet?
reddit_equals_censor@reddit
socamm2 is shipping in nvidia servers yes.
hell nvidia and partners didn't wanna wait for the official spec, so socamm (not 2) also got shipped in servers apparently.
and you are looking at it from a wrong point of view.
socamm2 exists, because nvidia and partners wanted serviceable and upgradable camm-type memory.
so they stomped a standard out of the ground.
if apple and the industry wanted a socamm2 standard, they could have shipped it when they would have needed/wanted it.
as a reminder lpcamm2 and camm2 are about 2 years old now based on the jedec pdf even.
yet even today you can barely get any shitty lpcamm2 devices.
so the industry deliberately delayed adoption even of the shitty lpcamm2 standard, because they LOVE LOVE LOVE selling soldered-on insults, that you can't ever upgrade or service.
__
IF apple wanted to or WAS FORCED to use memory modules, then they with the industry would have stomped out a new standard of memory modules, that works perfectly with the laptops and mini pcs, that they are selling and it would have been about as fast as socamm2 took to get stomped out of the ground and get adopted.
apple is not a lil company like framework, who have to deal with whatever garbage the industry throws out.
apple can do almost whatever they want. this is also why apple never had garbage 16:9 panels in their laptops, but only 16:10 panels. other companies were forced to use whatever insult of ill-fitting 16:9 panels the panel industry shat out, which is why older laptops all had giant bottom bezels. the panels LITERALLY didn't fit into the laptops, but the panel industry didn't give a shit.
but apple tells the panel industry what they want and get their own custom panels.
while memory isn't laptop panels of course, it shows, that NO apple is not forced to do whatever the industry tells them to do.
apple alone could have made socamm2 happen over 2 years ago. they could have stomped it out of the ground.
apple could have done so with the apple m1 launch.
it would have not been a problem for apple to do this.
apple with jedec could have made a new standard and forced it onto the market in a good way.
they CHOSE to not do this, because they don't want people to have up-gradable or serviceable memory.
so please think more holistically about these things.
"we couldn't have possibly used the standard, that we refused to create" is not an excuse for apple.
it instead further exposes their evil.
Dependent-Zebra-4357@reddit
Resources/talent aren’t unlimited. They almost certainly had all relevant people working on M series and didn’t have the engineering bandwidth to design a brand new expandable memory system in addition to a new line of desktop processors.
reddit_equals_censor@reddit
again the wrong way to think about it.
your cpu design engineers are not gonna design or think about memory module designs.
as apple you work with the industry to design a solution for you, that will also become a mainstream standard.
as a reminder, the MUCH MUCH smaller dell company was able to throw together camm memory modules for their laptops, which then got turned into lpcamm2 and camm2 as a full standard.
now socamm2 is a much better standard, BUT this disproves your idea, that they wouldn't have had the resources.
dell did it. at a fraction of the resources.
and dell has a fraction of the power in the industry.
so without question and without taking away any resources from the m-series processor development, they could have made/forced the industry to create a socamm2 style memory standard.
__
it is also important to remember, that soldered on memory should have never EVER been an option in the first place.
that shouldn't have even been a thought in their minds, because of course memory MUST be serviceable and upgradable, and be enforced by legislation to stay this way of course as well.
and of course it also MUST be ecc memory, because not having ecc is insanity.
so again there wouldn't have been an issue for apple to use memory modules. the company that serialized hinge angle sensors in laptops DID NOT WANT them to exist.
Dependent-Zebra-4357@reddit
Again, engineering resources are not unlimited. There are lots of other parts that need to be designed by the engineers not working directly on M/A chips. Apple builds more of their own tech stack than pretty much any other company, but they can’t design/engineer every component in every product they make.
reddit_equals_censor@reddit
again nonsense + work with the industry to design it as a standard.
if you want to talk about terribly wasted engineering resources, then how about looking at the serialization of most parts in apple hardware to prevent any servicing or repair, even if you could physically replace the part.
Dependent-Zebra-4357@reddit
You’re right, Apple has unlimited resources and the fact that they don’t design every single individual component in their machines is a failing of leadership, engineering and design teams, and basically everyone at Apple.
And this is a completely reasonable standard that all other tech companies are held to.
/s
nisaaru@reddit
SOCAMM2 looks like it requires more PCB real estate than direct PCB placement though, and I consider this only another stopgap solution until memory interface performance can't afford any "interposers" anymore and it's placed as close to the CPU die as possible.
P.S. I agree about your general stance about Apple's money grabbing tactics. Though the real problem for me in their design is their SSD and not the memory design.
reddit_equals_censor@reddit
lil part 2:
pcb real estate doesn't matter.
laptops have more than enough space, and what actually would matter is the height that they rise from the pcb, and socamm2 and other camm designs are designed to minimize this to the point of it not being an issue. (so-dimm is vastly taller)
so yeah no issue space wise. 4 socamm2 modules would take up quite some space though of course, but nothing you can't easily design around. and 2 or 1 are very little space.
if you look at the socamm2 modules. they are basically just 4 memory dies. some chips and caps EXTREMELY tightly put together and 3 screw holes.
___
and worth mentioning also, that if a company were required to actually warranty products, that not having parts soldered on would vastly reduce risks and costs rma/warranty wise.
framework can send you a memory module if it breaks; apple THEORETICALLY would be required to replace the motherboard or use a technician to replace the faulty memory modules.
this doesn't apply to apple of course, because they do not stand behind their products and are NOT forced to do so either. fighting tooth and nail against servicing their faulty designs in class action lawsuits.
and lying to customers massively. hell, they tell you the motherboard needs to get replaced for a bent pin....
again in a sane world it would be cheaper for companies to have memory modules as well, but apple is allowed to just abuse customers and other companies do as well.
important to remember how screwed up things are.
apple should have been destroyed or almost destroyed by the butterfly keyboards, which also fail within months. of course instead it increased their profits, with lots of people buying new laptops when the keyboards failed and class action lawsuits being pennies for them overall.
___
and the part, where your data can get nuked, because the motherboard doesn't turn on anymore and they removed the lifesaver port, should also have DESTROYED apple. they should have been gone for daring to do that.
burning user data at an unbelievable scale.
everything is so insane. having to repair the motherboard to try to get it to boot again, so that you can get the data off the soldered-on ssd, is so dystopian it is truly hard to grasp.
nisaaru@reddit
I suffered through the butterfly keyboard screwjob so I know that pain.
reddit_equals_censor@reddit
we don't know how far memory modules can scale and IF soldered memory would ever be needed to scale performance further for cpus and apus.
if we just do rough math. a single socamm2 module at 128 bit and 9600 mt/s is 153.6 GB/s.
let's say we wanna have the simplest possible traces, so just one module on each side of the apu. so not the mega apus. but mainstream designs right.
300 GB/s then, but with lpddr6 we should see a doubling again. so 600 GB/s.
would 600 GB/s be a major limitation for apus going forward?
that's about the bandwidth of a 9070 xt if you're wondering (644.6 GB/s)
and with 4 socamm2 modules you get 1.2 TB/s with lpddr6/x
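The rough math in that comment can be sketched in one small function. The 19200 MT/s figure for LPDDR6 is an assumption taken from the comment's "doubling" claim, not a published spec:

```python
# Bandwidth of N 128-bit SOCAMM2-style modules at a given transfer rate.
def socamm2_bw_gb_s(modules: int, rate_mt_s: int, bus_bits: int = 128) -> float:
    """Aggregate peak bandwidth in GB/s across identical modules."""
    return modules * (bus_bits / 8) * rate_mt_s / 1000

print(socamm2_bw_gb_s(1, 9600))   # 153.6 GB/s per LPDDR5X module
print(socamm2_bw_gb_s(2, 9600))   # 307.2 GB/s, the "300 GB/s" figure
print(socamm2_bw_gb_s(2, 19200))  # ~614 GB/s if LPDDR6 doubles the rate
print(socamm2_bw_gb_s(4, 19200))  # ~1.2 TB/s with four modules
```

So the comment's figures (150 GB/s per module, ~300, ~600, ~1.2 TB/s) are internally consistent under those assumptions.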
i really don't see memory modules becoming a problem for a long time for consumer focused cpus and apus.
this of course assumes, that the industry wants memory modules to still get used.
and THAT is the real issue. apple showing people the middle finger.
and laptops and tons of companies love soldered on memory.
__
maybe in the far future we will actually need hbm-style memory planted directly next to the apu, above or below it, or behind it on the other side of the apu,
but we are far away from that right now.
and needless to say, if that time comes, the memory of course needs to be ecc even more crucially than it already is.
it is so crazy, that those criminal companies can sell broken memory and dare to solder it on as well...
GHz-Man@reddit
The "Ultra" is only in the Mac Studio, and outperforms the Core Ultra 9 285K and Ryzen 9 9950X.
Very much not a laptop chip.
empty_branch437@reddit
Chicken and egg was solved, so you'll have to find a different analogy.
empty_branch437@reddit
The answer is egg came first from something that wasn't a chicken.
Plank_With_A_Nail_In@reddit
Nature doesn't care about the labels humans put on things or the categories they are put in, none of it is real.
Prince_Uncharming@reddit
Nature does actually care, otherwise a shark and a salmon could make a baby
nicuramar@reddit
And also that species boundaries are not exact at all.
Plank_With_A_Nail_In@reddit
There is no such thing as species, it's just a grouping humans made up to make talking to each other about stuff easier.
Hour_Firefighter_707@reddit
I would argue Nvidia would have greater scale than Apple. Apple sells every Ultra chip they can make and Nvidia isn't exactly struggling to shift RTX Pro 6000s either.
Technically speaking, the Ultras are desktop exclusive
darknecross@reddit
Apple’s contracts with suppliers include the iPhone chips as well.
windozeFanboi@reddit
I don't understand the "cost" complaint.
A single Nvidia 4090 chip is a massive die on a big GPU motherboard and cooling system.
Surely, they can compete on cost with that if they sell a 4080-class GPU and a 12C CPU with a 64GB/128GB unified config.
It's not about cost, it's about the big ass Question ... Will it sell? ⁉️
It's only ever started to make sense in the X86 market to make such a chip with the modern AI boom. Nothing else.
airmantharp@reddit
Cost goes up dramatically if the sales aren’t there. And since there are already cheaper effective options for x86, the sales are unlikely to materialize - and companies know this, which is why they’re not building this class of products.
_MAYniYAK@reddit
Yeah any of the PCs that say AMD Ryzen Ai max and similar names are unified memory systems.
Tons of the high end handhelds from ayaneo and gdp win are unified memory too
Hour_Firefighter_707@reddit
Technically speaking they are only shared memory systems. None of the x86 systems has a true unified memory architecture. I believe even the GB10 isn’t unified. Correct me if I’m wrong.
waitmarks@reddit
Apple's unified memory is mostly a marketing term. The big innovation they did was this: previously, systems with shared memory would allocate X amount for the GPU statically and leave the rest for the CPU. Apple made it dynamic so that the split would grow/shrink as needed. This allowed the GPU to use as much as it needed for any reason, which is really powerful for some applications. The other thing they did was to choose the fastest RAM available and put it on package for lower latency.
All of AMD's newer shared memory systems can now do the dynamic RAM allocation same as apple. Their Strix Halo line has similar speed RAM to Apple's systems, but it's not on package so the latency is a bit worse.
So it's a bit more complicated than does it have true unified memory or not because apple seems to only refer to the dynamic memory allocation as unified memory, but again it's a marketing term so it's kind of unclear.
dsoshahine@reddit
They could do this for a long time, at least on the graphics side. The default setting of 512MB in BIOS (which gets read out by tools like Afterburner) isn't relevant in games, where the iGPU takes as much memory as it can allocate.
9Blu@reddit
The other big difference, and this is important for AI use cases, is that both the CPU and GPU have access to the same in-memory data at the same time, so there is no need to copy it back and forth. This is not the case with shared memory x86/x64 where those memory areas are dedicated to either the CPU or iGPU and any data that needs to go from the CPU to the GPU side needs to be copied across.
EmergencyCucumber905@reddit
That's not true. On Strix Halo and Intel iGPUs and the Nvidia SoCs there is no copying involved.
You can allocate GPU-only memory which may have performance benefits because it doesn't need to be coherent.
All of these systems let you have a single pointer accessible by both the CPU and GPU.
5YNT4X_ERR0R@reddit
There’s still a distinction between unified and shared memory. Unified memory allows CPU and GPU to access the same pointers in memory, alleviating the need for memcpy between them. Shared memory systems have dedicated (but separate) partitions of memory for CPU and GPU on the same physical memory, however for the CPU to access data stored on the GPU side, the data still needs to be copied over.
Apple is not the first to unified memory, though. Consoles such as the PS5 also use unified memory, where the CPU and GPU have access to the same pool of GDDR memory without copies. Apple's innovation is in terms of scale: M3 Ultra's unified memory, addressed by a 1024-bit memory bus, is not seen elsewhere.
noiserr@reddit
It's absolutely unified memory. So is MI300A which AMD had earlier but for a different market.
Alarming-Elevator382@reddit
The consoles like the PlayStation 5 do but obviously that is a very different use case.
Hour_Firefighter_707@reddit
Yep. I meant "computers", obviously. How badly would a console SoC do in a general purpose computer, BTW?
Say I used it as I do my laptop. Browse the web, edit photos and videos, do other "normal" stuff. Maybe run some LLMs. How much would the GDDR memory affect these kinds of tasks?
greggm2000@reddit
We might be finding out with the upcoming Xbox Helix, if the rumors about it are true, that there’ll be a mode where you can run Windows on it.
Alarming-Elevator382@reddit
There are binned versions of the PS5 APU available for sale, they are functional but I think the CPU/GPU is partially disabled or something. My recollection is that they are "okay", but keep in mind that they're Zen 2 CPUs and you only have 16GB of RAM total. I think Digital Foundry did a video on them before too.
https://www.tomshardware.com/pc-components/gpus/usd100-steam-machine-uses-a-cut-down-ps5-apu-with-bazzite-diy-console-offers-60-fps-at-1080p-with-16gb-of-gddr6
_MAYniYAK@reddit
It is unified in that the CPU and GPU both use it, and it can scale quite high even for LLMs.
As for the term 'shared', we've been doing that for years with onboard video from things like Intel HD Graphics, and that has a lower scaling point.
AMD also calls it unified:
https://www.amd.com/en/developer/resources/technical-articles/2025/amd-ryzen-ai-max-395--a-leap-forward-in-generative-ai-performanc.html
That said, the ones from Apple typically scale better on the GPU with unified memory than the ones from AMD.
Paed0philic_Jyu@reddit
"Unified" memory typically refers to cache-coherent access between host(CPU) and device(GPU) memory along the lines of the CXL 2 protocol.
I haven't seen evidence that Apple silicon has cache-coherent memory.
In fact, some side-channel attacks that exploit the difference in size between the L2 and SLC suggest that Apple does not employ cache-coherency.
See secs. 3.4 and 3.5 of this paper.
EmergencyCucumber905@reddit
Furthermore when writing Metal code you need to flush the cache to guarantee your GPU writes are visible to the CPU. Which is not a big deal in practice.
RECAR77@reddit
Dont they still use normal/"slow" ddr modules?
_MAYniYAK@reddit
Not the typical desktop style ones no, soldered memory only
RECAR77@reddit
Yes, no dimms obviously but the same modules you would find on dimms
a5ehren@reddit
No one does the giant bus with on-package memory. It’s more about bus width than module type.
RECAR77@reddit
What about Sapphire Rapids HBM? Obviously not unified cause no igpu and not what OP wants but 64GB at 300GB/s seems fairly giant for x86 standards?
Maleficent_Celery_55@reddit
They use 8000 MT/s LPDDR5X afaik.
noiserr@reddit
AMD has mi300A and Strix Halo.
Touma_Kazusa@reddit
AMD has strix halo, Nvidia has N1X readily available, they’re not scaling it up because they want you to buy their enterprise cards :)
It would be possible to have a 512 bit bus and dual rank 32gbit lpddr5 but why would they do that when they sell enterprise cards at 50k a pop?
tshawkins@reddit
No retail user is going to pay 50k for a system.
EmekaEgbukaPukaNacua@reddit
“Good. Then either rent time from someone who will pay $50k a system or buy whatever product you wanted to do from someone who has one” - Ceos
TrippleDamage@reddit
And they're not wrong.
darknecross@reddit
You aren’t in a PC gaming subreddit.
Kryohi@reddit
And how many of those who can't afford a 50k system can afford and will happily buy a 6-7k system instead?
Not many, the market for that is new and still fairly small.
Touma_Kazusa@reddit
You are not their target audience, you do not cannibalise your most profitable segment for consumers who complain that everything is too expensive
zer04ll@reddit
That is what my ROG Ally does, AMD is going this route, and it's why handhelds have been using it for gaming. Right now I have it set to give 6 gigs of RAM for video and 10 to the OS, since all I do is game and don't have things running in the background like browser tabs
reddit_equals_censor@reddit
as a reminder, the shitty mac studio sucks completely at gaming for a start.
it has soldered in non ecc memory. it charges people out of their ass for a proper amount of memory.
if you want a non evil version of this on the desktop, it would be a powerful apu, that uses 4 socamm2 modules.
strix halo straight up refuses to give you memory module options.
so how shit do you think your theoretical would be?
are you excited to throw out your expensive system because one memory chip fails, but it doesn't have ecc and it isn't memory modules, so please accept the planned obsolescence the apple way and buy another...
how fun.
i personally like how the company decides everything then and overcharges you 4x for it.
you want a mac studio with 512 GB of memory? HA not anymore, because apple removed the option.
so your options are to NOT buy it, instead of having memory modules, that you could upgrade later.
so again for your request, it is absolutely essential to demand the use of memory modules with it.
again the industry does not want people to be in control of their memory.
____
and what is the biggest reason why you want it on the desktop?
oh more memory...
well let's look at it then.
so right now amd and nvidia are charging people out of their ass for the bare amount of memory you need to game right now on graphics cards. so 16 GB vram.
but hang on those are 16 GB vram on a 256 bit bus.
why aren't we seeing 32 GB 9070/xt cards and 5070 ti cards with 48 GB memory?
in case you don't know to get 32 GB 9070/xt cards you clam shell the gddr6. to get 48 GB 5070 ti cards you clam shell memory and use 3 GB memory modules.
so those cards can be made TODAY with the exact same gpus and they could have been made at launch before the memory apocalypse.
but they weren't. as you of course know demand would be MASSIVE. so why weren't they made? and why were graphics card making partners (msi, asus, xfx, etc... ) FORBIDDEN from making those versions?
because nvidia and amd want to massively overcharge people to get a working amount of vram.
so if you in your idea are thinking of a soldered on 256 GB/512 GB amd or nvidia apu, then why would you be able to afford it?
why not charge 10x more for the 512 GB version MINIMUM?
why not 100x ? you can't upgrade it, it is soldered on, so screw us.
why not only offer 64 GB version to plebs for a lot of money and then the "pro" version just like with graphics cards to get enough memory will cost you 10-100x.
___
so again the current situation of not having enough vram for one's work is artificial. it is the industry preventing those products from existing, which btw would have been dirt cheap. 2 years ago 8 GB of gddr6 spot price was 18 us dollars. so a 32 GB 9070/xt would have cost just 36 us dollars more.... so of course everyone would have gotten it... if the option was available and amd wouldn't have charged the absolute shit out of it.
so your theoretical apu MUST MUST MUST use memory modules. and socamm2 modules are the best choice.
and if you don't know the math, each socamm2 module, which can right now be up to 256 GB at 9600 mt/s, has a bandwidth of 153.6 GB/s and a 128 bit memory bus.
4 of those = 512 bit memory bus. so a maximum of 1 TB of memory and 614.4 GB/s memory speed.
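That arithmetic checks out. A quick sketch of the peak-bandwidth math (theoretical peak, decimal GB/s):

```python
def peak_gbs(mt_per_s: int, bus_bits: int) -> float:
    """Peak bandwidth in GB/s: transfers per second times bytes per transfer."""
    return mt_per_s * (bus_bits // 8) / 1000

per_module = peak_gbs(9600, 128)  # one SOCAMM2 module: 9600 MT/s on a 128-bit bus
total = 4 * per_module            # four modules side by side = 512-bit aggregate
print(per_module, total)          # 153.6 614.4
```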
and lpddr6 could be coming soon as well of course.
so again you want apus to be the future for big memory pools in a unified apu, then the ONLY option to actually get it is memory modules. there is no other way.
and it also needs to be ecc of course as well.
but if what you really want is just enough vram for a start, then force gpu companies to make just memory cost difference clam shell and 3 GB clam shell designs.
YookaBaybee24@reddit
The reason you don’t see x86 systems like the Mac Studio with 256–512GB unified memory at ~800GB/s is mostly physics + packaging, not just vendor greed. Apple’s M3 Ultra uses LPDDR5-class memory placed on-package (SoC + memory on an interposer), giving ~800 GB/s bandwidth and very low latency, but at the cost of soldered non-upgradable memory and very high packaging costs.
On x86, CPUs like the Ryzen 9 7950X and GPUs like the GeForce RTX 3090 are separate dies using DDR5 (~80–120 GB/s system RAM) + GDDR6X (~900 GB/s VRAM) over PCIe, because that modular design scales yield, upgrades and thermals better. What you’re asking for (512 GB unified at >600 GB/s) would require either HBM (like MI300 or Hopper) or massive on-package LPDDR, both of which cost thousands in packaging alone and are currently reserved for datacenter parts.
CAMM2/SO-DIMM approaches can hit ~600 GB/s theoretically (4×128-bit channels), but real-world latency, signal integrity and power over desktop trace lengths make it far harder than on-package memory. Also ECC + modular + ultra-wide bus is a routing nightmare on consumer boards. So yes, vendors segment VRAM and memory capacity for margins, but the bigger blocker is that Apple controls the whole stack (SoC + OS + memory) while x86 has to support interchangeable CPUs, GPUs and DIMMs. Until chiplet GPUs + shared memory fabrics (like AMD’s Infinity Fabric or CXL.mem) mature on desktop, you’ll keep seeing split RAM/VRAM instead of true unified pools at that scale.
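The bandwidth numbers above all fall out of the same formula, data rate times bus width in bytes. A sketch with nominal data rates (ballpark figures, not exact specs):

```python
def peak_gbs(mt_per_s: int, bus_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s = data rate x bus width in bytes."""
    return mt_per_s * bus_bits / 8 / 1000

# nominal data rates, treated here as ballpark assumptions
configs = {
    "DDR5-5600 dual channel (128-bit)":      (5600, 128),
    "RTX 3090 GDDR6X (19500 MT/s, 384-bit)": (19500, 384),
    "M3 Ultra LPDDR5 (6400 MT/s, 1024-bit)": (6400, 1024),
}
for name, (rate, bits) in configs.items():
    print(f"{name}: {peak_gbs(rate, bits):.1f} GB/s")
```

The point: Apple's ~800 GB/s comes from an ordinary LPDDR data rate on an extraordinarily wide bus, which is only practical with the memory on-package.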
reddit_equals_censor@reddit
did you just say, that ecc is a routing nightmare in desktop?
ecc.... which i am running on am4 right now at 3600 mt/s ??
ecc is a routing nightmare??? get out of here with that nonsense. that's the kind of bullshit that intel might shit out when trying to argue for their artificial segmentation in regards to ecc.
wtf 100 euro boards on am4 coming with the ecc traces, because it is so meaninglessly cheap.
and we got lots of bullshit there.
in regards to costs. NO, that is nonsense. a theoretical design would reuse chiplets from dedicated graphics cards and of course core chiplets.
and 4 socamm2 modules would just have 2 socamm2 modules per side. so the issue would be just having 2 modules tracing without issues per side.
claiming, that this would be a massive price issue is just absurd.
what's next? you're gonna tell me that 2 single channel ddr5 sticks can't be next to each other for trace reasons? oh shit they already are.... (yes i know, half the bandwidth)
and guess what, now we aren't packaging the memory together on an interposer, so all that cost is gone... much wow very design win.
__
so again NO, it would theoretically not be a cost issue. using chiplets, reusing core and graphics chiplets and just switching i/o and memory chiplets out from gddr to lpddr for example.
2 socamm2 modules per side, which gives you enough bandwidth and doesn't give manufacturers the option to screw you over directly.
so again theoretically it is possible and most of the things you mentioned are just bullshit apple choices, or aren't actual issues.
just to be clear, i am not asking for mega apus instead of cpu + graphics card, i was merely explaining how it could get done and why memory modules would be ABSOLUTELY ESSENTIAL for it.
YookaBaybee24@reddit
Apple Silicon didn’t win the last five years by being clever with routing. It won by collapsing the entire waste stack that x86 still carries like ballast.
From the M1 in 2020 to the M3 Ultra era, Apple moved CPU, GPU, media engines and unified memory onto one package with a single coherent memory pool. No PCIe hop between CPU, RAM and VRAM. No duplicated textures. No driver-managed copies. The result is not a theoretical bandwidth claim but measurable behavior: 200–800 GB/s-class unified memory access depending on tier, with latency low enough to close the CPU-to-GPU data-movement gap that still defines x86 workstation design.
That collapse shows up in power first. A Mac Studio under full CPU+GPU load typically runs a few hundred watts of total system draw. A comparable x86 tower pairing a Ryzen 9 7950X with a high-end GPU like a GeForce RTX 4090 routinely crosses 600–800 W under mixed compute + graphics workloads. Same class of output, very different thermal reality: Apple is not so much chasing peak clocks as deleting idle power, copy overhead and redundant memory traffic.
Performance per watt is not a silly aesthetic metric. It is rack density, laptop feasibility, fan curves and sustained clocks without thermal collapse. That is why Apple laptops run near desktop-class silicon in 15–30 W envelopes while x86 mobile parts spike and throttle under sustained GPU + CPU concurrency. Less heat = less metal, fewer fans, smaller power supplies & quieter systems. The Mac Studio form factor exists because the silicon does not require industrial airflow to stay stable.
Unified memory is not just capacity but allocation fluidity. A 96–192 GB pool behaves differently from a split 32 GB RAM + 24 GB VRAM system because memory is not pre-partitioned. A VFX scene, ML model or video timeline can expand across the full address space without copy staging or VRAM eviction cycles. That is where the "feels faster" effect comes from in real workloads: not peak throughput, but fewer transfers, fewer stalls and fewer cache-coherence games.
The modular argument survives on paper density but modern workloads do not stay inside neat partitions. The cost is not just silicon or traces but duplication of textures in RAM and VRAM, buffers mirrored across PCIe, driver scheduling overhead and power spent moving data instead of computing it.
ECC, CAMM2, chiplets: all of it already exists in various forms across servers and workstations. The difference is integration level. Apple’s advantage was never that the components were new. It is that the memory hierarchy was flattened and made invisible to software. That is the part x86 still hasn’t standardized at consumer scale: not ECC, not routing, but a single shared high-bandwidth memory pool without bifurcation.
The trade is simple and already proven across five generations of Apple Silicon: less configurability and fewer parts in exchange for higher efficiency, lower latency, smaller systems and higher sustained utilization of every watt consumed.
reddit_equals_censor@reddit
what textures? :D it can't game.
i'd also guess that when it games, it uses fixed "vram" pools for the game, just like the steam deck does despite having a unified memory pool.
devs won't change how a game handles assets for the tiny fraction of apple gamers to make it behave like it does on the ps5, with no asset duplication and true unified use of the memory.
apple "won" by running a software prison system and massively overcharging people for faulty hardware.
apple won through marketing.
the performance of the hardware is fine, the engineering is in the dumpster:
https://www.youtube.com/watch?v=AUaJ8pDlxi8
people DO NOT buy apple devices based on hardware. the hardware just needs to seem good enough.
and what you didn't point out here was actual reasons why apple shouldn't have been FORCED to use memory modules.
i didn't argue against apus with unified memory pools.
nor is that the issue.
but rather, that shity evil apple should be FORCED to use memory modules and be forced to use ecc.
and if you want an example of what should have been apu, but isn't.
the new steam machine should have been custom rdna4 apu with unified 32 GB memory.
instead it is some rx 7600 8 GB insult with an old cpu and 16 GB of system memory.
clearly it should have been a custom apu for that use case specifically. much higher performance at the same production costs (much higher min order quantity though of course)
and despite it handling vram and system memory still split, it would as you said have dynamic sizing of it, which would also have been a major win.
jarblewc@reddit
As others have said, the amd 395 is a great example of this in production on the x86 side and sells in (relatively) high volume. If amd keeps the product going into the next generation (the 495 is just a refresh, not a new chip) then I would expect even more volume.
R-ten-K@reddit
x86 systems had forms of “unified memory” well before M-series silicon, going back over a decade. And if you include the introduction of iGPUs, sharing system memory, the concept has been around even longer.
Kryohi@reddit
I'll add that basically every console in the last 15 years has had a large pool of fast unified memory... It's nothing particularly complex to do, but you need to have volume to justify such a product, and local LLMs, which are probably the first good reason to need this, are still ultimately a niche.
R-ten-K@reddit
The shared memory controller in SoCs is fundamentally an architectural necessity when multiple IP blocks on the same package need coordinated access to off-chip memory. It’s largely independent of volume considerations.
I think people in this thread seem to think of unified memory as more of a "black magic" than it really is. It has been mainstream for ages, since we got iGPUs basically.
LAwLzaWU1A@reddit
iGPUs using RAM and what Apple is doing are not the same thing.
The important distinction is not whether CPU and GPU physically share DRAM chips. PC iGPUs have done that for years as everyone knows. The real difference is in the memory model exposed to software.
On Apple Silicon, CPU and GPU are much closer to sharing the same allocations, address space, and coherence domain as a first-class design point, so APIs like Metal can often treat a buffer as one object both sides use directly. On PCs, even when the iGPU uses system RAM, the graphics APIs and drivers have often historically exposed memory as separate CPU-visible and GPU-optimal domains with more explicit mapping, barriers, flushes, and copy semantics.
The real difference is in my opinion how it is handled by the OS.
Stingray88@reddit
I thought for sure you’re mistaken about that… went to configure a 256GB M3 Ultra… sure enough, it says “ships in 4-5 months”. That’s absolutely wild.
youreblockingmyshot@reddit
They’re ramping up to announce the refresh in June. So there’s way less capacity for the M3. It was the same for the MacBooks before the refresh a few weeks ago.
Internal_Quail3960@reddit
lol there used to be a 512gb model but apple quietly discontinued it
waitmarks@reddit
I was going to say, with the RAM situation, it's really hard to tell if it's demand or if they just can't get enough RAM to make them.
Internal_Quail3960@reddit
I don't think they are struggling to meet demand, I think they are just conserving as much as possible for the upcoming M5 Mac Studio (rumored to come in June).
Sure, Apple could still sell the 512GB model and be fine, but with this RAM crisis they're likely trying to save as much as possible to avoid passing costs on to the consumer.
This is one of the reasons the MacBooks got their storage doubled alongside a $100 increase in price. It was cheaper than the previous year's MacBook with the same storage configuration, and let Apple put that extra money towards the increasing price of RAM.
waitmarks@reddit
They aren’t necessarily interchangeable because, if i remember correctly, the Max chips have 4 RAM chips on package. They could be struggling to source the high capacity ones because that’s whats in demand for AI servers, while they have a lot of low capacity ones.
Stockpiling high capacity doesn’t necessarily help them build lower end versions because they need all 4 chips for bandwidth reasons.
randomkidlol@reddit
unified memory is built for cost savings and to reduce circuit board sizes. its not for performance.
Brisngr368@reddit
It does actually improve performance though, because it removes CPU–GPU copy latency (e.g. MI300A).
randomkidlol@reddit
well if you buy exotic high end memory that can do high bandwidth and low latency like HBM, then sure you can satisfy both CPU and GPU memory demands with a single unified pool. consumer APUs use either GDDR or DDR, which means compromising one in favor of the other.
Slasher1738@reddit
you're just going to ignore Strix Halo?
PMARC14@reddit
Unified memory is just Apple branding; we had that long before on PC. The real thing is memory bandwidth. PCs have kept the larger memory buses to the server tiers because they mostly don't benefit consumers: being memory-bottlenecked is rare for most tasks with a good cache structure. Only with the advent of AI has real demand arrived, so the effort to upscale and match Apple has been slow. I think they also want to just wait for newer memory standards (DDR6) before putting in the effort.
Definitely_Not_Bots@reddit
Unified memory only benefits integrated GPU systems, which is pretty much all Mac has to offer.
For discrete graphics, your RAM is too far away to be useful, and would introduce a significant amount of latency. The memory on the GPU board is both closer and faster than your RAM.
Maybe PC laptops with integrated GPU would benefit, but they already have a flavor of "unified memory" (iGPU use RAM anyway).
"Unified memory" is more marketing than it is tech.
jenny_905@reddit
Because the PC market values upgradeability far more than the Mac market ever has.
malted_rhubarb@reddit
Technically speaking, AMD has had UMA on their APUs since Kaveri. That said Strix Halo might be more fitting for what you're talking about and to this day AMD can't be bothered to actually start gunning for better GPGPU compute on their products.
Dangerman1337@reddit
Medusa Halo, NVL/RZL-AX and whatever Nvidia is doing with Intel with Serpent Lake will be the answer to that.
billm4@reddit
x86 HBM skus exist from intel for the server market
DesperateAdvantage76@reddit
AMD's HSA architecture proposed 15 years ago accomplishes that. Unfortunately it never saw mass adoption, but I'm hoping with the Macs it sees a revival in other CPUs.
Local-Writer703@reddit
Dell Pro Max with GB300, Ubuntu and NVIDIA DGX | Dell USA
Here's the monster for you!
iMrParker@reddit
There are tons of machines pre-dating apple silicon that have unified memory, but never at good speeds or bus width
Alarming-Elevator382@reddit
The consoles did, PS4 had 176GB/s of memory bandwidth via a 256-bit unified memory system in 2013. One X in 2017 had 320GB/s with unified memory via a 384-bit memory interface.
Straight_Loan8271@reddit
All ps/xb consoles since the PS4/Xbone have done this, by using GDDR as their unified memory. Problem being that GDDR performs like ass when used as system memory because the latency on it is terrible
dampflokfreund@reddit
Apparently Nvidia is working on such a chip. N1X it will be called, probably announced at Computex. I think this is a fundamentally better approach than current dram+gddr7 systems. LPDDR6 could reach very high speeds at a wide bus width and has also low latency so the CPU is happy (in contrast to GDDR)
int6@reddit
That wouldn't be x86, though
dampflokfreund@reddit
Oh yeah, I missed that. But I think and hope Nvidia's x86 emulator will be so good it will be unnoticeable.
jeffscience@reddit
What emulator?
tshawkins@reddit
Or if you are on linux just use arm64 compiled binaries.
sylfy@reddit
For the people that are actually in the market for this, I doubt that really matters. No one’s going to be buying this for gaming. They may say it’s not just for AI, but the people forking out cash for it will be primarily AI users, with software that easily cross-compiles to arm64.
Touma_Kazusa@reddit
Is it really a better system? GDDR7 is much better suited to GPUs, with better bandwidth. A laptop 5080 has the same 800 GB/s as the M3 Ultra with 1/4 of the bus width.
dampflokfreund@reddit
It is. Yes you need a higher bus width but 800 GB/s is easily achievable with LPDDR6. Also, the laptop 5080 is heavily limited by 16 GB VRAM, especially for AI but also for games it will be tight once the next gen consoles release with 32 GB unified memory.
Touma_Kazusa@reddit
Let’s say you have LPDDR6 at 10667 MT/s (the initial LPDDR6 data rate): you’d be at around 512 GB/s on a 384-bit bus, which is still a significant deficit to current-gen GDDR7 speeds. I’d much rather have a faster GPU than a choked GPU with more VRAM for gaming.
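For reference, the arithmetic (LPDDR6's launch data rate is 10,667 MT/s; the GDDR7 figure assumes a 28 GT/s, 256-bit laptop part):

```python
# peak GB/s = (MT/s) x (bus bits / 8) / 1000 -- theoretical, decimal units
lpddr6 = 10667 * 384 / 8 / 1000   # LPDDR6 launch rate on a 384-bit bus
gddr7  = 28000 * 256 / 8 / 1000   # assumed 28 GT/s GDDR7 on a 256-bit bus
print(round(lpddr6), round(gddr7))  # 512 896
```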
Hour_Firefighter_707@reddit
A 5080 laptop will probably be fine at 500GB/s. The 5050 laptop and desktop are identical bar the use of G6 on the desktop and G7 on the laptop. At the same power levels, performance is generally very similar.
Even between 5080 laptop and 4090 laptop, performance is similar despite the vast differences in memory bandwidth. 4070 Super and 5070 are very similar also.
Memory bandwidth for Blackwell doesn't really seem to do much. Except of course between the 4060 Ti and 5060 Ti because the former was hugely gimped.
LPDDR5X at 9600 is already being used. The 32-core GPU version of the M5 Max has a 384-bit bus, which gives it 460GB/s of bandwidth. That is absolutely enough at its performance level, which is toe-to-toe with the 672GB/s 5070 Ti laptop in Blender at way lower power.
Realistically speaking, they will be using 11600-12000 LP6 memory by the time they start to ship
marmarama@reddit
For gaming, where the developers can be careful to trim assets to fit within VRAM and be strategic about managing transfers to VRAM, maybe.
For everything else, high-bandwidth unified RAM is better. The moment the GPU starts having to stream data across the slow PCIe bus, a theoretically faster standalone GPU will get spanked by a slower unified-memory GPU. PCIe 6.0 x16 tops out at 128 GB/s per direction, which is several times slower than what some unified-memory GPUs have, and most people don't even have that. Not to mention the latency cost of having to set up the transfer and dump pages out of VRAM to stream new ones in.
This is one of the big issues with running big LLM models on consumer GPUs - as soon as the model exceeds VRAM size, the performance collapses. The same holds true for other RAM-hungry GPU applications.
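A back-of-the-envelope sketch of that collapse, assuming decode speed is bounded by reading the full weights once per token (all figures illustrative, not benchmarks):

```python
def decode_tps(model_gb: float, bandwidth_gbs: float) -> float:
    """Rough upper bound on LLM decode tokens/s when each token
    requires streaming the full weights through the GPU once."""
    return bandwidth_gbs / model_gb

model_gb, vram_gb = 70, 24          # illustrative: model spills well past VRAM
unified_gbs, pcie_gbs = 800, 128    # unified pool vs PCIe 6.0 x16, one direction

print(round(decode_tps(model_gb, unified_gbs), 1))  # ~11.4 tok/s in fast memory
print(round(pcie_gbs / (model_gb - vram_gb), 1))    # ~2.8 tok/s once PCIe-bound
```

Even with ideal overlap, the 46 GB that spills past VRAM must cross PCIe every token, capping throughput far below what fast memory alone would sustain.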
Touma_Kazusa@reddit
You’re looking at a super niche market where:
It’s worse for gaming
It’s worse than enterprise cards that most tech companies use (b300’s/pro 6000’s)
So the target market is very very small, tech companies won’t buy it since devs just use their servers which are much faster, gamers won’t buy it cause it’s worse for gaming, normal people won’t buy it cause it’s too expensive, and the target audience still probably won’t buy it because it’s too expensive
zenithtreader@reddit
N1X is ARM iirc, not x86.
soggybiscuit93@reddit
To add to what others have said, it's hard to make the cost structure work on PC. Memory on package is difficult to make work, because then AMD/Intel have to buy the memory themselves, put it on the package, and then pass those costs on to OEMs. OEMs aren't super thrilled about this, because managing memory capacity and upcharging for it is one of their main ways to make margin. Apple M-series places memory on package.
So without this, memory has to go on the motherboard. And now, it becomes a bit complicated. You have a standard 128b "dual" channel CPU (INB4 I know DDR5 is technically 2x 32b, blah blah). But then as you move up the product stack, you want "four" channel, 256b? How about 8 channel, 512b? Generally these are all gonna either 1) Require different motherboards for each product up the stack, or 2) Require a very expensive, 8 channel motherboard and then only populating / using two of those channels for the cheaper 128b models, raising the price of the low end substantially.
Intel has actually taken the opposite approach in a sense, where with PTL, one of their big accomplishments they bragged about was that their big iGPU model, their standard H model, and their low cost U model could all share the same (128b) motherboards, and they compensated for the low bandwidth to iGPU ratio with a lot of cache.
AMD has Strix Halo, a large iGPU with 256b "unified" memory, and its cost to OEMs is one of the reasons it's not seen a lot of adoption. You need a special motherboard, it has a huge iGPU (die-size-wise) making it expensive, and many OEMs just choose to offer a dGPU at that price instead.
It wasn't until very recently, with local AI models, that there has really been much interest in these huge APU SoCs on PC, and that market alone may not be big enough to justify them.
NerdProcrastinating@reddit
They both have the technical capability to build it. The problem is Intel management & AMD are still scarred from their near death experience.
That's why they're not leading in these areas. They need big OEMs & paid market research reports to tell them the obvious state of the market months to years after the opportunity becomes available.
max1001@reddit
Because they will 100 percent not fly off the shelves.
waitmarks@reddit
Something non-technical that I think is missing from this thread is that Apple sells and makes money from the computer as a whole, where Intel and AMD need to make money selling just the CPU.
Intel tried on-package RAM with Lunar Lake, but it didn't sell well because system integrators have their own RAM suppliers and contracts. With Lunar Lake, Intel was forcing them to buy the CPU and memory together at Intel's rates. This basically made Lunar Lake not very profitable for system integrators to sell, because RAM cost them more than it would have sourced separately.
So if by unified memory you mean on package memory, I don't think we will see it from x86 again. However, unified memory is kind of a marketing term from apple and it really just seems to mean dynamically allocating the memory split between the CPU and GPU. Which all the newer x86 CPUs with IGPUs can do now. So, if you use Apple's definition of what unified memory is, they all do now.
Polar_Banny@reddit
They can. Look at AMD: if I'm not mistaken they first launched something similar for the PS4, and AMD was also first to launch GPUs with HBM memory. But guess what, AMD doesn't want your money, nor is Intel keen to launch future SoCs with memory on package. When I did my own research on this, the biggest issue was economics: this fake duopoly sells only hardware, while Apple sells similar hardware locked down to its services and fees. If you look for something similar to a Mac Studio, there's AMD Ryzen Threadripper PRO or Intel Xeon 696X, which go up to 12 memory channels.
Also, Apple has succeeded economically because they can lock down the macOS kernel, while Windows devices can't because of security concerns. So it's not a hardware issue but a problem of ecosystem, regulation, security and Windows itself.
windozeFanboi@reddit
Vertical integration for apple, that's it.
Honestly, the biggest incentive for doing the same on x86 has been the AI boom. Nothing else. Nothing else actually favoured Apple's approach enough to push for it in companies' minds before.
Strix Halo was a preview, but AI is here to stay and Intel/AMD now know that, so I expect to see similar 512-bit-wide unified memory chips in the future. Perhaps not next year, but in 2027 I want to believe Intel/AMD will have mainstream 192/384-bit LPDDR6 devices widely available.
I just hope the RAM cartel and agent orange 🍊 cartel will take a chill pill.
But the future isn't something to be counted on. China might as well invade Taiwan the day the blueprint for the 512-bit AMD Max chip is sent. Who knows. I don't. I'm just depressed when I open the telly.
ham_bulu@reddit
#1
Because no one can replicate Apple's economy of scale: they are producing shitloads of iPhone SOCs, Mac SOCs are just a rounding error, but benefit from shared design, lithography and production resources.
#2
Because nobody can replicate Apple's vertical hardware and software integration and shared frameworks between their 5+ OS flavours.
#3
Nobody has the long-term vision and experience: Apple Silicon goes back to the A4 in 2010. Since then they're iterating without any major strategic shifts.
I'm sure there's more.
I'm not sure how anyone is ever going to match this without any major technological or supply-chain shifts ...
anjumkaiser@reddit
Unified memory means sharing the same memory between the CPU, the GPU and whatever else can access memory. It is hardware connectivity first and software synchronisation afterwards. Unfortunately, in the PC world we have multiple vendors who are hell-bent on having their own memory silos rather than working together to use an existing memory pool. Intel already does this partially with their CPU and iGPU: they share a small region of system RAM for the iGPU. Other vendors have technical as well as political reasons. iGPU drivers already do some of this elegant dance of configuring and controlling the system RAM area for the GPU, and Windows has to perform memory transfers from the system area to the iGPU area. For system-wide unified memory they will have to move beyond PCIe and come up with something else, or a newer version of the PCIe standard that addresses these needs specifically. But that will only work on devices that target compliance with that standard.
Apple has the unfair advantage of controlling the design of all of their CPUs, GPUs and interconnects. PC part vendors have to compromise and agree on a least-common-denominator standard, so progress takes multiple years.
ldn-ldn@reddit
Regular DDR and GDDR are optimised for different scenarios. DDR memory is optimised for low latency and super fast random access, while GDDR is optimised for massive sequential data transfer speeds. Unified approach puts you somewhere in between. GPU and GPGPU performance on M chips is meh. The same is true for Strix Halo. And the only reason latency is competitive on M chips is because memory is baked in, but we like to upgrade our memory over time.
All in all it's not a good idea.
No-Improvement-8316@reddit
Google AMD Strix Halo
uniquelyavailable@reddit
From my experience it's possible on windows. It's also possible on Linux but I had to override my Nvidia drivers using special code. Shared memory is a wonderful thing, I would include the Nvidia Spark in the example.
From a hardware perspective most x86 machines are a random assortment of hardware, so there is some certainty to developing and optimizing software on a platform like Apple or Spark. That gain isn't prevalent on x86 architecture so perhaps a reason why the adoption is lower.
Jumpy-Dinner-5001@reddit
Mostly cost and "skill" and software support.
Apple's approach is horrendously expensive to make and a very advanced design. That's not something you can just come up with.
Software support is also crucial, which Windows doesn't really do well.