What would it actually take to build a modular, upgradeable GPU: packaged chiplet modules, swappable VRAM, standardized base board?
Posted by Goldenskyofficial@reddit | hardware | 48 comments
I've been going down a rabbit hole thinking about GPU modularity and eWaste, and I want to pressure-test the idea with people who know this stuff better than me.
The concept: instead of buying an entire graphics card every generation, you buy a standardized PCB base (power delivery, PCIe interface, display outputs) and a sealed compute module (think Jensen's on-stage chip samples, a packaged die with HBM inside, exposing a standardized connector on the outside). When a new generation drops, you swap the module. Optionally slot in additional VRAM on the base board for expandability.
I'm aware of the obvious objections:
- High-speed interconnects across a physical join are hell for signal integrity
- Contact resistance at high pin density is a real problem
- Bandwidth tradeoff between in-package memory and external VRAM
But I'm specifically not talking about raw die swapping or wireless data transfer. The magnet/latch mechanism would be purely mechanical. The electrical path is physical contact pads, closer in concept to a ZIF socket or LGA than anything exotic.
UCIe and chiplet architectures are already moving in this direction at the packaging level. The question is whether a user-serviceable version is physically plausible with current or near-future interconnect technology, and whether the performance tradeoff is acceptable for a product targeting repairability and longevity over raw benchmarks.
What are the actual hard limits here? Where does this idea break down that I haven't considered?
CommanderArcher@reddit
Signal integrity, cooling, socket updates, practicality
The GPU is already the upgradable part of the computer. The rest of the components on a GPU are so cheap compared to the chip and VRAM that this concept doesn't really matter.
Socketable VRAM would be neat, but that's also a sales decision: giving you the freedom to choose how much VRAM you have also means giving you the choice not to buy their higher-spec models for more VRAM.
Even if this is all technically possible, I don't think you're getting around that part.
NervusBelli@reddit
Not only that - socketable chips would most likely hurt latency; soldered chips allow for shorter traces
Exist50@reddit
Physical distance is basically irrelevant for the latencies we're talking about. You're looking at maybe 1ns when GPU mem latency is closer to 200ns. And GPUs aren't particularly latency sensitive to begin with.
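A back-of-the-envelope sketch of that claim (my own assumed numbers: ~10 cm of extra routing through a socket, and the ~200 ns ballpark above):

```python
# Signal propagation in FR-4 PCB is roughly half the speed of light,
# so even a generous socket detour adds well under a nanosecond.
C = 3e8                # speed of light, m/s
v_pcb = 0.5 * C        # typical propagation speed in FR-4 traces

extra_trace_m = 0.10   # assumption: a socket adds ~10 cm of routing
extra_delay_ns = extra_trace_m / v_pcb * 1e9

mem_latency_ns = 200   # ballpark GPU memory latency from the comment
print(f"extra delay: {extra_delay_ns:.2f} ns "
      f"({extra_delay_ns / mem_latency_ns:.2%} of total latency)")
# -> extra delay: 0.67 ns (0.33% of total latency)
```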
is-this-a-nick@reddit
Any kind of connector causes signal integrity issues though.
Exist50@reddit
You can get nearly full LPDDR speed with LPCAMM/SOCAMM, and you can achieve GPU-tier bandwidth using LPDDR (e.g. Apple).
VenditatioDelendaEst@reddit
But if you do both at the same time it'll use a hell of a lot of power.
IIRC Apple can use hi-Z on-die termination because the DRAM is on-package.
Exist50@reddit
Why would it? LPDDR is actually significantly more efficient from a power-per-bandwidth perspective than GDDR. Roughly on par with HBM, even.
doscomputer@reddit
we're talking maybe half a ns when ram latency is measured in hundreds of ns
there is no reason at all for these things needing to be soldered other than it's way cheaper than having to make sockets with high pin density, potentially wasting copper for pads/traces the user's config isn't currently using.
from a technical and physical standpoint, there is no degradation in performance; GPUs aren't using any special technology that isn't common with socketed server CPUs and desktop chips.
it's entirely about the size of the GPU market and the vendors maximizing profits. if there were 10x more sales then perhaps GPU motherboards would make sense; economies of scale would take over and they'd make more profit selling everything separately
Noreng@reddit
Apart from the memory signal operating at 8 GHz with PAM3 signalling (or at 10 GHz without PAM3, in the case of the Radeon 9000 series), while socketed server/desktop/laptop chips peak at 4.8 GHz?
Increasing the distance to DRAM from the GPU makes the signalling more difficult, if not impossible.
Take a look at how much closer the memory chips sit to the GPU on the 5090 than on the Radeon 390X:
https://www.techpowerup.com/review/msi-r9-390x-gaming/3.html
https://www.techpowerup.com/review/msi-geforce-rtx-5090-lightning-z/3.html
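For what it's worth, here's the arithmetic that appears to be behind those GHz figures, as a sketch. The per-pin data rates (24 Gbps GDDR7, 20 Gbps GDDR6) are assumptions chosen to reproduce the comment's numbers; GDDR7's PAM3 packs 3 bits into 2 symbols:

```python
# Fundamental signal frequency = symbol_rate / 2 (one full cycle
# of the fastest toggle pattern spans two symbols).
def fundamental_ghz(data_rate_gbps, bits_per_symbol):
    symbol_rate = data_rate_gbps / bits_per_symbol  # GBaud
    return symbol_rate / 2                          # GHz

print(fundamental_ghz(24, 1.5))  # GDDR7 PAM3, 1.5 bits/symbol -> 8.0 GHz
print(fundamental_ghz(20, 1.0))  # GDDR6 NRZ, 1 bit/symbol -> 10.0 GHz
```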
SANICTHEGOTTAGOFAST@reddit
LPDDR5X absolutely goes up to/beyond 8-10 GHz, and modern reference (e.g. mobile CPU) platforms have no problem being socketed. Not cheap, but not an impossible feat by any means.
Noreng@reddit
LPDDR5X peaks at 5.3 GHz according to JEDEC; that is 10600 MT/s. If you know of a chip that's rated for more, please show me.
Kryohi@reddit
No one says you can't make midrange GPUs with LPDDR5X/LPDDR6 instead of GDDR, though.
In fact, that's what we're going to get next year for some GPUs.
Noreng@reddit
It's certainly possible to replace GDDR7 with LPDDR5X if the memory interface is compatible.
The question is how much performance is lost, considering memory bandwidth will be reduced by roughly 67%.
Making a larger die with a wider memory interface will just make the GPU more expensive. See Strix Halo as an example.
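A sketch of where "roughly 67%" comes from, assuming illustrative per-pin rates of ~28.8 Gbps for GDDR7 and ~9.6 Gbps for LPDDR5X on the same bus width:

```python
bus_bits = 256                      # e.g. a midrange card's memory bus
gddr7_gbs = bus_bits / 8 * 28.8     # -> 921.6 GB/s
lpddr5x_gbs = bus_bits / 8 * 9.6    # -> 307.2 GB/s

print(f"reduction: {1 - lpddr5x_gbs / gddr7_gbs:.0%}")  # -> reduction: 67%
```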
SANICTHEGOTTAGOFAST@reddit
Ah right, brain fart.
the_dude_that_faps@reddit
Not really. VRAM already has higher latency than regular socketed CPU memory.
Trace length is irrelevant to latency. Signal integrity is the stopper. In any case, something like CAMM could improve signal integrity enough to have almost parity. It would cost, though.
wtallis@reddit
Trace length isn't going to affect DRAM latency. Most of the latency is inside the memory chip or inside the processor. Trace length will affect what memory frequency can reasonably be achieved.
spellstrike@reddit
And the validation of all of the above within the product release schedule, which is far too short for board partners to achieve when they barely get product samples of the reference design before launch.
Jaded-Gap-9919@reddit
It's called MXM 🥱
spellstrike@reddit
Nvidia has had a rough time with simply a new power standard. Now imagine having new connection standards for whatever you want to be modular. The cost and reliability are not there, and it adds new limitations.
Lincolns_Revenge@reddit
Has anyone come up with an add-on to improve RTX 5090 cable safety yet?
I live in a small amount of fear that I didn't connect mine in the most ideal manner.
There was that one 5090 model that can sense uneven current across the pins. And maybe a motherboard or two with something that improves the situation. I wish someone would make a connector with additional safety features. My guess is that it isn't happening because it's a difficult thing to guarantee.
Detente@reddit
There's this from Thermal Grizzly. It has a sensor and an alarm. https://www.thermal-grizzly.com/en/wireview-gpu-pro/s-tg-wv-p-h1n
spellstrike@reddit
The only ideal manner is to spread the load across more cross sectional area of cable and connection. It's a joke that the safety margin doesn't allow for similar or better tolerance than the previous connector it replaced. If the new standard is required, then use multiple cables.
doscomputer@reddit
I mean, the whole standard is a blatant cost-cutting measure. Reducing the size of the pins reduces contact area and generally increases the likelihood of a fault in every scenario; that's the problem with 12VHPWR.
And the goal was to send even more power down it, in a weaker connector. With 4 extra pins for current sensing. Hmm, a 16-pin cable, and bulky... weird.
It's more than possible to build a connection standard that works perfectly, and that's why CPUs or RAM burning out of their sockets is extremely rare. 12VHPWR deliberately used higher tolerances/lower electrical performance connectors than the old PCIe power cable standard, and somehow it's still not an example for everyone that engineering isn't just a game. Wild to me.
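The per-pin arithmetic behind the safety-margin complaint, as a rough sketch (the pin current ratings here are my assumptions; actual ratings vary by connector manufacturer and terminal type):

```python
def per_pin_amps(watts, volts, current_pins):
    return watts / volts / current_pins

# 12VHPWR: 600 W over 6 x 12 V pins, pins rated ~9.5 A each (assumed)
hpwr = per_pin_amps(600, 12, 6)    # -> 8.33 A/pin
# Old 8-pin PCIe: 150 W over 3 x 12 V pins, HCS pins rated ~8.5 A (assumed)
pcie8 = per_pin_amps(150, 12, 3)   # -> 4.17 A/pin

print(f"12VHPWR: {hpwr:.2f} A/pin, margin ~{9.5 / hpwr:.1f}x")    # ~1.1x
print(f"8-pin:   {pcie8:.2f} A/pin, margin ~{8.5 / pcie8:.1f}x")  # ~2.0x
```

Under these assumed ratings, one bad contact on the old connector still leaves headroom, while one bad contact on 12VHPWR pushes the remaining pins past their rating.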
bebo117722@reddit
The signal integrity and cooling issues are real, but honestly the bigger blocker is that GPU makers profit from selling you a whole new board every generation. A modular standard would kill their margins. Framework proved it's possible for laptops, but convincing NVIDIA or AMD to play along is the real hard limit.
YairJ@reddit
I've thought about that too, but I don't know much about the technical difficulties involved. Intel filed some very relevant patents, though that does not mean everything described here is already practical: https://underfox3.substack.com/p/intel-compression-mount-technology
So taking some points from there, I could see a replaceable GPU (or CPU) with a big substrate that has the sockets for memory modules directly on it (possibly on both sides?), which would both improve signal integrity compared to going through the PCB and avoid locking the whole card to specific memory standards (and reduce confusion over numbers of modules). Combine that with a replaceable VRM on the card, and with standards for the heights that components present to the heat spreader.
SJGucky@reddit
There are already socketable cards with interchangeable GPU-die (same size) and VRAM.
But those are special cards used during validation and testing. They need a big clamp mechanism with integrated watercooling to hold the GPU/VRAM tight. It's unviable for consumers.
There are some videos floating around, like factory tours, showing those.
Goldenskyofficial@reddit (OP)
Very interesting! I'll look those up...
jenny_905@reddit
You might be interested in what Bolt Graphics are doing. They seem to be committed to expandable VRAM.
I just mentioned it because I got an email from them yesterday saying they've got a prototype taped out.
LuluButterFive@reddit
VRM, power delivery, complicated PCB layers, and cooling
reddit_equals_censor@reddit
cooling wouldn't be a problem at all.
memory cooling in graphics cards is a side consideration. it just needs to be good enough... flat contact to the cooler through thermal pads is just fine.
having 2 socamm2 modules with the same type of basic connection is not a problem at all. you'd have FIXED EXACT heights for the socamm2 modules of course, so you know the thermal pad thickness you need and the cooler design.
and a theoretical socketed gpu would be the same case: fixed height all the same.
a socketed gpu would be a little thicker of course, but beyond that no cooling issue. you very likely would want to sell gpus with heatspreaders then, but that isn't a problem, as long as no toothpaste gets used as tim between ihs and gpu.
so cooling designs would be different, but not necessarily harder or more expensive at all.
and as a side effect you'd have fewer broken cards from cracked memory solder balls due to pci-e slot strain (and designs trying to work around that), because socamm2 modules wouldn't have that issue.
reddit_equals_censor@reddit
in regards to memory and bandwidth.
not a problem. it can be done now.
slap 4 socamm2 modules on a graphics card and you say goodbye to any vram concerns, with 614.4 GB/s of bandwidth.
a single 9600 mt/s socamm2 module with its 128 bit bus has 153.6 GB/s of bandwidth.
614.4 GB/s is more than enough for a good midrange card.
the 9070 xt for reference has 644.6 GB/s memory bandwidth.
and the 9060 xt 16 GB has 322.3 GB/s memory bandwidth.
so 4 socamm2 modules = 9070 xt bandwidth, and achievable performance without problems.
and 2 = 9060 xt performance and bandwidth.
and this is BEFORE we have lpddr6, which would be expected to double bandwidth per socamm2 module.
so again, memory module wise: NOT A PROBLEM. it's solved; chips and cards could be made RIGHT NOW.
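checking that arithmetic (a sketch taking the comment's socamm2 specs, 128-bit at 9600 mt/s, at face value):

```python
bus_bits, mt_s = 128, 9600
per_module_gbs = bus_bits / 8 * mt_s / 1000   # bytes per transfer * MT/s
print(per_module_gbs, 2 * per_module_gbs, 4 * per_module_gbs)
# -> 153.6 307.2 614.4 (2 modules ~ 9060 xt class, 4 ~ 9070 xt class)
```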
gpu socket also shouldn't be a problem, although of course you'd want a good socket.
so THEORETICALLY it could very much mostly get done.
and graphics card "motherboards" could then have lifetimes as long as am4 in comparison.
new generations of them would be based on new memory generations, just like with proper motherboards.
you'd need a new pcb and gpu when going from lpddr5x to lpddr6x, for example.
____
so all of this aside, a huge reason people are talking about this these days is that graphics card companies are scamming people by not putting a working amount of vram on cards.
if we had, RIGHT NOW, an ABSOLUTE MINIMUM of 24 GB vram, so your 5060 and 9060 would start with 24 GB, then people would likely not be talking about vram modules too much, however cool it would be.
i mean i'd love to see memory modules and ecc memory for everything that could even remotely have it, and again it is 100% possible for 9070 xt tier graphics cards already.
remember, that amd and nvidia criminally prevent partners from selling higher memory capacity graphics cards.
an xfx is NOT ALLOWED to make a 32 GB 9070xt.
an msi is FORBIDDEN to sell a 32 GB 5070 ti and make it without an nvidia 12 pin fire hazard.
GenZia@reddit
The answer is very simple: they don’t want to sell you a modular, upgradable GPU.
After all, GPU manufacturers won’t even let you mount standard 120 or 140 mm fans on their cards, so a truly modular GPU, with sockets, slots, and whatnot, is a bit of a stretch.
If anything, they’d rather have you rent or lease hardware and pay a monthly subscription.
Proprietary, closed source software has already largely transitioned to a subscription based model (SaaS), so it’s only “logical” to assume that hardware could be next.
It became blatantly apparent when the file explorer app (Simple File Manager) on my Android smartphone suddenly started showing ads, trying to coax me into paying $15 per month for the privilege of accessing my own files on my own device.
That was one hell of a wake up call, enough that I almost entirely transitioned to FOSS apps within days… but I digress.
Goldenskyofficial@reddit (OP)
You're basically describing why the incumbent players won't build this (which I agree with completely). But that's exactly the opening for a company whose business model is built around the opposite philosophy. The worse the subscription/lock-in trend gets, the bigger the addressable market for a company selling genuine ownership. Framework laptops exist because enough people were fed up with unrepairable MacBooks. Same dynamic, isn't it?
R-ten-K@reddit
You’re trying to solve a problem you don’t fully understand with a solution you don’t fully grasp. At the same time, you’re likely overestimating the market for GPUs with upgradable memory.
In practice, what you're proposing would probably make things more complex, less performant, and more expensive. Yet you assume there must be a huge market for the less performant, more expensive product because... reasons.
dudemanguy301@reddit
Not at all.
Modular mobile parts existed before Framework; they simply bucked the trend towards ever tighter integration. Even now the trend seems to be not only intensifying but paying dividends in performance / battery life, especially in designs that mount memory onto the package, so staying relevant will only get more complicated in the future.
For example, a company like CLEVO sells basic semi-modular designs to other firms like MSI to make their own final-product laptops, filling all the empty modules with actual components and slapping it into an appropriately branded shell. When my IBUYPOWER laptop charger died, I just needed to buy an MSI charger, because they were both, at the end of the day, CLEVOs.
You are asking for an ecosystem that doesn’t exist, to suddenly start existing.
dudemanguy301@reddit
Forget modularity: if making a competitive GPU of any kind were easy, we wouldn't have endured nearly 2 decades of a lopsided duopoly only to finally arrive at an even more lopsided triopoly.
A business model is only a start. GPUs in this current AI boom are an unassailable fortress of eye-watering R&D budgets, cut-throat talent acquisition, libraries of patent filings, decades of built-up driver maturity and developer relations, and massive incumbent contracts for limited manufacturing capacity.
the_dude_that_faps@reddit
The problem is that GPUs are so complex and hard to engineer, and the consumer market so cutthroat, that I'm not sure a player that brought this innovation to market would be able to compete unless they already won on other metrics.
doscomputer@reddit
I mean, the real strategy is to become a GPU design company yourself, the 4th vendor.
Let's imagine a world where Intel and AMD have the good graces to support your plan and sell you GPU dies. Now you have to create the sockets, the peripherals, the cooling solution, and the motherboards themselves. I guess if I could buy a 9060 XT with the option of upgrading to a 9070 XT, that would be cool. But if it costs a good bit more, including the cost of the upgrade kit, I don't think it'd be worth it. You also have the whole issue of memory slots. With AMD it's easy enough: you'd only need 8 GB cards in 1x1 and 2x1 configurations. But then if someone buys a 9070 chip and only has the 8 GB installed, their memory bandwidth is cut in half, and it might be more than you can work around with a custom BIOS.
So at that point, realistically, you should just be making your own chips. That way you can actually have deep control of the platform and the product stack so the whole upgrade scheme plays nicely for consumers... but if you have to design your own GPUs, that costs a lot of money... better cut some corners and keep it all soldered for a few generations 😏
StrategyEven3974@reddit
besides everything else, the problem is the chip itself.
How is a startup going to put in an order for top-tier GPU chips from TSMC when trillion dollar companies like Apple, Nvidia, and AMD have them sold out for the next 3 years? The sheer amount of money you would need to put together to just put in an order would be staggering.
TSMC won't even talk to you unless you come to them with $200 million, with a design your engineers have already developed. Who's got that kind of money?
GenZia@reddit
Nvidia won’t allow these “modular GPUs” to exist.
I don’t think I have to explain the relationship between Nvidia and its AIB partners on r/Hardware. Plus, I’m old enough to remember Asus ROG Mars, a GPU lineup that unofficially featured two GPUs per card.
Nvidia was quick to tighten the noose around Asus and kill off the entire lineup with Maxwell:
https://www.techpowerup.com/gpu-specs/asus-rog-mars-760.b2646
III-V@reddit
A time machine and going back to the 80s / 90s when that was a thing.
R-ten-K@reddit
It would not be practical or remotely cost effective.
Life_Menu_4094@reddit
It would seem more "practical" to just create a second socket / memory slot / power connector on the motherboard than to keep the daughterboard idea.
the_dude_that_faps@reddit
I'm sure we could solve the challenges. But it would essentially be a motherboard for GPUs. That won't be cheap considering the amount of traces that go from the GPU to memory.
eivittunyt@reddit
With current technology you need a soldered core and memory for signal integrity unless you are willing to take a massive performance drop, and everything else on a GPU costs like $50, so there is very little reason to make a product like that even if you wanted to.
Exist50@reddit
You could do a large number of channels of LPCAMM.
Kougar@reddit
Creating a fully upgradable GPU is literally just creating a miniature motherboard. So think of it that way. Once you do the problems become more apparent.
It would be possible to create, but what you're envisioning is literally what we have now for CPUs: desktop/mobile motherboards that allow us to plug in any CPU compatible with that socket. NVIDIA won't ever be willing to go that route; it's additional time & cost in R&D for them and means selling lower-margin products. AMD might decide to join such a platform, but they don't have the market position to create one, and NVIDIA would never ever go along with it.
Purely mechanical sockets don't negate the already mentioned problems. But more importantly, it's a fixed socket, meaning all future GPUs must be drop-in compatible with that specific socket. They also must have the same memory controllers baked into them to be compatible with the type of memory on the board. The memory slots are also a fixed socket, so the memory standard chosen can't be changed after the fact, and you will be stuck with whatever signal integrity, frequency, and routing limitations apply to it. Whether it's slot based for upgradability or not, the pinouts, routing, and bus width of the socket are limitations you can't change after the fact, so upgrading HBM memory isn't going to deliver the results or bandwidth increases you might be thinking it would. Bus width usually determines the pin count and trace layout most of all.
While a VRM module could be reused, the lifespan of VRM components tends to degrade over time, especially if they have to endure high temperatures or damp, salty, or humid environments. Or questionable power sources: a silly number of people don't live in places with grounded power, and even in the US it's not guaranteed to be clean power (as in adhering to powerline specs). So while creating a GPU-motherboard would lead to reusability, I'm not sure how reliable they would be long term. GPUs have been known to blow out VRM components within 1-5 years, and even well-built components are going to be a risk after a full decade of intensive use. This is also assuming the VRM was overbuilt to begin with, because NVIDIA has proven it's willing to constantly push the absolute limits of power consumption in small form factors with every new generation. A 600 W rated VRM today may not power the 800 W GPU tomorrow, or the 1 kW one a generation after that. And there's absolutely no sense in using HBM memory on anything but the most powerful tier GPUs, so I'm only thinking of flagship-tier cards here.