China modded GPU (eg. 4090 48gb) --> I'm gonna figure it out. IS THERE NO ONE ELSE CURIOUS??
Posted by LeatherRub7248@reddit | LocalLLaMA | View on Reddit | 77 comments
There's a dearth of information (in the English-speaking world) about these cards.
The best recent video is probably this one:
https://www.youtube.com/watch?v=TcRGBeOENLg
Even in this subreddit, there seem to be few reviews of these cards.
Last couple of decent threads:
https://www.reddit.com/r/LocalLLaMA/comments/1s62b23/bought_rtx4080_32gb_triple_fan_from_china/
https://www.reddit.com/r/LocalLLaMA/comments/1nifajh/i_bought_a_modded_4090_48gb_in_shenzhen_this_is/
Is there really NO ONE else who has tried these?
In particular
1. Software / bios / quirks that make them NOT run like an unmodded card
2. Short-term consistency: does it run fast for a test, but hang / die when stressed?
3. Long-term reliability: does the whole thing fail within 2 months of regular usage?
4. Are the benchmarks good? Where are the results??
5. Source and price?
The Chinese video site Bilibili has a ton of videos, and Taobao (and other e-commerce sites) also have lots of sellers.
If I can piece together enough research, I may also visit Shenzhen to pick up a few.
If you're interested in this space, DM me. I hope to form a group to split up research efforts.
Any native Chinese speakers who are familiar with this space, please join in too.
Heathen711@reddit
I have three 48GB 4090 blower cards running in my servers: 2 run Qwen 3.6 27b, 1 runs a stable-diffusion.cpp workload. Cooling is an issue; I swapped in 4k RPM server fans to feed them and keep the backplate cool. Otherwise I've had no software issues.
chocofoxy@reddit
water cool them
uniqueusername649@reddit
They have twice the capacity, but what about the bandwidth? Is it the same, or is the speed actually slower? I know that the D variants typically double memory but use slower RAM, so this might be problematic for AI use, where memory speed is king. That's why I've avoided D variants so far.
Heathen711@reddit
Same clock and VRAM speed, just double the VRAM capacity, so it can hold more working memory.
Mine aren't the D variant. There's another reply in this thread that says the D is 10-15% slower, but I think he means in core clock, not VRAM speed.
TheWaffleKingg@reddit
I can only imagine the noise this makes
I had to pull my chassis HDD fans out because I just couldn't take the noise (the rack is in my home office). Temps haven't been great, but I'm installing some Noctua fans today/tomorrow. They won't be as good, but it sure beats what I have now.
Heathen711@reddit
S12038-4K. They are a little loud, but they are wider (thicker?) than normal 120mm fans to pull more air. IMO they aren't louder than the 120mm fans I had before, but they move a lot more air. They are not as loud as the 8k RPM 80mm fans in my 1U SuperMicro server, which sounds like a jet taking off when I turn it on...
remghoost7@reddit
I'm curious, do you have to use custom drivers for them....?
Or can you just update your drivers via the Nvidia App...?
That was always my hitch on grabbing one of these.
No-Refrigerator-1672@reddit
I have 2x 3080 20GB; they work with the proprietary Nvidia drivers out of the box, no issue.
BillDStrong@reddit
These are the ones that might be in my budget range. Are you happy with the 2? Or is 40GB not enough?
-dysangel-@reddit
Enough for what purpose? I have 512GB and 128GB unified-memory machines, and there is always something just out of reach, either in terms of VRAM or being fast enough to run well. Enough really depends on what you want out of it.
Heathen711@reddit
Old post from when I talked about it last has some info, but short answer: nothing special, normal drivers under Ubuntu.
LeatherRub7248@reddit (OP)
Nice, thanks for dropping by!
Mind if I DM you?
I've seen options for liquid cooling for a slight markup.
Did you source them all at the same time?
Heathen711@reddit
You can DM, but I don't see why; we can talk here.
No, I bought one to test it out, and then two more to scale up.
LeatherRub7248@reddit (OP)
Can you share your supplier? I've seen a ton on Taobao, but tbh I'm not sure I trust the reviews.
Also, how heavily do you run them? 24/7 with constant load in a multi-user environment? Or is it bursty?
Heathen711@reddit
ebay -> bodorship -> "OEM 48GB RTX 4090 Founders Edition Dual width GPU Graphics card Ganming/ Server"
They are a reseller, but they've helped me get in contact with the actual shop that made them when I first started (the first card I got had ECC turned on, and the noob I was back then freaked out when the RAM showed 46GB and not 48GB). They also offer a warranty on their cards via that shop, so it felt less risky IMO.
The SD workflow is very bursty on CPU, with high VRAM (wan 2.2).
The LLMs get hit hard on weekends/nights (when I'm not at work), and I'm vibecoding, so there will be hours of high GPU and VRAM utilization.
I never turn them off (solar on my house offsets my electricity cost), so they sit idle with high VRAM all day. Single-user environment, BUT I do have n8n workflows that use them, so they do get concurrent requests from time to time; with 2 GPUs running 27b and litellm routing to the least active, it load-balances well.
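For reference, a least-busy routing setup like the one described can be sketched as a LiteLLM proxy config. This is a hypothetical sketch, not the commenter's actual setup: the model alias, backend names, and endpoint ports are all assumptions.

```yaml
# Hypothetical LiteLLM proxy config: two backends serving the same 27B model,
# one per GPU, with requests routed to the least-busy deployment.
model_list:
  - model_name: qwen-27b                   # alias clients call
    litellm_params:
      model: openai/qwen-27b               # assumed backend model name
      api_base: http://localhost:8000/v1   # GPU 0 endpoint (assumed port)
  - model_name: qwen-27b
    litellm_params:
      model: openai/qwen-27b
      api_base: http://localhost:8001/v1   # GPU 1 endpoint (assumed port)

router_settings:
  routing_strategy: least-busy             # send each request to the deployment with the fewest active calls
```

Both entries share one `model_name`, so the router treats them as interchangeable deployments and balances between them.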
LeatherRub7248@reddit (OP)
thank you for sharing, this is giving me some confidence to try this out
Heathen711@reddit
Dropping this in case you do go this route: the fans I swapped into my server to keep the cards cool are S12038-4K.
palindsay@reddit
Same here. I lower the power on mine to keep them cooler, with not much performance impact. It also keeps the blower quieter.
Heathen711@reddit
Yup, 350W gives you the same perf as 450W, BUT I did notice that it causes problems when VRAM usage is in the 90% range, so I swapped the fans so I can push it. Old post from when I talked about it last.
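For anyone trying the same cap, this can be done with the stock driver tooling; a minimal sketch (the 350 W figure comes from the comment above, and the commands need root):

```shell
sudo nvidia-smi -pm 1    # enable persistence mode so the setting sticks
sudo nvidia-smi -pl 350  # cap board power at 350 W (stock 4090 limit is 450 W)
nvidia-smi --query-gpu=power.limit --format=csv,noheader  # confirm the new cap
```

The cap resets on reboot, so people typically put it in a systemd unit or startup script.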
Kulidc@reddit
I had one 4090D 48GB before, but I sold it.
It's a nice card, I'll give it that. 48GB of VRAM is plenty for full-context-length text inference (I mainly used Qwen 3.5 27B with vllm on it), image, and video generation. I had a wild ride with it, especially for the first few weeks as a headless server.
However, it kind of went downhill from there.
1) The card is loud as hell, even with fan control and a power limit from MSI Afterburner. I PLed it down to 70%, which is around ~300 watts (I assume a lot of people will do the same), and there is not much noise improvement imo.
2) The modified vbios is kind of buggy. Don't get me wrong, the card takes around 425W at full load, which indeed is the TDP of the card. Yet the card could draw up to ~80W even when idle as a headless server, mostly around 50 to 60W. I assume this is some kind of leaked vbios from Nvidia for testing.
3) The lifespan of the card. Do note that the AD102 core is re-soldered onto a new PCB, and this actually shaves off some of the core's lifespan. The card could last 3 more years, or it could be dead the next day.
Some Chinese forums have posts warning about the source of the card itself, written after users experienced core or VRAM failures. If the card comes from an OEM factory, you have less chance of encountering those problems. You have a much higher chance of failure if the card comes from a small workshop that handles the VRAM and core soldering manually.
I think Brother Zhang (Zhang Ge from Bilibili, one of the workshop owners that Gamers Nexus introduced on YT) has repaired a 4090 48GB before.
So yeah, I sold the card near the purchase price after 6 months of use, maybe a few hundred dollars less, which I consider an operating cost.
Positive-Road3903@reddit
Weird thing about Brother Zhang: he uses lead-free solder balls for reballing the cores.
ThatsALovelyShirt@reddit
Can't break the rules, they've got that RoHS sticker! /s
Kulidc@reddit
Lead solder balls have a much lower melting point, making it easier for small workshops to process the VRAM and core. This, however, also results in the resoldered component not being able to tolerate high heat. GPUs generate a high amount of heat nowadays, and this could most likely weaken the soldering even if it is not reaching the actual melting point of lead alloys.
I think this is the main reason why Brother Zhang used lead-free solder despite the cost and trouble of operating the BGA rework machines.
LeatherRub7248@reddit (OP)
Thx for the real-world experience!!! Really helpful datapoint. When u say buggy vbios, is it purely the idle power draw, or were there other issues (instability, hangs, etc.)?
nightowlflaps@reddit
I'm gonna tell you I had one of these and it completely died after about a year of use.
Kulidc@reddit
I forgot to mention: idling doesn't mean vllm or llama.cpp is loaded in the background waiting for a new prompt. The card is just constantly drawing 60W while doing absolutely nothing.
I did have some trouble making my Windows 10/11 recognize this card. I ended up needing to use DDU (Display Driver Uninstaller) to reinstall the Nvidia driver.
However, this problem did not occur on my headless Linux mini PC, for some weird reason. I suspect Linux has a higher tolerance for this kind of hardware. If you want to buy one, you may need to keep this in mind.
Other than this and the high power draw, I haven't encountered any other issues; the card worked fine, just like a regular 4090.
LeatherRub7248@reddit (OP)
Did your Linux mini PC fit it directly in the case? I have a mini PC but might need a PCIe extender or something to fit this.
Kulidc@reddit
I had two headless Linux mini PCs. One came with a native PCIe 4.0 x8 slot (Beelink GTi series, which should be the 12 iirc), and one without. Both sat on top of my main 5090 PC.
I used a Beelink external docking station (the name should be either EX or EX Pro docking station; I cannot remember as the whole setup had been given to my relatives) with the 4090.
I am pretty sure you could use M.2-to-OCuLink to connect the card with an eGPU docking station.
Dany0@reddit
Nah, it's not a leaked bios. Most likely they modified a 4090 bios by taking values from a pro card, hence the idle power draw. They're probably tricking the protections by modifying resistors too.
Kulidc@reddit
It could be. I am not brave enough to take this card apart, not only because I have insufficient knowledge of the PCB, but also because I don't have the equipment.
However, from what I can read with my half-assed Mandarin, many Chinese forum posts believe it's a leaked VBIOS rather than a modified one. The same goes for the increasingly popular 4080 Super 32GB in Chinese forums.
330d@reddit
RTX 5000 Pro Blackwell 48GB can be had for sub 4.2k CHF (5.35k USD) in Switzerland, new and with warranty. A modded 4090 should be under 3k USD for me to even start considering that, it was nice for a certain period I'm sure but not interesting to me personally anymore. A modded 20GB 3080 for around 700 USD is more interesting though.
LeatherRub7248@reddit (OP)
What use case do u have for the 20GB 3080? Any particular models in mind?
eviloni@reddit
I run Qwen 3.6 27B on a pair of them at full context
330d@reddit
How much were they if you don't mind?
eviloni@reddit
I paid under $600 at the time. Here is the eBay listing I bought from:
OEM NVIDIA Gefoce RTX 3080 20GB Turbo GPU GDDR6X PCIe 4.0 x16 Graphics Card | eBay
330d@reddit
In my case it's interesting for diffusion workflows and pipeline parallelism at that price. Sorry, I didn't notice this was a LocalLLaMA thread; I should have been more specific.
eviloni@reddit
You can get a modded 4080 with 32GB VRAM for under $1500 on Alibaba, so you can get 3x of those for the cost of one RTX 5000 Pro, for double the total VRAM.
2Norn@reddit
On eBay the cheapest I saw was 4150, mostly around 4500.
In that case, yeah, the 5000 Pro makes more sense.
eviloni@reddit
I have 2 of the modded 3080s and they work just fine. Best-value GPUs out there for local LLM purposes. I've been side-eyeing the modded 4080s though.
free-interpreter@reddit
Where did you buy them?
eviloni@reddit
I didn't want to wait on China, so I just bought off eBay. Here's the actual listing I used (it was below $600 when I bought):
OEM NVIDIA Gefoce RTX 3080 20GB Turbo GPU GDDR6X PCIe 4.0 x16 Graphics Card | eBay
Single_Ring4886@reddit
On reddit you get downvotes even if you write that water is wet...
formatme@reddit
Plenty of people have tried them; it's not worth the risk or the money to buy a refurbished GPU with modded VRAM from China. Anyone who has that money will just go all out for something better.
LeatherRub7248@reddit (OP)
A 48GB 4090 is probably about USD 3.5-4k, with the A6000 48GB at 1.5-2x that price.
So I guess the question is whether the markup is worth the reliability concerns.
Right now there's really no other option to get 48GB running decently at home at a better price.
tuura032@reddit
Why not just 2x3090 for $2k? Obviously harder to scale and has compromises, but substantial cost savings.
LeatherRub7248@reddit (OP)
I guess the trade-off is PP/TPS speed, and also the fact that it's two cards (power / cooling etc).
tuura032@reddit
I get about 70 tps with Qwen 3.6, with WSL and a PCIe bottleneck. Cooling is a non-issue. Power capped at 50% keeps total wattage at 400-450W.
Being two cards is an issue, but there are an awful lot of people who either want to add more GPU or already have 2-4.
I don't think I'd spend another $2k for more speed or a cleaned-up desk. It's harder to replace a modded 4090 if something goes wrong.
That said, I'm more than happy to read about other people's experiences with the cards! If prices came down I'd be interested.
popecostea@reddit
You’re missing the RTX PRO 5000 my friend.
Cargo4kd2@reddit
3x RTX Pro 2000 is about 2.4k; a Pro 5000 48GB is about 5k. No reason to take the risk imo.
computune@reddit
Hello,
I am the only person doing these upgrades in the USA, since Sept 2025 (plug: gpulab.net).
I've upgraded roughly 100 cards so far, and you can find my work on YouTube.
I want to state that I upgrade regular 4090s, the full-power ones we have here in the US. China has "D" variant GPU cores, which are gimped 10-15% in performance compared to the ones we have in the USA. So I do these mods on full 4090s and the performance stays the same.
1: They have a modified vbios but run as normal cards without any driver tweaks. Some complain they have no P2P, but this is a non-issue for 99% of the workloads I've seen them used for: local diffusion (Comfy) and LLMs across multiple cards (typically 96GB VRAM). Their performance is the same as a 24GB card; I have a video comparing performance across multiple benchmarks, LLM and diffusion workloads.
2: They run without issue, and some of my customers run them in Vast farms without issue.
3: I've seen a few failures where the user RMAs the card. This is typically because they've been running it HOT in a Vast farm: the rear memory modules get hot, and if left hot for a long time (hours to days on end), after a few months a memory module can fail and need replacing. This is a standard repair that any GPU repair shop can identify and perform.
To address this, I've developed a custom backplate with better cooling fins and holes to mount a 90mm fan; it's coming to market in about a month. Water blocks are also on the way in about a month.
4: On my website there's an LLM comparison, and my YouTube channel has them being used for gaming and Blender rendering for comparison. Same card, same performance.
5: In the USA, right now it's 1449 for an upgrade and 3650 for a whole card. Prices are up due to a shortage of memory chips; the chips used for these are only becoming rarer since they've stopped producing them, and this upgrade/mod is in demand.
Also, 32GB Supers are on the way and will be available next month.
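On the P2P point: whether peer-to-peer links show up on a given box is easy to inspect with the stock driver tooling. A minimal sketch, assuming only that `nvidia-smi` is on PATH wherever a driver is installed (it degrades to a message otherwise):

```python
import shutil
import subprocess

def p2p_report() -> str:
    """Return nvidia-smi's GPU topology matrix (which shows the interconnects
    relevant to P2P), or an explanatory message when no NVIDIA driver is present."""
    if shutil.which("nvidia-smi") is None:
        return "nvidia-smi not found (no NVIDIA driver on this machine)"
    result = subprocess.run(
        ["nvidia-smi", "topo", "-m"],  # matrix of links between all GPUs
        capture_output=True, text=True,
    )
    return result.stdout if result.returncode == 0 else result.stderr

if __name__ == "__main__":
    print(p2p_report())
```

On a multi-GPU box, entries like PIX/PHB/SYS in the matrix tell you what path peer traffic would take.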
LeatherRub7248@reddit (OP)
Thx for sharing! Do modded versions have a higher idle power draw vs non-modded, as one user commented earlier?
computune@reddit
I actually don't know compared to non-modded; typically I run them on Linux with the official Nvidia drivers. Off the top of my head, I think I see them idle at 16-30W.
TheAncientOnce@reddit
I've been seeing a lot of these for a while. The cards that can be modded are few, partly bc of software limits. But 2080 Ti 22GB mods, H100 32GB mods, and 4090 48GB mods are flooding sites like Taobao. I think the stuff is pretty mature now, but every once in a while you still run into dead cards, so only hobbyists are willing to go into it knowing the risks. Some of these mods are done with GPU boards(?) they made, where the board can house more VRAM chips. So the quality of the mods depends on the quality of the supplier, and the industry is simply not as regulated. From the Bilibili videos I've seen, they work, and the benchmarks aren't a lot worse than the original versions of the cards. So, if you still wanna go into it, I think these guys also sell them on eBay if you specifically search for modded cards.
havenoammo@reddit
Saw some 5090s @ 96GB VRAM, not sure if real though. Technically it makes sense, since the RTX PRO 6000 is the same chip AFAIK, just with an unlocked BIOS and more VRAM.
thesuperbob@reddit
I asked the Alibaba store offering those for pricing and availability and got no specific answers, so I'm not sure if they're real. Also not sure how it would make financial sense over a proper RTX PRO, since it probably wouldn't be significantly cheaper while being a helluva lot more sketchy.
I mean, maybe if I were a company looking for 100+ of them, actually went over there to talk to the guys at the company, figured out they were legit, and negotiated a good deal... then it might be worth considering. But for getting just one or two as a small company/rich hobbyist, IMO that's not a reasonable price range for "trust me bro" assurance/warranty conditions.
Still curious about them though.
grumd@reddit
I saw 3080 20gb on Alibaba
LeatherRub7248@reddit (OP)
Hmm, in other words, this is a niche market, hence the relative silence.
Come to think of it, you might be right!
WonderRico@reddit
I got two of them running perfectly fine, for a little more than a year now.
Using the official Nvidia open drivers, in a Proxmox LXC on an older-gen Epyc motherboard, in my garage because they're very loud.
Software-limited to 300W.
Using this: https://github.com/sasha0552/nvidia-pstated to put them in a lower power state and drop their idle draw to 22W (instead of 60W without).
It was a gamble to buy those, and I'm glad I took it. However, I wouldn't advise anyone to do the same if you cannot afford to lose the money.
clairenguyen_ops@reddit
The hardware modding scene for local LLMs is wild. It's great to see people pushing the boundaries of what's possible for consumer-grade inference. More VRAM is the eternal bottleneck.
clairenguyen_ops@reddit
this is great
Ok-Measurement-1575@reddit
It's an extremely loud 4090 with no real hope of making it quieter.
Kinda impossible to run in your own home if you value your sanity.
horeaper@reddit
There are water-cooled versions on Taobao.
t3rmina1@reddit
PewDiePie used an all-modded-4090 rig, if I recall correctly.
a_beautiful_rhind@reddit
P2P doesn't work, and it's no longer cheap enough vs some of the Blackwells. Its time was a year or even 2 ago.
techlatest_net@reddit
The reliability bit is what keeps me from jumping in. Cooling and VBIOS quirks sound manageable, but a random early failure would be a pain.
clairenguyen_ops@reddit
Reliability on these modded cards is the big variable - if the VRAM is non-standard your inference stability tests will tell you more than any benchmark. Worth baking a failure budget into any hardware decision like this.
HumanDrone8721@reddit
Coming from the EU: no, not really. ALL the available ones are in China, which brings VAT + customs (and a risk of confiscation and destruction for stupid reasons, like no valid CE certificate); the risk of getting a castrated D variant is huge, and the warranty is nonexistent.
The RTX Pro 5000 is at exactly the same price and has full vendor support and warranty.
The only time this is a valid choice is if you can get some defective boards to scrap the RAM and GPU from and have them modded at a local shop that does this; the blank kit with everything is less than 300€. But if you don't know anyone in Amazon returns, it is impossible to find them at a price good enough to beat the Pro 5000.
ThenExtension9196@reddit
I have many of these cards. They are 4090s with a loud blower, on a custom PCB with fairly mid-to-low-end board components, and a modified vbios. That's it. No magic.
LeatherRub7248@reddit (OP)
Do they work well and reliably for you?
t3rmina1@reddit
Most of us just use Pro 6000s. At current prices, the cost of 2x modded 4090s is pretty close to a single RTX Pro 6000, which simplifies everything about power, thermals, and PCIe connections.
Last_Mastod0n@reddit
If I knew how to microsolder, I would consider modding my 4090.
shaghaiex@reddit
I have some EMS background. I wouldn't touch BGA.
I would probably need a day of training 😉 (tools are cheap, if you skip the x-ray)
Flashy_Oven_570@reddit
I looked into this a ton. Sourcing the VRAM chips is a huge issue, and after a point it just becomes more cost-effective to buy another card. My Mandarin isn't the greatest, but even looking on 淘宝 (Taobao) didn't turn up any really good sources. They're pretty reliable from what I've heard and what others have said; the only catch is you need to flash a custom bios.
No_Mango7658@reddit
Old news
Raredisarray@reddit
That was a cool YouTube. Very interesting !!
FearFactory2904@reddit
My assumptions could be wrong, but I will explain why I'm not considering it.
There are probably 2+ million unmodded RTX 4090s in circulation. If there's some bug, glitch, or issue with production methods, it will become very obvious. With a smaller sample of modded cards, it may take much, much longer to notice issue tendencies or trends. Also, I don't know to what accuracy the modified cards are identical, since tolerances may not be as tight as in the official mass-production factories. My worry would be spending a lot of money and being the 1 in 100 where the solder is just sloppy, causing a short lifespan and an expensive catastrophic failure; and then exchanges with overseas modders could get even more complicated given the current political climate. So it's easier to just use unmodded cards.
TanguayX@reddit
I will say, it is fascinating. Talk about brute force. Yow