Apparently Asus is working with Nvidia on a 784GB "Coherent" Memory desktop PC with 20 PFLOPS AI Performance

Posted by waiting_for_zban@reddit | LocalLLaMA | View on Reddit | 79 comments

Somehow the announcement went under the radar, but back in May, alongside the Ascent GX10, Asus announced the ExpertCenter Pro ET900N G3, with GB300 Blackwell. They don't really say what "Coherent" memory is, but my guess is it's another way of saying unified memory like Apple and AMD.

The announcement and the specs are very dry on details, but given the GB300, we might get very decent memory bandwidth, without it looking like a hideous Frankenstein monster.

This might be r/LocalLLaMA's wet dream. If they manage to price it well, and fix the memory bandwidth (which plagued Spark), they have my money.

[-]

ALeglessSpider@reddit

Yooo they unveiled it today!

[-]

waiting_for_zban@reddit (OP)

Just saw the post. 110k FFS, kinda nuts. You can buy 10x RTX 6000 Pro with that money.

[-]

TokenRingAI@reddit

100K+

[-]

Regular-Forever5876@reddit

My source said 150k

[-]

TokenRingAI@reddit

You can sell a kidney in some countries

[-]

some_user_2021@reddit

But if you buy more, you save more

[-]

ortegaalfredo@reddit

Looks like price is close to 6 figures.

I would like something close to 2 figures.

[-]

Lissanro@reddit

6 figures? I'd rather see more reasonably priced GPUs with larger VRAM. I guess I will keep my workstation for now, with EPYC, 1TB RAM + 96GB VRAM (4x3090); I managed to build it for 4 figures.

[-]

Own-Junket6393@reddit

Hi, I'm trying to build my own, and I liked your RAM/GPU config. Do you mind sharing the full parts list?

[-]

Lissanro@reddit

If you're interested to know more, in another comment I shared a photo and other details about my rig, including what motherboard, GPUs, PSUs and other parts I use and what the chassis looks like.

[-]

Mart-McUH@reddit

Ferengi say they can accommodate your request. You can have one for 99 bars of gold-pressed latinum.

[-]

ortegaalfredo@reddit

Never do business with dirty ferengi.

[-]

waiting_for_zban@reddit (OP)

I mean, the announcement featured a guy working in an office, next to a badly photoshopped version of the PC. I doubt his boss paid 100k for it.

I am hoping it will be around the 15k-20k mark, but given the volatile NAND prices, that might be far-fetched.

[-]

Karyo_Ten@reddit

What's worse is that the Asus box used to look different but they put the same guy.

[-]

Karyo_Ten@reddit

[-]

jay-aay-ess-ohh-enn@reddit

In that article it says:

While pricing and general availability details for the ExpertCenter Pro ET900N G3 are yet to be officially released, this custom-designed system, based on NVIDIA's DGX blueprint, is anticipated to come with a premium price tag, potentially exceeding $30k.

[-]

az226@reddit

Likely $40-50k.

[-]

az226@reddit

Zero chance this will be 15-20k.

[-]

ortegaalfredo@reddit

I respect that they didn't even use AI for the image.

[-]

ThisGonBHard@reddit

It would match with it being exactly 4x GB300 GPUs, as there is a 1.4 TB VRAM version there.

100+k range

[-]

waiting_for_zban@reddit (OP)

According to Dell's announcement of the Dell Pro Max, the setup might come with 496GB LPDDR5X CPU memory and 288GB HBM3e GPU memory.

[-]

holchansg@reddit

I mean, if you consider something in the ballpark of at most 512 bits of memory bandwidth, LPDDR5X dies, I could see it being like sub 30k... How much is a 512GB Mac Ultra?

[-]

fallingdowndizzyvr@reddit

Looks like price will be close to 6 figures.

Which is what a 512GB Mac Studio is so that would be a bargain price.

[-]

-Sliced-@reddit

A 512GB Mac Studio is 4 figures.

[-]

fallingdowndizzyvr@reddit

Oh you are right. I got my 5 and 6 figures mixed up.

[-]

MrPecunius@reddit

[-]

Freonr2@reddit

It is the Nvidia DGX Station, announced back in March with the Spark.

https://www.nvidia.com/en-us/products/workstations/dgx-station/

It's a GB300 288GB (2 PB/s, 20 PFLOP fake sparse fp4) + 496GB of LPDDR5X at ~400GB/s.

I'm sure Asus, Dell, Gigabyte, etc will all make their own branded version.

Nothing new to see here. Rumor is $80k, which was sorta announced as Dell is giving one away as a prize for the B200 nvfp4 cuda kernel competition that is currently going on in collaboration with GPU Mode.

[-]

waiting_for_zban@reddit (OP)

Rumor is $80k, which was sorta announced as Dell is giving one away as a prize for the B200 nvfp4 cuda kernel competition

I really hope not, that 80k is a very hefty price. You can build a monster server with 10x RTX 6000 Pro for less than that, and it would be nearly 1TB of full VRAM and not split between LPDDR5X and HBM.

I hope Nvidia doesn't over-estimate their market again and release an underpowered device for big bucks, otherwise I will be waiting for the next AMD release cycle. Rumor has it they are prepping a 128GB VRAM GPU for mid-2026.
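The "nearly 1TB of full VRAM" comparison above is simple arithmetic. A minimal sketch, assuming 96GB per RTX 6000 Pro card and the 288GB HBM + 496GB LPDDR5X split quoted elsewhere in this thread; the constant names are made up for illustration, and prices are left out because none are confirmed:

```python
# Assumption: each RTX 6000 Pro card carries 96 GB of VRAM.
RTX_PRO_6000_VRAM_GB = 96
NUM_CARDS = 10

# Memory split for the GB300 desktop, as quoted in this thread.
STATION_HBM_GB = 288      # GPU-attached HBM3e
STATION_LPDDR_GB = 496    # CPU-attached LPDDR5X

multi_gpu_vram_gb = RTX_PRO_6000_VRAM_GB * NUM_CARDS
station_total_gb = STATION_HBM_GB + STATION_LPDDR_GB

print(f"10x RTX 6000 Pro, all VRAM: {multi_gpu_vram_gb} GB")
print(f"GB300 desktop, total:       {station_total_gb} GB "
      f"({STATION_HBM_GB} GB HBM + {STATION_LPDDR_GB} GB LPDDR5X)")
```

Under these assumptions the ten-card build has more total memory, all of it VRAM, but spread across ten devices rather than one coherent pool.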

[-]

Karyo_Ten@reddit

Not a rumor you can preorder it at https://gptshop.ai/config/indexus.html

The difference is that RTX Pro 6000 would have a PCIe5 duplex 64GB/s interconnect while GB200 is 900GB/s interconnect and the GPU itself has 8TB/s memory bandwidth.

And with the recent rise in memory price ...
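To put the interconnect figures above in perspective, here is a rough sketch of how long it would take to move a chunk of data across each link. It uses the 64GB/s and 900GB/s numbers exactly as quoted in the comment, treats them as sustained throughput (an optimistic assumption), and picks a hypothetical 100GB payload:

```python
def transfer_seconds(payload_gb, bandwidth_gb_per_s):
    """Idealized transfer time: payload size divided by link bandwidth."""
    return payload_gb / bandwidth_gb_per_s

PCIE5_GB_PER_S = 64      # figure quoted in the comment above
NVLINK_GB_PER_S = 900    # figure quoted in the comment above
PAYLOAD_GB = 100         # hypothetical payload, chosen only for illustration

pcie_s = transfer_seconds(PAYLOAD_GB, PCIE5_GB_PER_S)
nvlink_s = transfer_seconds(PAYLOAD_GB, NVLINK_GB_PER_S)

print(f"PCIe 5:  {pcie_s:.2f} s")
print(f"NVLink:  {nvlink_s:.2f} s")
print(f"speedup: {pcie_s / nvlink_s:.1f}x")
```

Real transfers will be slower than this because of protocol overhead and access patterns, so the ratio matters more than the absolute times.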

[-]

waiting_for_zban@reddit (OP)

How trustworthy is this shop?

B200 1.5T

available now, 1 month lead time - from $350,000

Kimi K2 Thinking 1T FP8 up to 1000 tokens/s

350k in a "desktop" form factor?

But the GH200 624GB is interesting

Nvidia H200 Hopper Tensor Core GPU 480GB of LPDDR5X memory with ECC 144GB of HBM3e memory

for 39k, just to run Kimi K2 Thinking 1T FP4 >10 tokens/s.

My shoddy setup can run Kimi K2 Thinking with smol-IQ2_KS with 4 tk/s or smol-IQ4_KSS with 1.5 tk/s.

Does 10x token speed increase, justify 10x price increase? Dunno. But 39k for running K2 at FP4 is wild. You can semi achieve this with 2x Mac studio M3 Ultra, and assuming next year Apple would drop an M5 Ultra with MMUL, it might be a serious contender.

Nonetheless, it is good to have competition. Still waiting on AMD to enter this segment.

[-]

No_Afternoon_4260@reddit

Gh200 are grace-hopper architecture it doesn't support fp4, Blackwell does

[-]

waiting_for_zban@reddit (OP)

Gh200 are grace-hopper architecture it doesn't support fp4, Blackwell does

It can still run, just not optimized to run it. That's why 39k for just 10tk/s FP4 is a bad value proposition.

[-]

No_Afternoon_4260@reddit

Yeah ofc, also this is 144gb vram 480 ram and a 900gb/s bidirectional nvlink between cpu/gpu. I don't know any inference engine that's optimised for this setup.

[-]

waiting_for_zban@reddit (OP)

It's not clear yet how easy it is to deploy in such an environment, given that everything is in the CUDA ecosystem. We'll need to wait and see some hands-on setups. But it might have full vLLM support as it's all CUDA baked in.

[-]

Karyo_Ten@reddit

You can use TensorRT and everyone running Grace Hopper uses vLLM or SGLang.

[-]

No_Afternoon_4260@reddit

Sure, what I'm telling you is that if you are using nvidia-smi, for example, you'll only see 96 or 144GB of VRAM for the GH200 (because they built it with an H100 chip (96GB) or an H200 (144GB)). The rest is RAM attached to the Grace ARM CPU. CUDA or no CUDA.

[-]

muyuu@reddit

wow that's a lot of $$$, i wonder what kind of output do you get with that context size though

easier to rent to figure it out, i guess

[-]

Karyo_Ten@reddit

Large context is tricky, the Kimi-K2 Linear attention tries to solve this and you have a lot of info on the challenges of traditional attention with large context if you look for Kimi, example: https://medium.com/data-science-in-your-pocket/kimi-linear-bye-bye-transformers-c79f843f208c

And above 65K, at least for LLMs from before this summer (I think gpt-oss attention sinks, glm-4.6 increase of context size), the perf degrades a lot: https://fiction.live/stories/Fiction-liveBench-Sept-29-2025/oQdzQvKHw8JyXbN87

[-]

Freonr2@reddit

The price is pretty much expected when you compare to the datacenter GB300 systems.

[-]

Standard_Property237@reddit

I work for a large computer OEM, I can confirm it will be close to $75K

[-]

Dry_Management_8203@reddit

Wasn't 20 PFlops the theoretical performance of one human brain at work?

Did this also fly under the radar? For 200K, you could make a small copy of yourself?

[-]

MrPecunius@reddit

You caused me to look this up, and the opinions are interesting:

https://aiimpacts.org/brain-performance-in-flops/

[-]

ThisWillPass@reddit

100PFlops?

[-]

quantum_splicer@reddit

I personally think that Nvidia actually needs to be matching the hardware to what is actually needed to accommodate large models.

I also think we've been playing it safe by sticking to traditional hardware which has basically been mostly the same for the last decade, like yeah we have made incremental progress. But hardware in use today is pretty generic. I don't see anyone doing anything substantially different.

We have various technologies in research which we know are likely to yield considerable progress, but we are reluctant to diverge from our cookie-cutter standardized approach. But maybe that is more a function of fabrication infrastructure and the limitations in adapting to different processes.

[-]

Decayedthought@reddit

You act like it's easy to overcome years of research and IP development and years of software development overnight. The reason hardware hasn't changed much is because changing it makes the software not work. No one wants to make it all work on new hardware.

But having 10s of thousands of GPU cores doesn't seem innovative to you?

[-]

quantum_splicer@reddit

The hardware research side of things predates the mass rollout of ai.

Example is optical ram, it predates 2010.

Large companies with limited competition do not have a large incentive to innovate beyond incremental releases and occasional larger changes.

Tech companies follow a pattern in their release cycles: incremental releases, and then an innovative release that is meant to be substantial.

I do not believe that tech companies randomly adopted this model; I believe the approach is taken because it's good for business.

A corporation's overriding objective is to maximize profits; anything beyond this is incidental.

[-]

LocoMod@reddit

Corporations survive on profits and making money. There are plenty of frontier ideas such as optical ram that won’t scale. Large companies don’t make conscious decisions to avoid innovation. They have teams of experts spending way more hours than you and I ever will figuring out the cost benefit of innovation. Because if they didn’t do that they wouldn’t exist, just as many people with genius ideas who pursue them out of passion go bankrupt. Nvidia isn’t sitting around waiting for some unicorn to take its market share. They are enjoying their position of dominance while standing on the graves of the thousands of startups that were fueled by hopes and dreams.

In the real world you compete to win or you die. That’s it.

[-]

MrPecunius@reddit

They are enjoying their position of dominance while standing on the graves of the thousands of startups that were fueled by hopes and dreams.

... and reaping the rewards of buying/acquihiring dozens of other startups with good ideas.

[-]

FullOf_Bad_Ideas@reddit

Nvidia is amazing at making hardware for running big LLMs.

We just don't have the money to be their primary customer for this hardware.

[-]

Dontdoitagain69@reddit

Word

[-]

noiserr@reddit

The tech isn't the issue. We have the tech. mi300A is a perfect example. You could have had that instead of Strix Halo two years ago. But the issue is cost.

[-]

CatalyticDragon@reddit

So it'll be NVIDIA's take on an ARM based Threadripper.

[-]

Dontdoitagain69@reddit

More like Ampere and Graviton

[-]

MrGupplez@reddit

Makes me think of this meme:

https://www.reddit.com/r/comics/comments/d1sm26/behold_the_ultimate_life_form/

[-]

Different_Fix_2217@reddit

I saw that before, using the GB300 with an estimated price point of $80K

[-]

evildeece@reddit

They probably mean cache-coherent, so writes to a memory address from any device on the bus will invalidate that address in other devices' caches
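A toy model may help picture the write-invalidate behaviour described above. This is only a sketch: the `CoherentBus` and `Device` classes are invented for illustration and say nothing about how Nvidia's hardware actually implements coherence.

```python
class CoherentBus:
    """Toy shared memory with write-invalidate coherence (illustrative only)."""

    def __init__(self):
        self.memory = {}     # address -> value, the single source of truth
        self.devices = []    # every device attached to this bus

    def attach(self, device):
        self.devices.append(device)

    def write(self, writer, address, value):
        self.memory[address] = value
        # Invalidate stale copies held by every device except the writer.
        for device in self.devices:
            if device is not writer:
                device.cache.pop(address, None)


class Device:
    """A CPU or GPU with a private cache, attached to a shared bus."""

    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.cache = {}      # address -> locally cached value
        bus.attach(self)

    def read(self, address):
        if address not in self.cache:   # cache miss: fetch from shared memory
            self.cache[address] = self.bus.memory[address]
        return self.cache[address]

    def write(self, address, value):
        self.cache[address] = value
        self.bus.write(self, address, value)


bus = CoherentBus()
cpu = Device("cpu", bus)
gpu = Device("gpu", bus)

cpu.write(0x10, "v1")
print(gpu.read(0x10))   # gpu misses, fetches and caches the value
cpu.write(0x10, "v2")   # invalidates the gpu's cached copy
print(gpu.read(0x10))   # gpu misses again and sees the new value
```

Without the invalidation step in `CoherentBus.write`, the second `gpu.read` would return the stale cached value, which is exactly the problem coherence is meant to prevent.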

[-]

blbd@reddit

I mean... that's nice and stuff... but Nvidia's machines use janky proprietary Linuces...

[-]

spaceman3000@reddit

It's just ubuntu.

[-]

Eugr@reddit

It's not that bad. DGX OS is just Ubuntu 24.04 (currently) with modified kernel and extra packages pre installed. Kernel source is available, all other packages too. They even provide instructions on how to set it all up on top of stock Ubuntu.

I was able to install Fedora 43 on my DGX Sparks, and after I compiled nvidia kernel, everything was working as expected.

[-]

AIMadeSimple@reddit

The real breakthrough here isn't the 784GB—it's the "coherent" memory architecture. If this is truly unified like Apple's approach, you eliminate the CPU-GPU transfer bottleneck that kills performance on traditional setups. The GB300's 2 PB/s bandwidth means you could run 405B models at full speed without quantization. But at rumored $80k, this is enterprise territory. The real question: will this push consumer GPU makers to finally offer 48-96GB options at reasonable prices?
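Whether a 405B model fits "without quantization" depends on what precision that means. A back-of-the-envelope sketch, assuming weights only (no KV cache or activations), 1 GB = 1e9 bytes, and the 288GB HBM / 784GB total figures quoted in this thread; the function name is made up for illustration:

```python
def weight_footprint_gb(params_billions, bytes_per_param):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bytes_per_param

HBM_GB = 288              # GPU memory, as quoted in this thread
TOTAL_COHERENT_GB = 784   # HBM + LPDDR5X, as quoted in this thread

for label, bytes_per_param in [("16-bit", 2), ("8-bit", 1), ("4-bit", 0.5)]:
    need_gb = weight_footprint_gb(405, bytes_per_param)
    print(f"{label}: {need_gb:.1f} GB of weights | "
          f"fits in HBM: {need_gb <= HBM_GB} | "
          f"fits in total: {need_gb <= TOTAL_COHERENT_GB}")
```

Under these assumptions, 16-bit weights for a 405B model come to 810GB, slightly more than the full 784GB pool, so some reduction in precision would likely still be needed.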

[-]

Finanzamt_Endgegner@reddit

I doubt this will be cheap with ramageddon atm...

[-]

sourceholder@reddit

Yeah, announcement is poorly timed. I suspect product will be shelved due to new market conditions.

[-]

Zyj@reddit

This product caused it!

[-]

fallingdowndizzyvr@reddit

"Nvidia didn't disclose the recommended pricing of its DGX Station, which will be sold by Asus, Boxx, Dell, HP, Lambda, Lenovo, and Supermicro. Keeping in mind that each compute GPU in an SXM form factor costs tens of thousands of dollars, the DGX Station will likely cost a five-digit sum."

https://www.tomshardware.com/tech-industry/artificial-intelligence/nvidia-unveils-dgx-station-workstation-pcs-gb300-blackwell-ultra-inside

[-]

pscoutou@reddit

That article was written March 2025.

Prices are very different (higher) today.

[-]

az226@reddit

When I tried this coherent memory crap it didn’t work. I tried using all their special images and what not and it just wasn’t working. Compiling PyTorch from source didn’t help either.

They really ought to fix the software before releasing the hardware into the wild.

Dollars to donuts this pflops figure is 4bit not 16bit.

[-]

Tyme4Trouble@reddit

This is a DGX Station GB300 clone. We already know most of what it’ll entail other than price, which I expect to be north of $50K; the GPU alone is valued at around $40K. The rest of the memory is LPDDR5X.

It’s essentially half of a GB300 with one of the GPUs carved off.

[-]

waiting_for_zban@reddit (OP)

https://www.dell.com/en-us/lp/dell-pro-max-nvidia-ai-dev

Interesting! Dell offers some more tidbits 496GB LPDDR5X CPU memory, 288GB HBM3e GPU memory

[-]

They should give it to all of us for free and book it as a bubble investment.

[-]

tired_fella@reddit

Price: 200k per unit.