Sanity check for a Threadripper + Dual RTX 6000 Ada node (Weather Forecasting / Deep Learning)
Posted by Icy_Gas8807@reddit | LocalLLaMA | 7 comments
Hello!
tldr
I’m in the process of finalizing a spec for a dedicated AI workstation/server node. The primary use case is training deep learning models for weather forecasting (transformers/CFD work), involving parallel processing of wind data. We are aiming for a setup that is powerful now but "horizontally scalable" later (i.e., we plan to network multiple of these nodes together in the future).
Here is the current draft build:
• GPU: 2x NVIDIA RTX 6000 Ada (plan to scale to 4x later)
• CPU: AMD Threadripper PRO 7985WX (64-core)
• Motherboard: ASUS Pro WS WRX90E-SAGE SE
• RAM: 512GB DDR5 ECC (8-channel population)
• Storage: Enterprise U.2 NVMe drives (Micron/Solidigm)
• Chassis: Fractal Meshify 2 XL (with industrial 3000RPM fans)
My main questions for the community:
1. Motherboard Quirks: Has anyone deployed the WRX90E-SAGE SE with 4x double-width cards? I want to ensure the spacing/thermals are manageable on air cooling before we commit.
2. Networking: Since we plan to cluster these later, is 100GbE sufficient, or should we be looking immediately at InfiniBand if we want these nodes to talk efficiently?
3. The "Ada" Limitation: We chose the RTX 6000 Ada for the raw compute/VRAM density, fully aware they lack NVLink. For those doing transformer training, has the PCIe bottleneck been a major issue for you with model parallelism, or is software sharding (DeepSpeed/FSDP) efficient enough?
Any advice or "gotchas" regarding this specific hardware combination would be greatly appreciated. Thanks!
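For a rough feel of the networking and PCIe questions, here is a back-of-the-envelope sketch of how long one gradient synchronization could take over different links. It is only an estimate under stated assumptions: plain data parallelism with a ring all-reduce of the full gradient buffer every step, bf16 gradients, peak link rates, and no overlap with compute. The function name and the model sizes are hypothetical, and real DeepSpeed/FSDP runs will behave differently.

```python
# Rough lower bound on gradient all-reduce time per training step.
# Assumptions (not from the thread): ring all-reduce, bf16 gradients,
# peak theoretical link rates, no compute/communication overlap.

def ring_allreduce_seconds(num_params, bytes_per_grad, num_workers, link_gb_per_s):
    """Time for one ring all-reduce of the full gradient buffer."""
    if num_workers < 2:
        return 0.0  # nothing to synchronize with a single worker
    payload_bytes = num_params * bytes_per_grad
    # In a ring all-reduce each worker sends 2 * (N - 1) / N of the payload.
    traffic_bytes = 2 * (num_workers - 1) / num_workers * payload_bytes
    return traffic_bytes / (link_gb_per_s * 1e9)


# Approximate peak one-direction bandwidth in GB/s.
LINKS_GB_PER_S = {
    "PCIe 4.0 x16 (inside one node)": 31.5,
    "100GbE (between nodes)": 12.5,             # 100 Gbit/s / 8
    "InfiniBand NDR 400 Gb/s (between nodes)": 50.0,
}

# Hypothetical model sizes, chosen only for illustration.
MODEL_SIZES = {"300M params": 300e6, "1B params": 1e9, "3B params": 3e9}

if __name__ == "__main__":
    workers = 4  # e.g. 4 GPUs in one node, or 4 nodes in a small cluster
    for model_name, num_params in MODEL_SIZES.items():
        for link_name, bandwidth in LINKS_GB_PER_S.items():
            seconds = ring_allreduce_seconds(num_params, 2, workers, bandwidth)
            print(f"{model_name:>12} | {link_name:<40} | {seconds * 1e3:8.1f} ms")
```

If the per-step compute time is much larger than these numbers, the link is unlikely to be the limiting factor; if it is comparable or smaller, a faster interconnect or less frequent synchronization is worth considering.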
NewBronzeAge@reddit
No point in getting a Threadripper when you can use EPYC, imo.
Icy_Gas8807@reddit (OP)
Thanks! I'll look at the comparison; the price difference is huge!
MelodicRecognition7@reddit
You should buy the RTX PRO 6000 instead of the RTX 6000 Ada.
Icy_Gas8807@reddit (OP)
Going for the RTX PRO 6000; the RTX 6000 Ada has availability issues anyway.
Normal-Ad-7114@reddit
What for?
If you're using the GPUs for the neural networks, 90% of this will sit idle at all times.
Icy_Gas8807@reddit (OP)
It will be a server for my company; we also need it for continuous development of the forecast model.
ResidentPositive4122@reddit
What's the price for the Ada? Last I checked it had gone down a bit, but not enough to justify it when the RTX PRO 6000 is readily available. With the PRO you get a newer architecture, FP4 support, and double the VRAM.
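To put "double the VRAM" in training terms, here is a rough sketch of how many parameters fit on one card without sharding. It assumes 48 GB for the RTX 6000 Ada and 96 GB for the RTX PRO 6000 (public spec-sheet figures, not stated in this thread), mixed-precision Adam at about 16 bytes per parameter, and a made-up 30% of memory reserved for activations and buffers. The function name and that reserve fraction are hypothetical.

```python
# Rough single-GPU capacity estimate for mixed-precision Adam training.
# 16 bytes/param = 2 (bf16 weights) + 2 (bf16 grads)
#                + 12 (fp32 master weights, Adam momentum, Adam variance).
BYTES_PER_PARAM_MIXED_ADAM = 16


def max_params_unsharded(vram_gb, bytes_per_param=BYTES_PER_PARAM_MIXED_ADAM,
                         activation_fraction=0.3):
    """Parameters that fit on one GPU with no ZeRO/FSDP sharding.

    activation_fraction is a hypothetical share of VRAM set aside for
    activations and temporary buffers; tune it for the real workload.
    """
    usable_bytes = vram_gb * 1e9 * (1.0 - activation_fraction)
    return usable_bytes / bytes_per_param


if __name__ == "__main__":
    # VRAM sizes are assumptions taken from public spec sheets.
    for card, vram_gb in {"RTX 6000 Ada": 48, "RTX PRO 6000": 96}.items():
        billions = max_params_unsharded(vram_gb) / 1e9
        print(f"{card}: roughly {billions:.1f}B parameters per card")
```

Sharding with DeepSpeed ZeRO or FSDP spreads the optimizer state across cards, so the practical ceiling for a multi-GPU node is higher than this per-card figure.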