RTX Pro 4000 + 2000 Ada ?
Posted by bromatofiel@reddit | LocalLLaMA | View on Reddit | 4 comments
So I just bought an RTX Pro 4000 Blackwell 24GB to replace my RTX 2000 Ada 16GB. So far I've been tinkering with llama-cpp, especially with Qwen 3.6 MoE, and I was wondering if it's worth keeping both GPUs. I know that theoretically more VRAM is better, but do I have to follow RAM-like rules such as "both GPUs should be the same size" or something similar? Moreover, can both GPUs communicate over PCIe, or should I look for more exotic connectivity? Kind of a GPU newbie here, so sorry for the dumb questions ¯\\_(ツ)_/¯
PassengerPigeon343@reddit
You can experiment with splitting across cards, or you can push the model onto one card and use the second card for other workloads, like a speech-to-text model for voice mode or a smaller task model. If your main model doesn't support vision, for instance, you could keep an always-hot vision-capable model on the second card and route vision tasks to it. It's always nice to have more compute and more VRAM.
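The two-model setup above can be sketched with two llama-server instances, each pinned to one GPU via `CUDA_VISIBLE_DEVICES`. The model file names and ports here are placeholders, and `--mmproj` assumes a multimodal GGUF with a separate projector file:

```shell
# Main text model on the RTX Pro 4000 (GPU 0); -ngl 99 offloads all layers
CUDA_VISIBLE_DEVICES=0 llama-server -m main-model.gguf -ngl 99 --port 8080 &

# Always-hot vision-capable model on the RTX 2000 Ada (GPU 1)
CUDA_VISIBLE_DEVICES=1 llama-server -m vision-model.gguf \
  --mmproj mmproj.gguf -ngl 99 --port 8081 &
```

Your router/client then sends vision requests to port 8081 and everything else to 8080.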
abnormal_human@reddit
More GPUs is always better for something, until PCIe slots or bus bandwidth become your bottleneck, and you're not close to that. You can pool across them to squeeze in a larger model, or you can use them for independent tasks.
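Pooling in llama.cpp is a couple of flags; a sketch, assuming the 24GB card is GPU 0 and a placeholder model file (the 24,16 ratio just mirrors the VRAM sizes and can be tuned):

```shell
# Split whole layers across both GPUs, weighted roughly by VRAM (24GB : 16GB)
llama-server -m big-model.gguf -ngl 99 \
  --split-mode layer --tensor-split 24,16 --main-gpu 0
```

Layer splitting keeps PCIe traffic low: each card runs its own contiguous slice of layers and only activations cross the bus.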
Miserable-Dare5090@reddit
PCIe is fine. Look into tensor parallelism and you can run 32GB-class models, plus KV cache, with your RTX Pro 4000 Blackwell as the main card. If you use a frontier model to help you set it up, you can optimize it.
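llama.cpp's closest analogue to tensor parallelism is row-wise splitting, which shards individual weight tensors across the cards instead of assigning whole layers; a sketch with the same placeholder model file:

```shell
# Row split: each matmul is sharded across both GPUs, so both cards work on
# every layer (more PCIe traffic than --split-mode layer); small tensors and
# intermediate results live on the --main-gpu card
llama-server -m big-model.gguf -ngl 99 \
  --split-mode row --tensor-split 24,16 --main-gpu 0
```

Whether row beats layer splitting depends on the model and your PCIe bandwidth, so it's worth benchmarking both.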
Kyuiki@reddit
My understanding is that your speed will be limited by your slowest card when pooling VRAM. So a 3090 will slow down a 4090, and a 4090 will slow down a 5090.
The main thing combining cards gets you is more room to load bigger models. So if the new card won't push you into the next model-size bracket, it's better to just have the slower, smaller card run smaller models on its own.
I’m new too so this is based on my own research and I could be wrong.
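A back-of-the-envelope check for the "next bracket" question: weights take roughly params × bits / 8 bytes, plus headroom for KV cache and runtime overhead (the 4 GB headroom below is a rough assumption, not a fixed rule):

```shell
params_b=30   # model size in billions of parameters
bits=4        # quantization width (e.g. Q4 is ~4 bits per weight)
headroom=4    # GB for KV cache + overhead -- rough assumption
# weights GB = params_b * bits / 8; add headroom on top
echo "$params_b $bits $headroom" | awk '{printf "%.1f GB needed\n", $1*$2/8 + $3}'
# -> 19.0 GB needed
```

By that estimate a 30B model at 4-bit (~19 GB) fits on the 24GB card alone, while the same model at 8-bit (~34 GB) would only fit by pooling both cards.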