Help me upgrade for 3k
Posted by Borkato@reddit | LocalLLaMA | 17 comments
My current system:
Intel core i7-11700KF
48 GB RAM
ASROCK Z590-C/AC mobo
RTX 3090 24GB (undervolted to 250W) + RTX 3070 8GB, plus a third, unused RTX 2060 6GB that doesn't fit in the case (mentioning it because it would be cool to plug it into a larger build if that's what's recommended)
1000W PSU
I was wondering if I should just buy a single 3090 for $1.5k and swap it in for the RTX 3070 to get 48GB of VRAM, or if I should spend the full $3k on a setup that leaves room to add many more cards in the future, which would also let me run the 2060, 2x 3090, and the 3070 together. I know people claim there are bottlenecks, but I used to run just the 3090 and the 2060 together and it was great.
Qwen told me there will be PCIe bottlenecks because of limited PCIe lanes, but I don't really understand all that.
I'm irritated too because I'm stuck on DDR4 RAM. I like running models like qwen 3.6 27B at Q5, and it's great right now at 27 t/s eval and 900 t/s prompt processing, but I would absolutely love to try out some older 70B models for things like RP, or even 122B models for coding and such. Any ideas?
Some earlier threads mentioned an EPYC CPU with a mobo that fits it, like a Supermicro X10SRL? I'm not sure lol.
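As a sanity check on the jump from 32GB to 48GB of VRAM, here's a hedged back-of-the-envelope sketch of how much memory quantized weights alone need. The bits-per-weight figures are assumptions (roughly what GGUF Q4_K_M- and Q5_K_M-style quants average), and KV cache and runtime overhead are ignored:

```python
# Back-of-the-envelope VRAM estimate for quantized LLM weights only.
# ASSUMPTIONS (approximate, not exact): ~4.8 bits/weight for a Q4-class
# quant, ~5.5 bits/weight for a Q5-class quant; KV cache, context, and
# CUDA overhead are NOT included, so real usage is higher.
def weight_vram_gib(params_b: float, bits_per_weight: float) -> float:
    """GiB needed just for the weights of a params_b-billion-parameter model."""
    return params_b * 1e9 * bits_per_weight / 8 / 2**30

for params, bpw, label in [(27, 5.5, "27B @ Q5"),
                           (70, 4.8, "70B @ Q4"),
                           (122, 4.8, "122B @ Q4")]:
    print(f"{label}: ~{weight_vram_gib(params, bpw):.0f} GiB for weights")
```

Under these assumptions a 70B at Q4-class quantization lands around 39 GiB of weights, so 48GB across two 3090s leaves some headroom for KV cache, while a 122B at the same quant is closer to 68 GiB and would need partial CPU offload or more cards.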
Bulky-Priority6824@reddit
Is this upgrade-itis, or are there shortfalls with what you're running now?
Borkato@reddit (OP)
I really, really want to be able to run higher quants because people say they're much better. In particular, I think 48GB of VRAM would be great, with room to upgrade later if I want more... I also want to train 24Bs for writing and such.
All of this is a little limited by my 32GB of VRAM. But honestly it's probably more upgrade-itis... I'm like 70% of the way to where I can say "alright, that's enough" lol. Back when I was on the 3070+2060 I could feel I wanted way more, and now I'm almost there 😂 I just get annoyed at how dumb the models are sometimes and hope higher quants will fix it
Bulky-Priority6824@reddit
It probably won't be fixed by a 16GB increase in model capacity or a minor quant bump, but I guess the more important question is: what do you mean by "it's dumb"? Have you analyzed your prompting vs. the output?
Stock_Ad9641@reddit
You won't want more than 2 GPUs, and even for 2 GPUs only some motherboards support dual x8 PCIe. With more GPUs, the I/O between them gets slow, especially on older PCIe generations.
You can run a 27B on a single 3090 quite comfortably with a Q4 KV cache. Dual 3090s would let you run tensor parallel to roughly double your prefill speed, but that currently conflicts with MTP.
A 4090, if you can get one for a good price, would also be a significant upgrade.
Keep in mind that the slowest card will be the bottleneck. A 20-series card is well behind in CUDA feature support, so it can drag your 30- or 40-series cards down if used simultaneously.
I'd also consider a 5090 if you're an enthusiast; it's expensive, but it solves both your speed and VRAM issues.
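To put some rough numbers on the PCIe worry raised above, here's a hedged sketch of the per-token traffic a 2-way tensor-parallel split generates at batch size 1. All the figures are assumptions (a hypothetical 70B-class model with 80 layers, hidden size 8192, fp16 activations, and two all-reduces per layer), and the bandwidth values are approximate effective rates, not spec maximums:

```python
# Rough sketch: per-token PCIe traffic for 2-way tensor parallelism.
# ASSUMPTIONS (hypothetical round numbers): 80 layers, hidden size 8192,
# fp16 (2-byte) activations, batch size 1, and two all-reduce exchanges
# per layer (one after attention, one after the MLP).
layers, hidden, bytes_per_act = 80, 8192, 2
per_token_bytes = 2 * layers * hidden * bytes_per_act  # ~2.6 MB per token

# Approximate effective one-way bandwidth in GB/s for common links:
links = {
    "PCIe 4.0 x16": 31.5,
    "PCIe 4.0 x8": 15.8,
    "PCIe 3.0 x16": 15.8,
    "PCIe 3.0 x4": 3.9,
}
for name, gbps in links.items():
    ms = per_token_bytes / (gbps * 1e9) * 1e3
    print(f"{name}: ~{ms:.2f} ms/token of transfer")
```

Under these assumptions, even a slow PCIe 3.0 x4 link adds well under a millisecond per generated token, which matches OP's experience that a mixed 3090+2060 setup "was great" for decoding. The penalty bites during prefill and training, where traffic scales with batch size and the link can saturate.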
PrintEngineering@reddit
The 2060 runs PCIe 3.0, which will downclock anything trying to communicate with it; that's what you get rid of. You can run your GPUs externally using an expansion board like the one in the attached image. You'll spend about $600 on the board, $250 on cables, and $100 on risers. You can run 5 GPUs at full PCIe 4.0 x16 bandwidth for P2P communication among each other, or 10 GPUs at x8 half-bandwidth single-cable connections without too much penalty. It's also better to put an SSD on the expansion card for loading models. If you wanted to be really fancy, you could RAID a couple of SSDs to load models even faster, but at the end of the day, just having this board and the ability to configure all of your GPUs externally is all you need. The rest of your system is almost irrelevant once you have it.
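The SSD/RAID point above is easy to quantify. A hedged sketch, using assumed round numbers for a model file size and typical sequential read speeds (real drives and real loaders will vary):

```python
# Rough load-time estimate for pulling model weights off storage.
# ASSUMPTIONS (round numbers): a 40 GB quantized model file, and
# sequential read speeds of 0.55 GB/s (SATA SSD), 3.5 GB/s (Gen3 NVMe),
# 7.0 GB/s (Gen4 NVMe), and 14.0 GB/s (two Gen4 NVMe in RAID 0).
model_gb = 40.0
drives = {
    "SATA SSD": 0.55,
    "Gen3 NVMe": 3.5,
    "Gen4 NVMe": 7.0,
    "2x Gen4 RAID 0": 14.0,
}
for name, gbps in drives.items():
    print(f"{name}: ~{model_gb / gbps:.0f} s to read {model_gb:.0f} GB")
```

Under these assumptions, a 40 GB model goes from over a minute on SATA to a few seconds on striped NVMe, which is where the RAID suggestion pays off if you swap models often.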
Borkato@reddit (OP)
This is quite interesting, I have never seen this! I will look into it, thank you :)
bluelobsterai@reddit
Figure you can sell your three cards and buy one modern one? The FP8 advantages are real. I'd sell the 3070 and 2060 and get an RTX Pro 5000.
Borkato@reddit (OP)
Hmmmmm, I did not know the RTX Pro 5000 existed. That's so much money haha 😅 I originally set my budget at $2k; $3k is already stretching it!
bluelobsterai@reddit
Sell all three cards. https://www.bhphotovideo.com/c/product/1898513-REG/pny_vcnrtxpro5000b_pb_nvidia_rtx_pro_5000.html
Borkato@reddit (OP)
What would this get me, exactly, that 2x RTX 3090s wouldn't? O:
bluelobsterai@reddit
For FP8 inference it's like 4x 3090s. For FP16 (older 70Bs), the 3090 is still supreme.
Borkato@reddit (OP)
Interesting, will research!! Thank you 🫡
nosimsol@reddit
$3800 or less gets you a 32GB 5090
Seems to often be the case: a little more money gets you a little more, and then you want a little more than that!
Borkato@reddit (OP)
True, but then my RAM isn't upgraded, there's zero room for future upgrades vs. a rig with empty slots, I can't use either of my other GPUs and would just have to sell them, I have no idea if it'll even fit in my case, etc. Also, $3000 is lowkey a hard ceiling haha
nosimsol@reddit
IMO, with tech, it’s a hard call without knowing the future.
Borkato@reddit (OP)
That is true. :(
Borkato@reddit (OP)
Oh and I’m on Linux. I use arch btw. And I’m vegan btw. ;)