Upgraded to a 3rd GPU (x3 RTX 3060 12GBs)

Posted by Dundell@reddit | LocalLLaMA | View on Reddit | 15 comments

Figured out how to add a 3rd RTX 3060 12GB to keep up with the tinkering. My Ecne AI hopefully will now fit Mixtral, plus additional features I want like alltalk, at a good rate. My brother is printing a vertical mount for the new GPU to get it off the case floor.




tatogt81@reddit

Hey u/Dundell, loved your setup, and sorry to revive an old post, but it's similar to what I want to achieve. I am already running 2 GPUs, since my PSU (900W) has 2 GPU power ports to power both. How did you power your 3rd GPU, and what is the capacity of your PSU? Could you share some info about the wiring and PSU? Are you using a PCIe power splitter? Molex to PCIe? SATA male to PCIe? Are you undervolting your GPUs? Thanks in advance for sharing. I am sharing my PSU's recommended options in case it helps with any tips or suggestions. https://preview.redd.it/dohld0qfe3oe1.png?width=783&format=png&auto=webp&s=77b1f1ed03bee21cac919f8cdae21fc98e5b11ce


Dundell@reddit (OP)

The PSU I was using was a Hydro G Pro 1000W by FSP. At the time it was $112 on sale, with 3 PCIe power ports, plus splitters, so 6 PCIe connectors. The RTX 3060 12GBs I use are rated at max 170W each, but I limit them to 100W each. I've since moved on to a dual PSU setup on a mining bench with an X99 board, now with x4 RTX 3060s and 1 2080 Super. About to swap the 2080S for 2 more 3060s in a month, as long as prices don't get too insane, and run them all at PCIe 3.0 x4. If you need extra PCIe power cables, you can just get a Molex to PCIe adapter for extra connectors, but I'd keep those to a low-wattage use case, 120W and under. Look at the side of the PSU, and it will tell you what is rated for PCIe, CPU, peripherals, etc.
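The power math in that reply can be sketched roughly as follows. This is only an illustration using the numbers quoted above (3 cards, 170W stock vs. a 100W limit, a 1000W PSU); the function names are made up for the example, and the headroom figure ignores CPU, drives, and fans unless you pass `other_watts`. The reply doesn't say which tool sets the limit; the second helper assumes `nvidia-smi` with its `-i` (GPU index) and `-pl` (power limit in watts) options, and only builds the commands rather than running them.

```python
def gpu_power_budget(gpu_count, per_gpu_watts, psu_watts, other_watts=0):
    """Return total GPU draw and the PSU headroom left, both in watts."""
    gpu_total = gpu_count * per_gpu_watts
    headroom = psu_watts - gpu_total - other_watts
    return {"gpu_total": gpu_total, "headroom": headroom}


def power_limit_commands(gpu_count, limit_watts):
    """Build (but do not run) one nvidia-smi power-limit command per GPU.

    Assumption: the limit is applied with nvidia-smi; setting it usually
    needs root/admin rights and does not persist across reboots.
    """
    return [
        ["nvidia-smi", "-i", str(index), "-pl", str(limit_watts)]
        for index in range(gpu_count)
    ]


stock = gpu_power_budget(3, 170, 1000)    # cards at their rated maximum
limited = gpu_power_budget(3, 100, 1000)  # cards capped at 100W each

print("stock:  ", stock)    # gpu_total 510, headroom 490
print("limited:", limited)  # gpu_total 300, headroom 700

for cmd in power_limit_commands(3, 100):
    print(" ".join(cmd))
```

Passing the rest of the system's draw as `other_watts` gives a more realistic headroom number for a given build.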


tatogt81@reddit

Thanks for the help! What is your performance, and what are your use cases? I am trying to use my station to help me with coding tasks (PWAs and Android, some iOS dev). I saw your posts and it looks like you already have some experience with coding tasks; any tips or cautions you want to share? I was actually looking to get a breakout board with an HP PSU and use risers to get to 4 GPUs, but the prices keep increasing like crazy, and in my country you almost have to add 40% to any price between shipping, taxes, and carrier fees. Do you know if I can mix 3060s from different vendors, like MSI and EVGA? ATM I have two Ventus 2X models, but those keep going up in price too. Again, thanks for any help!


Dundell@reddit (OP)

I have 3 different RTX 3060s in my x4 configuration, and I have them essentially split in two, running 2 different models on tabbyapi (exl2) with tensor parallel: 4.0bpw Qwen 2.5 Coder 32B with an 8.0bpw 0.5B Qwen 2.5 Coder as a draft model, running 30k context at 12~30~40 t/s depending on the size of the context. Basically, for a while I have been splitting it into x2 3060s each, one pair doing QwQ with a draft and the other Qwen 2.5 Coder with a draft, both on exl2 at 4.0bpw with 30k context, chained as an Architect and Coder for Aider and my personal website. Without a draft it's around 22 t/s. Combining all 4 for 4.0bpw Qwen 2.5 72B gave 30k context at 14 t/s. I'd be interested in another 72B model if it's the next best thing.
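As a rough sanity check on why those configurations fit, the weight footprint of a quantized model is approximately parameters × bits per weight ÷ 8. The sketch below is back-of-the-envelope only: it uses the nominal sizes from the reply (32B, 72B, 0.5B), treats 1 GB as 10^9 bytes, and leaves out KV cache, activations, and framework overhead, which is what the leftover VRAM has to cover. The function names are invented for the example.

```python
def weight_gb(params_billion, bits_per_weight):
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def leftover_gb(gpu_count, vram_per_gpu_gb, *model_sizes_gb):
    """VRAM left after loading the given models across the GPUs."""
    return gpu_count * vram_per_gpu_gb - sum(model_sizes_gb)


coder_32b = weight_gb(32, 4.0)   # main model at 4.0bpw
draft_05b = weight_gb(0.5, 8.0)  # draft model at 8.0bpw
qwen_72b = weight_gb(72, 4.0)    # all four cards combined

# Two 12GB cards holding the 32B model plus its draft model
print(coder_32b, draft_05b, leftover_gb(2, 12, coder_32b, draft_05b))
# Four 12GB cards holding the 72B model
print(qwen_72b, leftover_gb(4, 12, qwen_72b))
```

With these nominal figures the pair has about 7.5 GB left and the four-card setup about 12 GB, which is the budget the 30k context and runtime overhead have to share.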


tatogt81@reddit

Awesome, thank you for taking the time to answer. I will keep monitoring prices to see if I bite the bullet on the 3rd GPU, but in the meantime I will try to set up the two I already have in a similar way to what you are describing. I will keep going through your posts for inspiration and tips. Keep it up and thanks again!


D3cto@reddit

Also interested to see how this scales. I have an X99 with a Xeon that can take up to 4 GPUs. I can get 4x 2060 12GB for a little more than a single 3090. 3060s cost a little more.


candre23@reddit

P40s. You want as many P40s as your electric bill can handle.


vinciblechunk@reddit

P40s get the job done. My electric bill can handle 4, but my power supply can only handle 2. :( I did something a little nuts and modified my P40s with heatsinks and fans from dead EVGA 1070Ti SCs. It works better than it has any right to. They only get loud if I have both of them pegged at 100% for more than a minute.


candre23@reddit

If you do want to strap on some more P40s, get yourself a [HP server PSU](https://www.ebay.com/itm/354931849434) and a [breakout board.](https://www.ebay.com/itm/304817433395) Big power, cheap and easy.


tatogt81@reddit

Sorry to revive an old thread, quick question: how do you power up the secondary PSU? Or do you turn it on before turning your PC on, and off after the PC is off?


candre23@reddit

The breakout board has a floppy power connector on it. You plug the floppy power cable from your primary PSU into that. When the breakout board senses power on that cable, it fires up the secondary PSU. When your primary PSU shuts down, it drops the power to the floppy connector, and the breakout board shuts down the secondary PSU.


tatogt81@reddit

Thanks, I found the breakout board and an HP PSU on eBay. Now wondering if I should bite the bullet so that in the future (uncertain due to prices) I can have a 2+ 3060 GPU setup, because the newer generations keep becoming impossible to get in my country, and even the 3060 got a price hike of almost 1.5x. The GPU market is getting crazy.


Lamushi@reddit

Can it run llama3:70b-instruct-q4\_0?


Dundell@reddit (OP)

The 8k context version, yeah, no problem, probably around 7-9 t/s; at least that's what Miqu was at with 15k context. This is now my old build with 36GB VRAM. The new build I'll post next week has 81GB VRAM, and I'll reveal what it can run, context, and speed.


rookan@reddit

Any news on the new rig? Curious about what it can run.