Dual RTX 3090 build
Posted by Sufficient_Phone_242@reddit | LocalLLaMA | 53 comments
Joining this community sparked a new hobby and interest in software engineering that I had lost.
So I made this dual RTX 3090 build mostly for inference. I know I won't be replacing ChatGPT anytime soon, but what tool stack would help make it usable in a work environment? MCP servers, or custom tools/scripts?
Currently using VS Code preview with qwen3.6 27b and an nginx server. I'm mostly interested in agentic work with usable context, or at least better knowledge of the code base (RAG pipeline?).
This has already been such a helpful community. Hopefully local LLMs continue to grow, because I fear cloud will become unaffordable at a consumer level.
BlackBeardAI@reddit
I tried 2x3090 in an Asus TUF GT502 case and they got hot very quickly despite having fans in every possible fan slot. I had fans blowing air from the bottom too, and you don't have any there. I am not sure this is going to be a long-lasting setup.
Sufficient_Phone_242@reddit (OP)
80°C seems like a normal temp for a GPU to me?
Will have to monitor closely for longer runs; I'm not using it 8 hours a day for now. Got a picture of your setup? That was my hesitation about going directly to a mining frame.
BlackBeardAI@reddit
Which setup? I got 4+1 nodes currently. You can see the details here:
https://github.com/blackbeardlabs/blackbeard-homelab/tree/main/nodes
https://old.reddit.com/r/LocalLLaMA/comments/1thkcor/meet_the_fleet_of_blackbeard/
Sufficient_Phone_242@reddit (OP)
Ah yes, I see your Threadripper build. Is that powered by only one PSU for 4x RTX 3090? I got a 1600W PSU for that purpose. I couldn't buy a Threadripper CPU, the price is so high, but for inference I would just get the VRAM and x1 slots.
BlackBeardAI@reddit
It is a very old Threadripper, a 1950X. Mobo + CPU was 400 bucks or something. Yes, the 3090s run on a single 1600W PSU, but I power limited the GPUs to 250W so no problems. It unlocks bf16 qwen 3.6 27b with 260k ctx for me so no complaints.
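A rough sanity check on that power budget, as a minimal sketch. The GPU count, the 250W cap, and the 1600W PSU come from the comment above; the 180W CPU figure is the 1950X's rated TDP, and the 100W allowance for drives, fans, and motherboard is a guess, not a measurement. The 350W stock figure is the reference board power of a 3090. Transient spikes are not modeled.

```python
# Rough PSU headroom check for a multi-GPU box (illustrative numbers).

def psu_headroom_watts(psu_watts, gpu_count, gpu_limit_watts, cpu_watts, other_watts):
    """Return the watts left over after summing the steady-state draws."""
    total_draw = gpu_count * gpu_limit_watts + cpu_watts + other_watts
    return psu_watts - total_draw

# Four 3090s capped at 250 W each on a 1600 W PSU.
# Assumed: 180 W CPU (rated TDP), 100 W for everything else (a guess).
capped = psu_headroom_watts(1600, 4, 250, 180, 100)

# Same box with the cards left at the 350 W reference board power.
stock = psu_headroom_watts(1600, 4, 350, 180, 100)

print(capped)  # 320
print(stock)   # -80
```

Under these assumptions the 250W cap is what turns a negative margin into a comfortable one.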
kidflashonnikes@reddit
I haven't read the full build specs, but I can already assume the 3rd PCIe slot is not connected to the CPU directly and is instead connected to a chipset on the mobo, which means best case you are getting 4x lanes on the 3rd PCIe slot and 8x-16x lanes gen 3-4 on the first slot. I would confirm this, as it will hamper your t/s for inference (not by a crazy amount) but more so for training. It's not worth it in this case for the mobo to have this setup, because it's likely that using the second slot bifurcates best case to 8x lanes each at gen 4, but you lose the slot spacing you have now.
Sufficient_Phone_242@reddit (OP)
Yes, I was debating switching to a Pro Creator mobo, something like that with 8x/8x, but I'm not sure if it's worth it. I'm not training, and the spacing might be off: since those cards only have 2 slots it might not fit. Or NVLink: is the performance gain worth 400ish dollars?
kidflashonnikes@reddit
8x lanes is fine. I was running effectively 12 3090s on one Asus Pro WRX90 Sage SE motherboard. It has 7 PCIe slots, but 6 are 16x lanes. What I did was use a PCIe lane splitter, so I turned one PCIe slot into 2, going from 16x for one card to 8x per card for 2 cards. I will say this: 16x is always the best, but budget wise, 8x lanes each for two cards is fine.
People get butthurt about this, but 8x for inference is actually fine. It's like a 10% hit on t/s output, if any, especially when using an RTX 3090 (Ampere architecture). The only concern for you would be the 3rd slot. I would check to see if that 3rd PCIe slot is offering 8x or 4x at gen 4 speed. You don't want one card having 8x and the other 4x. It's likely the first and second slots are connected to the CPU (main connection) and the 3rd slot is connected to the chipset.
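To put numbers on the lane counts discussed above, here is a minimal sketch of theoretical one-direction PCIe bandwidth. It uses the published per-lane transfer rates (8 GT/s for Gen3, 16 GT/s for Gen4) and 128b/130b encoding. These are link ceilings, not measured throughput, and they say nothing about how much a given inference workload actually moves over the bus.

```python
# Theoretical one-direction PCIe bandwidth per link (Gen3 and Gen4).

GT_PER_SEC = {3: 8.0, 4: 16.0}   # transfer rate per lane, in GT/s
ENCODING = 128 / 130             # 128b/130b line encoding used by Gen3 and Gen4

def pcie_gb_per_sec(gen, lanes):
    """Return theoretical GB/s in one direction for a link."""
    return GT_PER_SEC[gen] * ENCODING / 8 * lanes

# Link widths mentioned in the thread: x16, x8, and a chipset x4 slot.
for gen, lanes in [(4, 16), (4, 8), (4, 4), (3, 4)]:
    print(f"Gen{gen} x{lanes}: {pcie_gb_per_sec(gen, lanes):.1f} GB/s")
```

Going from x16 to x8 halves the ceiling, and a Gen3 x4 chipset slot is an eighth of Gen4 x16, which is why the comment suggests checking what the 3rd slot actually provides.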
Terminator857@reddit
A link to case, motherboard, and power supply would be interesting. I'd like to do the same. Thanks!
Anbeeld@reddit
Not to sound negative, but half the cooling in your case is just running cold air in and immediately out. Not much of it gets to the components, and even less of their hot air gets out.
Sufficient_Phone_242@reddit (OP)
I think I misunderstood airflow, so you can be negative. But how do I get airflow with 2 sides (top and side)? I can't put any fans in the bottom. I ordered another fan for the rear; should all my fans push air in then?
Anbeeld@reddit
The thing is your entire top-right corner (as per image) works towards pulling in air and getting it out immediately. The very top right corner is probably the best illustration, just look at it, it's in and out.
This starves both CPU and GPU of fresh cold air that gets partially thrown away before it can reach the components.
What I would do in this case is simulate the classic front-to-back, adjusted to the side air pull.
So you remove 2 top fans, leaving only 1 in the left-top corner. The front 2 top fans mostly steal fresh air right after it gets in, while the rear one actually exhausts hot air that collects in this vertical rear corner after the CPU and GPU throw it off. GPU hotspots are located towards the mid and rear too, not in the front.
Ideally you should also cover the grill left in place of these 2 top fans, so the air is guided towards components after intake and not partially leaked out through top grill, but this can get ugly so your call.
Then you add one exhaust fan to the rear, obviously.
Result: 3 intake fans in the front on the side, 2 exhaust fans in the top-rear corner (1 on top, 1 on rear) where most of the hot air is located. Classic front-to-back style flow, positive pressure, air actually moves through components and out. Additionally, the bottom GPU pulls some air from the bottom grill too.
Sufficient_Phone_242@reddit (OP)
Thanks for the lengthy explanation !
Makes complete sense. Yeah, I think the exhausts are the side fans in my case. I will switch them around and put the exhausts in the back. Can't fit any on the bottom. Might try to find something to hide the grill like you said; it might be ugly but the inside wouldn't show, maybe some white metal sheet magnetized on the top.
Doing it right now
Anbeeld@reddit
On the bottom, the GPU itself is staring right into the grill with its fans, so it already does the intake job there, as long as you remove the dust from underneath the PC from time to time.
Sufficient_Phone_242@reddit (OP)
Done my man ! Need to find something to hide the top right
Anbeeld@reddit
Any improvements measurable?
Sufficient_Phone_242@reddit (OP)
Yeah! I think the CPU really went down 6°C, amazing.
Anbeeld@reddit
Wait I just realized I completely misjudged your current setup lol, in terms of what intakes and what exhausts. So disregard the analysis of the current setup completely. But the final optimal setup is probably the same as I described anyways.
Anbeeld@reddit
Here's a highly precise technical illustration of how you can improve the airflow.
Anbeeld@reddit
Here's a glorious illustration of current issues, exaggerated for higher art value.
wgaca2@reddit
i use opencode with agents, skills and magic context on 2x3090 with 196k context and qwen 3.6 27b q8 mtp. I am pretty happy with how it performs
sickmartian@reddit
you should be able to get the 262k context with 2x3090s on the q8 mtp, did you try it out? maybe found some issues when the context got too high and fell back to 196k?
free-interpreter@reddit
Can you point me to a config that gives that?
sickmartian@reddit
sure, this is close to what I use:
free-interpreter@reddit
Thank you sir, will try that at home
wgaca2@reddit
depends on the batch settings and how much leftover you want. Since i am using it in windows i want some headroom left.
Sufficient_Phone_242@reddit (OP)
Nice, same setup. What's a "magic" context? Did you add web search, or is it good enough to know the latest packages and frameworks?
wgaca2@reddit
magic context is like a persistent memory plugin that compacts the context and can recall when needed. I have websearch enabled
__JockY__@reddit
Check out Zed instead of VS Code - the AI integrations are great, the interface is way less cluttered. If that's your thing, Zed might rock for you.
Get into Skills now. Right away. I wish I'd done it sooner, they change everything, you may not even need MCP at all. Watch one of those 5 min youtube videos or, better, ask an agentic cli how to install a skill.
Speaking of which: agentic cli. You don't need chat. You don't need web interfaces. An agentic cli is chat and interface.
Jump into one of those and don't even bother with a chat interface. Straight into the deep end. You'll never look back.
That's enough to both accelerate you and keep you busy for a few days. Have fun!
boredquince@reddit
are the top fans pulling air in?? any particular reason why?
Sufficient_Phone_242@reddit (OP)
Cause I don't know what I'm doing, I just wanted airflow. Should all 6 fans be pushing in? I ordered one other fan for the rear, maybe that one could pull in?
boredquince@reddit
you want positive air pressure to reduce dust buildup. more air in than out.
top/back out, bottom/side/front in (with filters)
if I were u, I'd:
- flip (OUT) the top ones
- move 1 of the 3 top to the back, the one near the front, and OUT as well
- flip all 3 side fans IN
- maybe add a bottom fan IN if possible. should be, I can see the grill
this would create a 4 in, 3 out flow. positive pressure. if u can't add another fan, reduce the speed of the OUT fans or increase the speed of the inflow fans.
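The fan counting above can be sketched as a tiny calculation. This is a simplification that assumes identical fans whose airflow scales with a speed fraction between 0.0 and 1.0; it ignores filters, grills, and the GPUs' own fans.

```python
def pressure_balance(intake_speeds, exhaust_speeds):
    """Classify case pressure from per-fan speed fractions (0.0 to 1.0)."""
    net = sum(intake_speeds) - sum(exhaust_speeds)
    if net > 0:
        return "positive"
    if net < 0:
        return "negative"
    return "neutral"

# 4 intake and 3 exhaust fans, all at full speed.
print(pressure_balance([1.0] * 4, [1.0] * 3))   # positive

# Without the extra bottom fan: 3 in, 3 out at full speed.
print(pressure_balance([1.0] * 3, [1.0] * 3))   # neutral

# Same 3 in, 3 out, but with the exhaust fans slowed to 70%.
print(pressure_balance([1.0] * 3, [0.7] * 3))   # positive
```

The last case is the fallback suggested in the comment: if another intake fan will not fit, slowing the exhaust fans tips the balance the same way.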
Sufficient_Phone_242@reddit (OP)
Just changed and placed them like you and another redditor said! I'm missing about 0.5 inch of width for the bottom fan to fit :(
boredquince@reddit
what about a smaller fan?
Jipok_@reddit
Did you install copper shims on the VRAM under the backplate? If not, you should limit the power draw to under 300W, otherwise you risk frying the card under continuous load. The memory runs hot, it's on the back of the PCB, and you currently have no cooling for it.
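For reference, a power cap like the one suggested above is usually set with nvidia-smi's `-pl` option. The sketch below only builds and prints the commands; the 280W value and the GPU indices 0 and 1 are illustrative choices, not settings from this thread. Actually applying a limit needs admin rights, and the limit typically has to be reapplied after a reboot.

```python
import subprocess

def power_limit_command(gpu_index, watts):
    """Build the nvidia-smi call that caps one GPU's board power."""
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]

def apply_power_limit(gpu_index, watts):
    """Run the command; needs admin rights and an NVIDIA driver installed."""
    subprocess.run(power_limit_command(gpu_index, watts), check=True)

if __name__ == "__main__":
    # Print the commands for two cards instead of running them.
    for index in (0, 1):
        print(" ".join(power_limit_command(index, 280)))
```

Call `apply_power_limit(0, 280)` from an elevated shell to actually set the cap on the first card.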
tuura032@reddit
I have the same GPUs. Smaller version (Lian Li) of this case.
I am tempted to put it all into a bigger case... but maybe just using 2 PSUs and leaving the 2nd GPU next to my case will suffice. I just can't get over that 50% of the volume of the case is in the wrong spot lol.
nihsett@reddit
How much approx did it set you back? Beautiful setup but those feel a little too close for sufficient airflow.
Sufficient_Phone_242@reddit (OP)
Yeah, going to monitor temperatures. I'm in CAD dollars so everything is ballooned compared to USD. I paid $80 for the case, so worst case I change to a mining rig (cheap, $80 also).
Maybe about 3.6k all-in. Just the RTX 3090s are in the 1k-1.3k range each, and I bought a near mint one. Hard to find good ones that weren't mined on.
LongDistanceRope@reddit
is that a regular ATX motherboard or E-ATX? I was wondering how two 3-slot cards would fit.
On an ATX board there are a bunch of connectors on the bottom: fan headers, USB, power button etc. Do they fit under the second GPU?
Sufficient_Phone_242@reddit (OP)
ATX, the B650 Eagle AX. I got lucky that one of the slots matched; it's the second-to-last PCIe slot.
I would've measured the length of the bottom and gotten a case that's taller on the bottom (they were all taller on top). The USB cables are all good; the wires come more from the back so it's not so bad.
Sufficient_Pie_7912@reddit
Hello, brother.
Adventurous-Paper566@reddit
What are both cards' inference temperatures with a model fully offloaded to GPU?
Isn't the lower card running too hot?
aeroumbria@reddit
I think that is a good gap. I have tried two cards with no gap before, and it was pretty much impossible unless you seriously limit the voltage. Regular gaming card if you can leave a gap, otherwise blower only, unless undervolt black magic.
Sufficient_Phone_242@reddit (OP)
70-80°C? They are undervolted too, so it hasn't been a problem. I have only been running 1-2 hour sessions though, I didn't try a full day. My next step would be 4 GPUs and a mining rig, but I will try to optimize my 2 and use them enough to justify the purchase first.
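For longer sessions, temperatures like these can be logged with nvidia-smi's CSV query mode. The sketch below is a minimal example: the query fields are real nvidia-smi fields, while the sample readings and the 80°C "hot" threshold are made up for illustration.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=index,temperature.gpu,power.draw",
    "--format=csv,noheader,nounits",
]

def parse_gpu_stats(csv_text):
    """Turn nvidia-smi CSV rows into (index, temp_c, power_w) tuples."""
    stats = []
    for line in csv_text.strip().splitlines():
        index, temp, power = [field.strip() for field in line.split(",")]
        stats.append((int(index), int(temp), float(power)))
    return stats

def read_gpu_stats():
    """Query the driver once; requires nvidia-smi on PATH."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_stats(result.stdout)

# Sample text in the shape nvidia-smi prints (made-up readings).
sample = "0, 72, 248.31\n1, 79, 251.07\n"
for index, temp, power in parse_gpu_stats(sample):
    flag = "hot" if temp >= 80 else "ok"  # threshold is an arbitrary example
    print(index, temp, power, flag)
```

On a machine with the driver installed, calling `read_gpu_stats()` in a loop with a sleep gives a simple log for a full-day run.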
hyperfiled@reddit
right. at that point i don't think i'd enclose it all in a case
Chris92991@reddit
How’s it run? Significant gains with gaming? How about large language models?
Sufficient_Phone_242@reddit (OP)
No gain for gaming; only 1 GPU is connected so it runs like a single RTX 3090. SLI doesn't exist anymore and games don't support SLI anyway. For inference, though, I get the full 48GB VRAM and 256k context with qwen3.6 27b. I didn't optimize token speed, getting 30 I think.
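A back-of-the-envelope estimate shows why precision matters on 48GB. The sketch below only sizes the weights of a 27B-parameter model. It ignores quantization metadata (real q8 files are somewhat larger) and the KV cache, whose size depends on architecture details not given in this thread.

```python
def weight_gib(params_billion, bits_per_weight):
    """Approximate weight size in GiB, ignoring quantization metadata."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30

# 27B parameters at the two precisions mentioned in the thread.
for label, bits in [("bf16", 16), ("q8", 8)]:
    print(f"{label}: {weight_gib(27, bits):.1f} GiB")
```

Under these assumptions bf16 weights alone come to about 50 GiB, more than two 24GB cards hold, while q8 weights take roughly 25 GiB and leave the rest for context. That is consistent with the bf16 run mentioned earlier being on a four-card node.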
Blues520@reddit
Clean!
yes_i_tried_google@reddit
Nice but probably a dumb question…. where does the PSU live? I want to do something similar with 3090ti
TurnOffAutoCorrect@reddit
As mentioned by ABLPHA, the PSU is in its own compartment behind the motherboard...
https://i.vgy.me/0w4UuB.png
Sufficient_Phone_242@reddit (OP)
It's a good question. I didn't know either, because I hadn't built a PC in 10+ years, so the case options are nice. It's wider, to put the PSU and cables behind in a different compartment (btw it's an Asrock Pg 1600g). I suggest measuring the space between the PCIe slots and the GPU cards. I'm lucky I had 3 PCIe slots to fit the 2nd RTX somewhere.
armory case
ABLPHA@reddit
Looks like an O11D case, which means the PSU is in the second chamber behind the mobo
jacek2023@reddit
Without an open frame there will always be some noise.