They tried to make me go to rehab. I said no no no…
Posted by Key-Currency1242@reddit | LocalLLaMA | View on Reddit | 138 comments
segmond@reddit
GPUs 3 and 6 have 0 loaded on them, yet draw over 100W. Check your stuff out ... here's mine fully loaded at idle
JustSayin_thatuknow@reddit
Came here to say exactly this, no. 3 and 6 don’t have any load yet are drawing a lot of power, worth investigating…
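For anyone who wants to investigate this on their own rig, a quick nvidia-smi query shows per-card state at a glance (a sketch using the standard documented --query-gpu fields):

```bash
# Performance state, draw, and VRAM use per GPU.
# Truly idle cards should sit in P8 at low double-digit watts;
# P0/P2 with zero memory used is the "stuck awake" symptom above.
nvidia-smi --query-gpu=index,pstate,power.draw,memory.used --format=csv
```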
Cupakov@reddit
a light mode terminal, jesus christ dude
gawwagool@reddit
Some people have trouble reading white text on a black background, like me. Short blocks of 2-3 sentences are okay, but anything beyond that gets very hard to read. You can google astigmatism if you're interested.
Zc5Gwu@reddit
I use light mode on my laptop because it's more readable in bright sunlight and adverse conditions. I don't use black on white though, I use solarized light.
relmny@reddit
Or simply prefer black over white...
fragment_me@reddit
Some people do, but the vast majority of people see better when black text is laid on a white background.
Moist-Length1766@reddit
because the black terminal doesn't emit the sun directly onto your face
Ok-Education1394@reddit
That's true power consumption!
EthanMiner@reddit
Normally a bad riser cable for me when this happens.
dzedaj@reddit
which motherboard and risers do you use?
Grdosjek@reddit
I think a hacker hacked you and color-inverted your image.
year2039nuclearwar@reddit
He’s running it on windows
Turbulent_Pin7635@reddit
Oh boy! For sure the rig is faster, but my M3U at least doesn't drain 150W at idle. I mean, the thing has been running an 85% CPU + 360 GB RAM job for more than 8 hours now, and it's silent as the dead and barely warmer than a coma patient. Seeing this shows me how much I love my Katinka.
Ok-Internal9317@reddit
How do you set it to run at P8?
Eelroots@reddit
Ask an AI to monitor them.
Borkato@reddit
What do you run with this?
KalonLabs@reddit
Everything
FullOf_Bad_Ideas@reddit
Nice, I have a similar setup, but with 8x 3090 Ti - https://pixeldrain.com/u/aSYhykGP
what models do you run? I've found Qwen 3.5 397B exl3 ~3.5bpw to be the best pick for it right now, it's smart and pretty fast.
FalconX88@reddit
We got some cheap 3080s and now have two nodes with 8 GPUs each. Not as good but still fun to use.
dzedaj@reddit
which motherboard and risers do you use?
FullOf_Bad_Ideas@reddit
here's my config
X399 Taichi, TR 1920X, 3 sticks of 32GB RAM, three 1600/1650W power supplies, 2 add2psu SATA power adapters, a mining rig frame that holds up to 12 GPUs (6 in the top row and 6 in the bottom), and a single 500GB SSD for now. PCIe risers and two bifurcation boards going from PCIe 3.0 x16 to 4x x4. 2 GPUs are on PCIe 3.0 x8 and 6 are on PCIe 3.0 x4.
Risers - I bought a lot of them; some worked, some didn't or broke, and I'm not sure which are installed where right now. Most risers are PCIe 3.0 x16 (despite using only 4-8 lanes), 1 riser in use is 3.0 x4, and I have a few spares and a few broken ones. The majority are unbranded, one is from Thermaltake, 1 in use is ADT-Link, and 2 of the 3 spares are ADT-Link.
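For the curious, the negotiated link per card is queryable, so you can confirm a riser really runs at the width you wired (a sketch using standard nvidia-smi query fields):

```bash
# Current PCIe generation and lane width per GPU; a flaky riser
# often shows up here as a narrower or slower link than expected.
nvidia-smi --query-gpu=index,pcie.link.gen.current,pcie.link.width.current --format=csv
```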
camwasrule@reddit
Looking at this hardware setup myself, or similar. What kind of prompt eval speeds do you get with the big boy models? I currently have 4x GPUs but want to upgrade to the Taichi for the 8x or 16x lanes. I'm open to grabbing another 4x 3090.
FullOf_Bad_Ideas@reddit
Here are some metrics from running the Qwen 3.5 397B 3.65bpw quant in CC today; I think it was 131k ctx with a 6.5-bit cache.
and later
That's just what I have open in console history right now; I've been using Opus for the last few hours while downloading a different quant (3.51bpw, where I can do an 8.5-bit cache and 262k ctx at the same quality and speeds).
Which Taichi? The X399 is just PCIe 3.0, so it won't be super quick, but it's definitely a cheap upgrade to make, since the platform is almost 10 years old now. I was planning to buy just 4 GPUs but I moved too fast and bought 8 in a few weeks lol. Good thing I did, they're harder to source right now, and if I wanted to sell them I think I'd be looking at a 20% profit. If I'd known I'd be getting 8 GPUs I'd have gotten a better motherboard. It works, it's OK, but I think I'm still limited by PCIe in a lot of places, like training.
r0cketio@reddit
At this point why not just bite the bullet and get an RTX 6000 Pro? It's got as much memory as 4 of those cards and leaves you room for expansion, plus consumes way less power per token.
thamind2020@reddit
I want to make love with your AI
DrVonSinistro@reddit
Switch WDDM --> TCC.
fizzy1242@reddit
No TCC on 3090.
DrVonSinistro@reddit
I forgot about this... I guess my RTX A2000 is a «business» GPU, which is why I was able to set it to TCC like my P40s.
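For cards that do expose TCC (workstation/datacenter parts, not GeForce), the switch is roughly this, from an elevated Windows shell (a sketch; a reboot may be needed before it takes effect):

```bash
# 1 = TCC, 0 = WDDM; only accepted on supported GPUs
nvidia-smi -i 0 -dm 1
```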
SnowyOwl72@reddit
I hate Nvidia for removing NVLink from consumer GPUs. What a waste.
Ok_Try_877@reddit
Any idea why number 6 is at 111W with nothing in memory and no volatile-GPU-Util?
TechnoByte_@reddit
Because of Windows
Ok_Try_877@reddit
It uses 0 MB of VRAM... Windows, even with nothing installed or running, takes a few hundred MB, and mine right now, with browsers and a few apps running, is using over 2GB
Zidrewndacht@reddit
Not always:
As long as the cards aren't driving DWM, they can have zero VRAM usage when unused. This is with a lot of stuff running on the desktop, but the displays are connected to the iGPU (and the browser and other apps are hardware-accelerated on the iGPU/system RAM).
Ok_Try_877@reddit
Sure, I know they can have 0 VRAM, but the guy above said it's probably because of Windows... They can't be running Windows and using 0 VRAM
Zidrewndacht@reddit
Well, they can, my screenshot above shows two 3090s with zero VRAM usage each in Windows as well.
Ok_Try_877@reddit
I mean it being the main display and accelerator for Windows. He implied it was using 111W because it was running Windows. I think you stated you're running Windows on your iGPU?
Zidrewndacht@reddit
Yeah, OP's strange power usage is something else. The cards using >100W with zero VRAM usage are for sure not cards driving any display/rendering the desktop (as they're guaranteed to be unused at that point). They're just not sleeping (at P8 state) as they should.
Oh, by "running Windows" I thought you meant "having Windows as the OS". Yes, in my screenshot the machine is "running Windows" but displays are driven by the iGPU. I just wanted to point out that Windows doesn't "waste" VRAM on unused cards by itself (as I've seen people believe they'd be losing available VRAM just by using Windows).
Ok_Try_877@reddit
I used to have a 3090 and 4090 in Windows 11, but used one for gaming/display, then used both when I needed the VRAM for LLMs.
I think you are right; it's got trapped in that state and is burning power without doing anything.
ionizing@reddit
To confirm: as I understand it, the CUDA kernel alone takes ~439MB on my 3090 in Linux. So dedicated to inference, one should expect ~0.439GB of usage showing while it's just sitting there, with nothing else even running on the card.
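A rough way to see that baseline yourself (a sketch; assumes PyTorch is installed, and the exact overhead varies by driver and CUDA version):

```bash
nvidia-smi --query-gpu=memory.used --format=csv   # baseline, should be ~0 MiB
# Create a CUDA context plus one tiny tensor, and keep it alive for 30 s:
python3 -c "import torch, time; torch.ones(1).cuda(); time.sleep(30)" &
sleep 10
nvidia-smi --query-gpu=memory.used --format=csv   # now shows the context overhead
```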
a_beautiful_rhind@reddit
It's like nvidia-persistenced forgot about some cards.
miki4242@reddit
I think you mean the Windows equivalent of that?
Ok_Try_877@reddit
Yeah.. feels like some kind of bug/crash... But you wouldn't want a couple of cards unused sitting at that wattage 24/7... $$$$
Might be time for a reboot.
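If the stuck-awake theory is right, re-asserting persistence mode is a cheap first thing to try on Linux before a reboot (a sketch; the Windows boxes discussed above would need a driver restart instead):

```bash
# Keep the driver initialized so idle cards can settle into P8 (needs root)
sudo nvidia-smi -pm 1
# Or run the daemon, which is the currently recommended route:
sudo nvidia-persistenced
```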
Voxandr@reddit
He is beginning to believe!!!!
semangeIof@reddit
Do your breakers trip when they're under load? How many PCIe lanes do they get each? 1?
fragment_me@reddit
If I'm spending 8K on GPUs I'm at least getting an electrician to add a 240V circuit for these lol.
EthanMiner@reddit
I have 8x 3090s, and I do have a 240V circuit I am running them on. I think the whole thing pulls a max of 2500W under load, including CPU and maybe 10 fans (might need to double-check). I believe the most the outlet can handle is 3750W because of the adapter I have on it to use normal outlets on multiple PSUs. This would definitely need to be power-limited to run on a normal outlet.
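Sanity-checking that headroom with the figures quoted (a sketch using the common 80% continuous-load rule of thumb):

```latex
0.8 \times 3750\,\mathrm{W} = 3000\,\mathrm{W} > 2500\,\mathrm{W}\ \text{(peak measured draw)}
\qquad
I = \frac{2500\,\mathrm{W}}{240\,\mathrm{V}} \approx 10.4\,\mathrm{A}
```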
Havage@reddit
Jesus, that's like an AC worth of power draw.
Medium_Chemist_4032@reddit
With pipeline parallel/layer splits in runners, only prompt processing goes full blast and draws near the power limit (I cap at 200W per GPU). During decoding, most GPUs sit around 60W on average. That's an example from ik_llama on 4 GPUs.
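A cap like that is one nvidia-smi call per card (a sketch; the 200 W figure mirrors the comment above, it needs root, and each model has its own allowed -pl range):

```bash
# Cap GPUs 0-3 at 200 W each (illustrative values)
for i in 0 1 2 3; do
  sudo nvidia-smi -i "$i" -pl 200
done
```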
DarkSideOfGrogu@reddit
Probably need one of those too to keep the damn room from burning up.
FullOf_Bad_Ideas@reddit
I have a similar setup, 8x 3090 Ti.
I was doing a few week-long training runs recently. 14B tokens pushed through a 4B model on each run. ~2200W constant usage for 100-120 hours.
It was doing the heating for my apartment; the room was 35C if I left it closed, and it was pretty loud. Through a thermal camera I could see the electric cables in the walls being a bit heated up, and the walls the hot air was directed at had hotspots. You couldn't see the heat from outside the building with the thermal camera though, I checked that out of curiosity.
I had windows open most of the time, and my apartment was 26-28C.
I am not sure yet how it'll work out for me during summer. I use portable AC during summer, since I can't easily mount proper AC on the building's walls.
AVX_Instructor@reddit
Just undervolt your cards or set slightly lower power limits and you get much lower power consumption and noise levels.
FullOf_Bad_Ideas@reddit
This is already with undervolting and power limits lowered as far as they can go without clocks dropping much! I tweaked this per-GPU to make sure I was still getting good MFU - the last one or two GPUs needed more power, which boosted throughput a lot compared to clocking all GPUs the same, due to the pipeline parallel setup.
TGP on them is 450/480W and 3090 Tis have micro-spikes to 800W. Micro-spikes are unlikely to overlap in time between GPUs, I think.
If all of them ran at max power I'd be looking at wall plug usage of 4300W.
I have 16A 230V breakers.
AVX_Instructor@reddit
I would still recommend trying to sacrifice GPU clock, because the 3090's bottleneck is mainly memory bandwidth; core frequency is not as important, and lowering it lets you significantly reduce power consumption. I even do this in gaming scenarios where GPU clock does matter. Simply lowering the GPU frequency by 10-15% gives you a 30-40% reduction in power consumption (in theory).
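On Linux that frequency cut can be done with a clock lock rather than a power limit (a sketch; the band values are illustrative, and memory clocks are left untouched):

```bash
# Pin GPU 0's graphics clock into a lower band
sudo nvidia-smi -i 0 -lgc 210,1440
# Revert to default clock management:
sudo nvidia-smi -i 0 -rgc
```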
nethcadashshmokh@reddit
100% worth trying.
1-_-0-_-1@reddit
Gaming in the Winter, with my relatively basic setup and the vents closed, my room still increases ~5°F lol. Can only imagine.
l_dang@reddit
Funny story: a place I worked at before built their server room anew. Got an electrician, HVAC, certifications, everything. Everything went well for the first 4 months. Then the room kept overheating during summer. Turns out they did their verification work in December…
Beginning-Window-115@reddit
yeah ideally you have solar 💀
EthanMiner@reddit
There is an AC in the room, no solar, but there is an exhaust vent too. Tbh it only runs when I use it for work on sensitive data, so maybe 10/15 queries a day. The heat isn’t too bad.
CautiousJunket5332@reddit
Holy shit
semangeIof@reddit
I feel like if I'm spending 8k on GPUs, a consumer card over half a decade old isn't gonna be a part of the pile 😭
Especially now: we have the Arc B70s, and even if you hate SYCL and want faster Vulkan, we have AMD's offering at $250 more while being way more power efficient than a 3090.
At this point CUDA is less and less necessary, and if it is relevant to you, you should shell out for something newer than Ampere. 3090s are old horses.
m31317015@reddit
And yet 3090s are still cheaper than B70s and R9700s
Baldur-Norddahl@reddit
B70 may be cheaper if you factor in power.
relmny@reddit
What about factoring memory bandwidth?
Also, you can "easily" underpower a 3090.
Baldur-Norddahl@reddit
The 3090 has 50% more bandwidth but B70 has 50% more compute. Not a clear case, at least on paper. In practice I believe 3090 is significantly faster due to better software support.
However B70 has 32 GB VRAM vs 24 GB VRAM on 3090. Faster doesn't matter if you can't load the model.
m31317015@reddit
Yes, but you can power limit easily on Linux
54id56f34@reddit
Where are you finding sub-$950 3090s? Please let me know.
m31317015@reddit
HK, 5500-6000 HKD per card (USD 700-770)
stoppableDissolution@reddit
Around $800 in Poland
Desm0nt@reddit
Belarus =) 700-850$ for used Palit/Afox/Zotac =)
One-Macaron6752@reddit
Oh, I love these comments where OP is only trying to convince himself... 😊
Capable_Site_2891@reddit
The real contenders are 5090s and 3090s. 5090s because NVFP4 is a step change, but you'd get two for the price of 8 3090s.
No_Afternoon_4260@reddit
Those 3090s are 1k now?
fragment_me@reddit
Yes.
Clear-Ad-9312@reddit
and an uninterruptible power supply unit, no way am I hooking it up straight to the wall
NinjaOk2970@reddit
Also nvlink
ManufacturerHuman937@reddit
I am 1/8th of the man you are.
Baldur-Norddahl@reddit
This made me go investigate whether we finally have the RTX 3090 beat by a new GPU. Alas, not quite. It appears the RTX 3090 is about 900 to 1000 euro on eBay, while the B70 is 1100 including VAT.
If you buy it as a company to save VAT and we factor in that B70 has 32 GB VRAM, it is 27.3 euro/GB VRAM for B70 compared to 38 euro/GB for RTX 3090.
I won't bother doing the calculation per GB/s memory bandwidth because RTX 3090 is like three times faster. I also don't know how it scales with compute.
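Reconstructing the arithmetic behind those per-GB figures (a sketch; the ex-VAT B70 price is inferred from the 27.3 number):

```latex
\frac{\approx 874\ \text{euro}}{32\ \text{GB}} \approx 27.3\ \text{euro/GB (B70, ex VAT)}
\qquad
\frac{\approx 912\ \text{euro}}{24\ \text{GB}} = 38\ \text{euro/GB (RTX 3090)}
```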
TechnoByte_@reddit
I paid 650 euro for a 3090
year2039nuclearwar@reddit
I just bought 2 for about £900 each, so yes, you definitely got in early
Ok-Measurement-1575@reddit
Same. Those days are gone, it seems.
I'm happy with 4 anyway. Honest.
Baldur-Norddahl@reddit
I am just saying what I see on eBay right this instant. Of course you can get a better deal if you wait for it, or keep bidding until the day you get a lucky one.
But if you are going to get 8x like the OP, you might also need to factor in the risk of getting bad hardware or outright scams. There are clear advantages to buying new here.
FullOf_Bad_Ideas@reddit
3090 is 920 GB/s and B70 is 608 GB/s, it's not 3x faster
B70 should have similar compute to AMD R9700 AI
3090 has MAMF TFLOPS of about 80. R9700 AI has 138 TFLOPs. B70 should also be 130-140 TFLOPs. BF16 dense.
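Those bandwidth numbers translate into a rough decode ceiling, since token generation is memory-bound (a sketch; assumes every active weight is read once per token, with a hypothetical 20 GB quant as the example):

```latex
t_{\max} \approx \frac{B_{\text{mem}}}{S_{\text{weights}}}:\qquad
\frac{920\ \text{GB/s}}{20\ \text{GB}} \approx 46\ \text{tok/s (3090)},\qquad
\frac{608\ \text{GB/s}}{20\ \text{GB}} \approx 30\ \text{tok/s (B70)}
```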
fragment_me@reddit
On paper the B70 is interesting but the software just isn’t there yet. I think the R9700 is a great buy since Vulkan benchmarks show it’s maturing.
Baldur-Norddahl@reddit
Yes, sorry, I had 5090 stats locked in my mind when I wrote that.
Ok-Measurement-1575@reddit
Some of the benchmarks suggest the 3090 is more than 3x faster at TG, although it was nice to see the B70 beating a 3090 on PP, albeit on a 4B model.
Weird-Consequence366@reddit
How many times a day do the debt collectors call?
skrshawk@reddit
And if you're a debt collector press zero, cause that's what you're gonna get.
Weird-Consequence366@reddit
This is the way
WolpertingerRumo@reddit
Sir, you have a problem. Give me your RSA keys, you're in no condition to run inference.
Key-Currency1242@reddit (OP)
New to Reddit. Didn't expect such a response. This is an energetic and highly engaged crowd! I'll try to answer some of your questions. I had to restructure things to get it all working, so it was still a bit messy when I grabbed that shot. Briefly: I wandered into all of this. If I were starting again, I might have gone for bigger GPUs instead of this 3090 party. Then again, maybe not - the 3090 is kind of a sweet spot in many ways. Yes, I throttle everything, though I've been running at 285W. Interesting that many people suggest 220; I'll give that a try. I manage power by not managing power: I have a high-wattage hookup in my home office, a decent solar array, and a couple of Powerwall batteries. Thanks!
andy2na@reddit
You should really set a power limit, max clock, and clock offset to essentially undervolt each 3090. But I guess you probably don't care about electricity costs haha
Ok-Measurement-1575@reddit
There's no need unless you're running unique inference engines on each card.
They'll never see full power over PCIe.
aschroeder91@reddit
It's crazy that you're running them all at 350 watts; I always set my 3090s to 220 to not blow my line lol.
Have you had any luck running distributed large video models? I have a handful of 3090s too that could fit some of the larger video models VRAM-wise, but I haven't come across good tooling for distributed generation.
arthor@reddit
Research power-limiting your GPUs.
thisoilguy@reddit
Amazing how little power they take.
funpirates@reddit
Which chassis can hold 8x 3090?
FullOf_Bad_Ideas@reddit
not OP
I have 8x 3090 Ti and I hold them in this rig - https://pixeldrain.com/u/G2YkGqaj
Here's what they look like loaded up, 6 GPUs in the top and 2 in the bottom - https://pixeldrain.com/u/aSYhykGP
Lorakszak@reddit
Winter has ended, how do you use all that heat?
overand@reddit
That's more than I paid for a 2004 Mazda Miata with under 25,000 miles - and based on how much joy I've gotten from that car (with how little maintenance), I'm gonna say:
If somehow, for some unfathomable reason, you have the choice between 8 RTX 3090 GPUs or a Mazda Miata, get the Mazda Miata. (Maybe if you're lucky you can get the Miata and a pair of 3090s, and you'll be able to run 70B dense models just fine!)
Will anyone ever be in that situation? Heck if I know.
TechnoByte_@reddit
Can the Mazda tell me how many 'R's there are in 'strawberry'? I don't think so
-dysangel-@reddit
is that an African strawberry or a European strawberry?
twiiik@reddit
2?
Jolly-Event7578@reddit
You sound like a 24B model 😅
readfreeh@reddit
How does that work? How many board lanes and extension cables do you have?
Fun_Librarian_7699@reddit
You should not use CUDA 13.2
https://www.reddit.com/r/unsloth/s/HLGmgJ3v1p
Pattinathar@reddit
8x 3090s is insane — 192GB VRAM total. You could run a 70B model fully loaded with room to spare, or even split a 120B across all 8. What are you planning to run on this? Multi-GPU inference or fine-tuning?
Medium_Chemist_4032@reddit
So what are you running?
MentalRegular5335@reddit
I am about to set up a build with 4 Intel Arc GPUs and I thought mine would be extreme for a regular guy with consumer hardware, but yours? 😂😂😂
ieatdownvotes4food@reddit
oh baby... now do nvidia-smi -pl 500 and get this party started
One-Macaron6752@reddit
That would be rather dumb and highly inefficient, helping extremely little with inference. I run a similar setup with LACT on Linux, where underclocking/undervolting does the magic.
ieatdownvotes4food@reddit
but I wanted to feel the heat from here.. ;)
Im_Still_Here12@reddit
I did this when I built LTC mining rigs back in the day. 8 rigs with 4 3070ti cards per rig for 32 GPUs in total. Was fun.
betam4x@reddit
I really need to get a 2-slot motherboard. Being able to use my 3090 alongside my 4090 would be super useful.
YourVelourFog@reddit
I don’t think I’ve seen a sign yet that reads “will give head for tokens”
robertpro01@reddit
Which mobo is that?
misha1350@reddit
The mothership
MentalRegular5335@reddit
😂😂😂
TheK0tYaRa@reddit
Was the rehab themed about using windows for the task?
CaptBrick@reddit
Was kind of jealous, then I saw you’re running windows and I puked in my mouth a little
Delyzr@reddit
Yup, the only thing I would send him to rehab for
ComplexType568@reddit
EXACTLY?? was about to ask that
popsumbong@reddit
pics of build?
ScoreUnique@reddit
Man, you should set the power limit to 230; you're wasting power on nothing.
lewd_peaches@reddit
Training is addicting, isn't it? What dataset are you planning to tackle next?
TechnoByte_@reddit
You sound like a LLM, you also seem to spam posts and comments promoting OpenClaw everywhere
Long_comment_san@reddit
Things we do for our internet waifus
Ok_Try_877@reddit
Those idle powers are painful unless it's winter :-) I used to have a 3090 and a 4090 in my Epyc server, but being on 24/7 and not used most of the time, it adds up! Also, in the summer, it was quite annoying adding to the heat in the room!
novanet-central@reddit
What motherboard do you use?
Visual_Synthesizer@reddit
a wise choice!
inthesearchof@reddit
Do you dream about rtx 6000 pros?
deepspace86@reddit
So this is why I can't find a 3090 to replace mine :(
SocietyTomorrow@reddit
I had to settle for a 10GB 3080 to hold me over until I financially recover from Uncle Sam's violation of my wallet. It would be nice to have something faster than my collection of Pascal Quadro cards and a V100.
PS. I know it's a waste of electricity, but I have a big solar surplus. I bit that bullet years ago.
celsowm@reddit
Jordan says: Stop! Get some help!
theSnoozeDoctor@reddit
What wattage do the cards pull when all loaded up?
NimbusFPV@reddit
who needs rehab when you can run rehab inference locally.