They tried to make me go to rehab. I said no no no…
Posted by Key-Currency1242@reddit | LocalLLaMA | View on Reddit | 138 comments
segmond@reddit
GPUs 3 and 6 have 0 loaded on them, yet draw over 100W. Check your stuff out ... here's mine fully loaded at idle
JustSayin_thatuknow@reddit
Came here to say exactly this, no. 3 and 6 don’t have any load yet are drawing a lot of power, worth investigating…
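For anyone who wants to investigate this on their own rig, a quick nvidia-smi query shows per-card state at a glance (a sketch using the standard documented --query-gpu fields):

```bash
# Performance state, draw, and VRAM use per GPU.
# Truly idle cards should sit in P8 at low double-digit watts;
# P0/P2 with zero memory used is the "stuck awake" symptom above.
nvidia-smi --query-gpu=index,pstate,power.draw,memory.used --format=csv
```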
Cupakov@reddit
a light mode terminal, jesus christ dude
gawwagool@reddit
Some people have trouble reading white text on a black background, like me. Short blocks of 2-3 sentences are okay, but anything beyond that gets very hard to read. You can google astigmatism if you're interested.
Zc5Gwu@reddit
I use light mode on my laptop because it's more readable in bright sunlight and adverse conditions. I don't use black on white though, I use solarized light.
relmny@reddit
Or simply prefer black over white...
fragment_me@reddit
Some people do, but the vast majority of people see better when black text is laid on a white background.
Moist-Length1766@reddit
because the black terminal doesn't emit the sun directly onto your face
Ok-Education1394@reddit
That's true power consumption!
EthanMiner@reddit
Normally a bad riser cable for me when this happens.
dzedaj@reddit
which motherboard and risers do you use?
Grdosjek@reddit
I think a hacker hacked you and color-inverted your image.
year2039nuclearwar@reddit
He’s running it on windows
Turbulent_Pin7635@reddit
Oh boy! For sure the rig is faster, but my M3U at least doesn't drain 150W at idle. I mean, the thing has been running an 85% CPU + 360 GB RAM job for more than 8 hours now, and it's silent as the dead and barely warmer than a coma patient. Seeing this shows me how much I love my Katinka.
Ok-Internal9317@reddit
How do you set it to run at P8?
Eelroots@reddit
Ask an AI to monitor them.
Borkato@reddit
What do you run with this?
KalonLabs@reddit
Everything
FullOf_Bad_Ideas@reddit
Nice, I have a similar setup, but with 8x 3090 Ti - https://pixeldrain.com/u/aSYhykGP
what models do you run? I've found Qwen 3.5 397B exl3 ~3.5bpw to be the best pick for it right now, it's smart and pretty fast.
FalconX88@reddit
We got some cheap 3080s and now have two nodes with 8 GPUs each. Not as good but still fun to use.
dzedaj@reddit
which motherboard and risers do you use?
FullOf_Bad_Ideas@reddit
here's my config
X399 Taichi, TR 1920X, 3 sticks of 32GB RAM, three 1600/1650W power supplies, 2 add2psu SATA power adapters, a mining rig frame that holds up to 12 GPUs (6 in the top row and 6 in the bottom), and a single 500GB SSD for now. PCIe risers and two bifurcation boards going from PCIe 3.0 x16 to 4x x4. 2 GPUs are on PCIe 3.0 x8 and 6 are on PCIe 3.0 x4.
Risers - I bought a lot of them; some worked, some didn't or broke, and I'm not sure which are installed where right now. Most risers are PCIe 3.0 x16 (despite using only 4-8 lanes), 1 riser in use is 3.0 x4, and I have a few spares and a few broken ones. The majority are unbranded, one is from Thermaltake, 1 in use is ADT-Link, and 2 of the 3 spares are ADT-Link.
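For the curious, the negotiated link per card is queryable, so you can confirm a riser really runs at the width you wired (a sketch using standard nvidia-smi query fields):

```bash
# Current PCIe generation and lane width per GPU; a flaky riser
# often shows up here as a narrower or slower link than expected.
nvidia-smi --query-gpu=index,pcie.link.gen.current,pcie.link.width.current --format=csv
```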
camwasrule@reddit
Looking at this hardware setup myself, or similar. What kind of prompt eval speeds do you get with the big boy models? I currently have 4x GPUs but want to upgrade to the Taichi for the 8x or 16x lanes. I'm open to grabbing another 4x 3090.
FullOf_Bad_Ideas@reddit
Here are some metrics from running the Qwen 3.5 397B 3.65bpw quant in CC today; I think it was 131k ctx with a 6.5-bit cache.
and later
That's just what I have open in console history right now; I've been using Opus for the last few hours while downloading a different quant (3.51bpw, where I can do an 8.5-bit cache and 262k ctx at the same quality and speeds).
Which Taichi? The X399 is just PCIe 3.0, so it won't be super quick, but it's definitely a cheap upgrade to make, since the platform is almost 10 years old now. I was planning to buy just 4 GPUs but I moved too fast and bought 8 in a few weeks lol. Good thing I did, they're harder to source right now, and if I wanted to sell them I think I'd be looking at a 20% profit. If I'd known I'd be getting 8 GPUs I'd have gotten a better motherboard. It works, it's OK, but I think I'm still limited by PCIe in a lot of places, like training.
r0cketio@reddit
At this point why not just bite the bullet and get an RTX 6000 Pro? It's got as much memory as 4 of those cards and leaves you room for expansion, plus consumes way less power per token.
thamind2020@reddit
I want to make love with your AI
DrVonSinistro@reddit
Switch WDDM --> TCC.
fizzy1242@reddit
No TCC on 3090.
DrVonSinistro@reddit
I forgot about this... I guess my RTX A2000 is a «business» GPU, which is why I was able to set it to TCC like my P40s.
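For cards that do expose TCC (workstation/datacenter parts, not GeForce), the switch is roughly this, from an elevated Windows shell (a sketch; a reboot may be needed before it takes effect):

```bash
# 1 = TCC, 0 = WDDM; only accepted on supported GPUs
nvidia-smi -i 0 -dm 1
```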
SnowyOwl72@reddit
I hate Nvidia for removing NVLink from consumer GPUs. What a waste.
Ok_Try_877@reddit
Any idea why number 6 is at 111W with nothing in memory and no volatile-GPU-Util?
TechnoByte_@reddit
Because of Windows
Ok_Try_877@reddit
It uses 0 MB of VRAM... Windows, even with nothing installed or running, takes a few hundred MB, and mine right now, with browsers and a few apps running, is using over 2GB
Zidrewndacht@reddit
Not always:
As long as the cards aren't driving DWM, they can have zero VRAM usage when unused. This is with a lot of stuff running on the desktop, but the displays are connected to the iGPU (and the browser and other apps are hardware-accelerated on the iGPU/system RAM).
Ok_Try_877@reddit
Sure, I know they can have 0 VRAM, but the guy above said it's probably because of Windows... They can't be running Windows and using 0 VRAM
Zidrewndacht@reddit
Well, they can, my screenshot above shows two 3090s with zero VRAM usage each in Windows as well.
Ok_Try_877@reddit
I mean it being the main display and accelerator for Windows. He implied it was using 111W because it was running Windows. I think you stated you're running Windows on your iGPU?
Zidrewndacht@reddit
Yeah, OP's strange power usage is something else. The cards using >100W with zero VRAM usage are for sure not cards driving any display/rendering the desktop (as they're guaranteed to be unused at that point). They're just not sleeping (at P8 state) as they should.
Oh, by "running Windows" I thought you meant "having Windows as the OS". Yes, in my screenshot the machine is "running Windows" but displays are driven by the iGPU. I just wanted to point out that Windows doesn't "waste" VRAM on unused cards by itself (as I've seen people believe they'd be losing available VRAM just by using Windows).
Ok_Try_877@reddit
I used to have a 3090 and 4090 in Windows 11, but used one for gaming/display, then used both when I needed the VRAM for LLMs.
I think you are right; it's got trapped in that state and is burning power without doing anything.
ionizing@reddit
To confirm: as I understand it, the CUDA kernel alone takes ~439MB on my 3090 in Linux. So dedicated to inference, one should expect ~0.439GB of usage showing while it's just sitting there, with nothing else even running on the card.
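A rough way to see that baseline yourself (a sketch; assumes PyTorch is installed, and the exact overhead varies by driver and CUDA version):

```bash
nvidia-smi --query-gpu=memory.used --format=csv   # baseline, should be ~0 MiB
# Create a CUDA context plus one tiny tensor, and keep it alive for 30 s:
python3 -c "import torch, time; torch.ones(1).cuda(); time.sleep(30)" &
sleep 10
nvidia-smi --query-gpu=memory.used --format=csv   # now shows the context overhead
```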
a_beautiful_rhind@reddit
It's like nvidia-persistenced forgot about some cards.
miki4242@reddit
I think you mean the Windows equivalent of that?
Ok_Try_877@reddit
Yeah.. feels like some kind of bug/crash... But you wouldn't want a couple of cards unused sitting at that wattage 24/7... $$$$
Might be time for a reboot.
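If the stuck-awake theory is right, re-asserting persistence mode is a cheap first thing to try on Linux before a reboot (a sketch; the Windows boxes discussed above would need a driver restart instead):

```bash
# Keep the driver initialized so idle cards can settle into P8 (needs root)
sudo nvidia-smi -pm 1
# Or run the daemon, which is the currently recommended route:
sudo nvidia-persistenced
```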
Voxandr@reddit
He is beginning to believe!!!!
semangeIof@reddit
Do your breakers trip when they're under load? How many PCIe lanes do they get each? 1?
fragment_me@reddit
If I'm spending 8K on GPUs I'm at least getting an electrician to add a 240V circuit for these lol.
EthanMiner@reddit
I have 8x 3090s, and I do have a 240V circuit I am running them on. I think the whole thing pulls a max of 2500W under load, including CPU and maybe 10 fans (might need to double-check). I believe the most the outlet can handle is 3750W because of the adapter I have on it to use normal outlets on multiple PSUs. This would definitely need to be power-limited to run on a normal outlet.
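Sanity-checking that headroom with the figures quoted (a sketch using the common 80% continuous-load rule of thumb):

```latex
0.8 \times 3750\,\mathrm{W} = 3000\,\mathrm{W} > 2500\,\mathrm{W}\ \text{(peak measured draw)}
\qquad
I = \frac{2500\,\mathrm{W}}{240\,\mathrm{V}} \approx 10.4\,\mathrm{A}
```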
Havage@reddit
Jesus, that's like an AC worth of power draw.
Medium_Chemist_4032@reddit
With pipeline parallel/layer splits in runners, only prompt processing goes full blast and draws near the power limit (I cap at 200W per GPU). During decoding, most GPUs sit around 60W on average. That's an example from ik_llama on 4 GPUs.
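A cap like that is one nvidia-smi call per card (a sketch; the 200 W figure mirrors the comment above, it needs root, and each model has its own allowed -pl range):

```bash
# Cap GPUs 0-3 at 200 W each (illustrative values)
for i in 0 1 2 3; do
  sudo nvidia-smi -i "$i" -pl 200
done
```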
DarkSideOfGrogu@reddit
Probably need one of those too to keep the damn room from burning up.
FullOf_Bad_Ideas@reddit
I have a similar setup, 8x 3090 Ti.
I was doing a few week-long training runs recently. 14B tokens pushed through a 4B model on each run. ~2200W constant usage for 100-120 hours.
It was doing the heating for my apartment; the room was 35C if I left it closed, and it was pretty loud. Through a thermal camera I could see the electric cables in the walls being a bit heated up, and the walls the hot air was directed at had hotspots. You couldn't see the heat from outside the building with the thermal camera though, I checked that out of curiosity.
I had windows open most of the time, and my apartment was 26-28C.
I am not sure yet how it'll work out for me during summer. I use portable AC during summer, since I can't easily mount proper AC on the building's walls.
AVX_Instructor@reddit
Just undervolt your cards or set slightly lower power limits and you get much lower power consumption and noise levels.
FullOf_Bad_Ideas@reddit
This is already with undervolting and power limits lowered as far as they can go without clocks dropping much! I tweaked this per-GPU to make sure I was still getting good MFU - the last one or two GPUs needed more power, which boosted throughput a lot compared to clocking all GPUs the same, due to the pipeline parallel setup.
TGP on them is 450/480W and 3090 Tis have micro-spikes to 800W. Micro-spikes are unlikely to overlap in time between GPUs, I think.
If all of them ran at max power I'd be looking at wall plug usage of 4300W.
I have 16A 230V breakers.
AVX_Instructor@reddit
I would still recommend trying to sacrifice GPU clock, because the 3090's bottleneck is mainly memory bandwidth; core frequency is not as important, and lowering it lets you significantly reduce power consumption. I even do this in gaming scenarios where GPU clock does matter. Simply lowering the GPU frequency by 10-15% gives you a 30-40% reduction in power consumption (in theory).
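On Linux that frequency cut can be done with a clock lock rather than a power limit (a sketch; the band values are illustrative, and memory clocks are left untouched):

```bash
# Pin GPU 0's graphics clock into a lower band
sudo nvidia-smi -i 0 -lgc 210,1440
# Revert to default clock management:
sudo nvidia-smi -i 0 -rgc
```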
nethcadashshmokh@reddit
100% worth trying.
1-_-0-_-1@reddit
Gaming in the Winter, with my relatively basic setup and the vents closed, my room still increases ~5°F lol. Can only imagine.
l_dang@reddit
Funny story: a place I worked at before built their server room anew. Got an electrician, HVAC, certifications, everything. Everything went well for the first 4 months. Then the room kept overheating during summer. Turns out they did their verification work in December…
Beginning-Window-115@reddit
yeah ideally you have solar 💀
EthanMiner@reddit
There is an AC in the room, no solar, but there is an exhaust vent too. Tbh it only runs when I use it for work on sensitive data, so maybe 10/15 queries a day. The heat isn’t too bad.
CautiousJunket5332@reddit
Holy shit
semangeIof@reddit
I feel like if I'm spending 8k on GPUs, a consumer card over half a decade old isn't gonna be a part of the pile 😭
Especially now: we have the Arc B70s, and even if you hate SYCL and want faster Vulkan, we have AMD's offering at $250 more while being way more power efficient than a 3090.
At this point CUDA is less and less necessary, and if it is relevant to you, you should shell out for something newer than Ampere. 3090s are old horses.
m31317015@reddit
And yet 3090s are still cheaper than B70s and R9700s
Baldur-Norddahl@reddit
B70 may be cheaper if you factor in power.
relmny@reddit
What about factoring memory bandwidth?
Also, you can "easily" underpower a 3090.
Baldur-Norddahl@reddit
The 3090 has 50% more bandwidth but B70 has 50% more compute. Not a clear case, at least on paper. In practice I believe 3090 is significantly faster due to better software support.
However B70 has 32 GB VRAM vs 24 GB VRAM on 3090. Faster doesn't matter if you can't load the model.
m31317015@reddit
Yes, but you can power limit easily on Linux
54id56f34@reddit
Where are you finding sub-$950 3090s? Please let me know.
m31317015@reddit
HK, 5500-6000 HKD per card (USD 700-770)
stoppableDissolution@reddit
Around $800 in Poland
Desm0nt@reddit
Belarus =) 700-850$ for used Palit/Afox/Zotac =)
One-Macaron6752@reddit
Oh, I love these comments where OP is only trying to convince himself... 😊
Capable_Site_2891@reddit
The real contenders are 5090s and 3090s. 5090s because NVFP4 is a step change, but you'd get two for the price of 8 3090s.
No_Afternoon_4260@reddit
Those 3090s are 1k now?
fragment_me@reddit
Yes.
Clear-Ad-9312@reddit
and an uninterruptible power supply unit, no way am I hooking it up straight to the wall
NinjaOk2970@reddit
Also nvlink
ManufacturerHuman937@reddit
I am 1/8th of the man you are.
Baldur-Norddahl@reddit
This made me go investigate whether we finally have the RTX 3090 beat by a new GPU. Alas, not quite. It appears the RTX 3090 is about 900 to 1000 euro on eBay, while the B70 is 1100 including VAT.
If you buy it as a company to save VAT and we factor in that B70 has 32 GB VRAM, it is 27.3 euro/GB VRAM for B70 compared to 38 euro/GB for RTX 3090.
I won't bother doing the calculation per GB/s memory bandwidth because RTX 3090 is like three times faster. I also don't know how it scales with compute.
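Reconstructing the arithmetic behind those per-GB figures (a sketch; the ex-VAT B70 price is inferred from the 27.3 number):

```latex
\frac{\approx 874\ \text{euro}}{32\ \text{GB}} \approx 27.3\ \text{euro/GB (B70, ex VAT)}
\qquad
\frac{\approx 912\ \text{euro}}{24\ \text{GB}} = 38\ \text{euro/GB (RTX 3090)}
```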
TechnoByte_@reddit
I paid 650 euro for a 3090
year2039nuclearwar@reddit
I just bought 2 for about £900 each, so yes, you definitely got in early
Ok-Measurement-1575@reddit
Same. Those days are gone, it seems.
I'm happy with 4 anyway. Honest.
Baldur-Norddahl@reddit
I am just saying what I see on eBay right this instant. Of course you can get a better deal if you wait for it, or keep bidding until the day you get a lucky one.
But if you are going to get 8x like the OP, you might also need to factor in the risk of getting bad hardware or outright scams. There are clear advantages to buying new here.
FullOf_Bad_Ideas@reddit
3090 is 920 GB/s and B70 is 608 GB/s, it's not 3x faster
B70 should have similar compute to AMD R9700 AI
3090 has MAMF TFLOPS of about 80. R9700 AI has 138 TFLOPs. B70 should also be 130-140 TFLOPs. BF16 dense.
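Those bandwidth numbers translate into a rough decode ceiling, since token generation is memory-bound (a sketch; assumes every active weight is read once per token, with a hypothetical 20 GB quant as the example):

```latex
t_{\max} \approx \frac{B_{\text{mem}}}{S_{\text{weights}}}:\qquad
\frac{920\ \text{GB/s}}{20\ \text{GB}} \approx 46\ \text{tok/s (3090)},\qquad
\frac{608\ \text{GB/s}}{20\ \text{GB}} \approx 30\ \text{tok/s (B70)}
```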
fragment_me@reddit
On paper the B70 is interesting but the software just isn’t there yet. I think the R9700 is a great buy since Vulkan benchmarks show it’s maturing.
Baldur-Norddahl@reddit
Yes, sorry, I had 5090 stats locked in my mind when I wrote that.
Ok-Measurement-1575@reddit
Some of the benchmarks suggest the 3090 is more than 3x faster at TG, although it was nice to see the B70 beating a 3090 on PP, albeit on a 4B model.
Weird-Consequence366@reddit
How many times a day do the debt collectors call?
skrshawk@reddit
And if you're a debt collector press zero, cause that's what you're gonna get.
Weird-Consequence366@reddit
This is the way
WolpertingerRumo@reddit
Sir, you have a problem. Give me your RSA keys, you're in no condition to run inference.
Key-Currency1242@reddit (OP)
New to Reddit. Didn't expect such a response. This is an energetic and highly engaged crowd! I'll try to answer some of your questions. I had to restructure things to get it all working, so it was still a bit messy when I grabbed that shot. Briefly: I wandered into all of this. If I were starting again, I might have gone for bigger GPUs instead of this 3090 party. Then again, maybe not - the 3090 is kind of a sweet spot in many ways. Yes, I throttle everything, though I've been running at 285W. Interesting that many people suggest 220; I'll give that a try. I manage power by not managing power: I have a high-wattage hookup in my home office, a decent solar array, and a couple of Powerwall batteries. Thanks!
andy2na@reddit
You should really set a power limit, max clock, and clock offset to essentially undervolt each 3090. But I guess you probably don't care about electricity costs haha
Ok-Measurement-1575@reddit
There's no need unless you're running unique inference engines on each card.
They'll never see full power over PCIe.
aschroeder91@reddit
It's crazy that you're running them all at 350 watts; I always set my 3090s to 220 to not blow my line lol.
Have you had any luck running distributed large video models? I have a handful of 3090s too that could fit some of the larger video models VRAM-wise, but I haven't come across good tooling for distributed generation.
arthor@reddit
Research power-limiting your GPUs.
thisoilguy@reddit
Amazing how little power they take.
funpirates@reddit
Which chassis can hold 8x 3090?
FullOf_Bad_Ideas@reddit
not OP
I have 8x 3090 Ti and I hold them in this rig - https://pixeldrain.com/u/G2YkGqaj
Here's what they look like loaded up, 6 GPUs in the top and 2 in the bottom - https://pixeldrain.com/u/aSYhykGP
Lorakszak@reddit
Winter has ended, how do you use all that heat?
overand@reddit
That's more than I paid for a 2004 Mazda Miata with under 25,000 miles - and based on how much joy I've gotten from that car (with how little maintenance), I'm gonna say:
If somehow, for some unfathomable reason, you have the choice between 8 RTX 3090 GPUs or a Mazda Miata, get the Mazda Miata. (Maybe if you're lucky you can get the Miata and a pair of 3090s, and you'll be able to run 70B dense models just fine!)
Will anyone ever be in that situation? Heck if I know.
TechnoByte_@reddit
Can the Mazda tell me how many 'R's there are in 'strawberry'? I don't think so
-dysangel-@reddit
is that an African strawberry or a European strawberry?
twiiik@reddit
2?
Jolly-Event7578@reddit
You sound like a 24B model 😅
readfreeh@reddit
How does that work? How many board lanes and extension cables do you have?
Fun_Librarian_7699@reddit
You should not use CUDA 13.2
https://www.reddit.com/r/unsloth/s/HLGmgJ3v1p
Pattinathar@reddit
8x 3090s is insane — 192GB VRAM total. You could run a 70B model fully loaded with room to spare, or even split a 120B across all 8. What are you planning to run on this? Multi-GPU inference or fine-tuning?
Medium_Chemist_4032@reddit
So what are you running?
MentalRegular5335@reddit
I am about to set up a build with 4 Intel Arc GPUs and I thought mine would be extreme for a regular guy with consumer hardware, but yours? 😂😂😂
ieatdownvotes4food@reddit
oh baby... now do nvidia-smi -pl 500 and get this party started
One-Macaron6752@reddit
That would be rather dumb and highly inefficient, helping extremely little with inference. I run a similar setup with LACT on Linux, where underclocking/undervolting does the magic.
ieatdownvotes4food@reddit
but I wanted to feel the heat from here.. ;)
Im_Still_Here12@reddit
I did this when I built LTC mining rigs back in the day. 8 rigs with 4 3070ti cards per rig for 32 GPUs in total. Was fun.
betam4x@reddit
I really need to get a 2-slot motherboard. Being able to use my 3090 alongside my 4090 would be super useful.
YourVelourFog@reddit
I don’t think I’ve seen a sign yet that reads “will give head for tokens”
robertpro01@reddit
Which mobo is that?
misha1350@reddit
The mothership
MentalRegular5335@reddit
😂😂😂
TheK0tYaRa@reddit
Was the rehab themed about using windows for the task?
CaptBrick@reddit
Was kind of jealous, then I saw you’re running windows and I puked in my mouth a little
Delyzr@reddit
Yup, the only thing I would send him to rehab for
ComplexType568@reddit
EXACTLY?? was about to ask that
popsumbong@reddit
pics of build?
ScoreUnique@reddit
Man, you should set the power limit to 230; you're wasting power on nothing.
lewd_peaches@reddit
Training is addicting, isn't it? What dataset are you planning to tackle next?
TechnoByte_@reddit
You sound like a LLM, you also seem to spam posts and comments promoting OpenClaw everywhere
Long_comment_san@reddit
Things we do for our internet waifus
Ok_Try_877@reddit
Those idle powers are painful unless it's winter :-) I used to have a 3090 and a 4090 in my Epyc server, but being on 24/7 and not used most of the time, it adds up! Also, in the summer, it was quite annoying adding to the heat in the room!
novanet-central@reddit
What motherboard do you use?
Visual_Synthesizer@reddit
a wise choice!
inthesearchof@reddit
Do you dream about rtx 6000 pros?
deepspace86@reddit
So this is why I can't find a 3090 to replace mine :(
SocietyTomorrow@reddit
I had to settle for a 10GB 3080 to hold me over until I financially recover from Uncle Sam's violation of my wallet. It would be nice to have something faster than my collection of Pascal Quadro cards and a V100.
PS. I know it's a waste of electricity, but I have a big solar surplus. I bit that bullet years ago.
celsowm@reddit
Jordan says: Stop! Get some help!
theSnoozeDoctor@reddit
What wattage do the cards pull when all loaded up?
NimbusFPV@reddit
who needs rehab when you can run rehab inference locally.