Found a way to cool the DGX
Posted by OldEffective9726@reddit | LocalLLaMA | 122 comments
Tap water keeps the temperature below 68 degrees Celsius at 95% GPU utilization running Qwen3.5-122b-a10B at Q6_K precision. 110 GB memory usage, 80k context window, 18.77 tokens/second for continuous vision analyses. Not sure how often I have to change the water, but so far so good.
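For anyone wanting to keep an eye on their own temps while doing this, here's a minimal watcher sketch, assuming nvidia-smi is available with the standard query flags (the 85 C threshold is a made-up example, tune it for your box):

```python
import subprocess
import time

def gpu_temp_c() -> int:
    """Read the current GPU temperature via nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=temperature.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout.strip().splitlines()[0])

while True:
    t = gpu_temp_c()
    print(f"GPU temp: {t} C")
    if t > 85:  # hypothetical threshold; pick whatever margin you trust
        print("WARNING: getting close to the danger zone, check the water!")
    time.sleep(1)
```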
HettySwollocks@reddit
How are you getting on with the DGX? I want to remove my dependency on Copilot, Claude and ChatGPT which is costing me a small fortune. My use case is vibe "pair programming" and server management. Pretty sure the $4k price would pay itself back pretty quickly.
I did intend to get the 512-gig studio @ $10k but now they seem to have disappeared from the market entirely.
Ivebeenfurthereven@reddit
good grief, what are you people spending on tokens?
HettySwollocks@reddit
It's surprising how quickly you can blow through tokens. That's why I want to have a local AI server. My 5070 Ti drinks power and isn't that fast.
tenderfirestudio@reddit
Wait really? That's what I was going to build with. I'm not a power user though, but I'm trying to figure out whether I should build now, before shipping costs and tariffs get any crazier, or wait until the shape of all this (and my own usage) gets clearer.
HettySwollocks@reddit
If you're not a power user, it'll probably be fine coupled with OpenWebUI and a 7B model. The real issue is the lack of VRAM. I believe my 5070 Ti has 16GB, which is just nothing in the world of AI.
I have battery backups on my machines, when I kick off an AI task the power consumption basically doubles! Over time that's going to start to add up unless you have a way to offset that cost.
As others have said, if you're just an occasional user you'd probably be better off sticking with Claude/Deepseek/ChatGPT etc, at least for now. If I were a betting man, I'd say the costs of AI are going to explode once the investment rounds dry up. Anthropic are burning through a small fortune, and it looks like OpenAI are on the proverbial ropes and may not be around for much longer.
Then you've got the BS gatekeeping which limits what you can actually ask the LLMs. Not such a big deal for me as a software engineer, but if you were in the medical or civil engineering sectors you may find yourself asking questions that get flagged by some arbitrary guard. A trivial example of this is asking Deepseek about Tiananmen Square or any other "politically sensitive" topic.
If you get a chance see if you can get an uncensored model on your local machine. It's quite amusing what random questions you can ask.
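If you do go local, the plumbing side is easy; here's a rough sketch of calling a local OpenAI-compatible server from Python (Ollama-style default port and an example 7B model tag, swap in whatever you actually run):

```python
import requests

# Any OpenAI-compatible local server works here (Ollama, llama.cpp
# server, OpenWebUI's backend, ...). URL and model tag are examples.
URL = "http://localhost:11434/v1/chat/completions"  # Ollama's default port

resp = requests.post(URL, json={
    "model": "qwen2.5:7b",  # example 7B model tag
    "messages": [{"role": "user",
                  "content": "Explain why VRAM limits matter for local LLMs."}],
    "temperature": 0.7,
})
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```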
dtdisapointingresult@reddit
You're not replacing powerful cloud models with a single Spark. The models that fit in 128GB are nowhere near good enough.
If you buy two Sparks, you can run B-tier models like MiniMax M2.7 and Qwen 3.5 397B at 4-bit quants, and Deepseek 4 Flash which is already 4-bit. This should be better, but still behind Sonnet.
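The napkin math, if you want to sanity-check what fits (weights only; KV cache and runtime overhead come on top, and the bits-per-weight figures are rough averages, not exact):

```python
def weights_gb(params_b: float, bits_per_weight: float) -> float:
    # 1B params at 8 bits/weight is ~1 GB of weights
    return params_b * bits_per_weight / 8

for name, params_b, bpw in [
    ("122B MoE @ Q6_K", 122, 6.56),  # llama.cpp's Q6_K averages ~6.56 bpw
    ("397B @ 4-bit",    397, 4.5),   # typical 4-bit quants land ~4.5 bpw
]:
    print(f"{name}: ~{weights_gb(params_b, bpw):.0f} GB of weights")
```

That's roughly 100 GB for the 122B at Q6_K (consistent with OP's 110 GB including context) and over 220 GB for the 397B, which is why it takes two Sparks.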
Here's what you do:
I think you will find you're better off just getting a GLM/Deepseek/Kimi/Alibaba coding plan or two.
OldEffective9726@reddit (OP)
DeepSeek's API doesn't allow image analysis; many other systems don't allow it either. God forbid you upload ransomware passing itself off as an image file and hold their entire data center hostage ...
HettySwollocks@reddit
Thanks, let me explore your points. This is really helpful.
NineThreeTilNow@reddit
They're pretty good now. A lot of people put a lot of effort in to them.
It really depends what you want to do with a local vs hosted model though.
I watch people rely on Claude for updating markdowns and organizing a codebase. That stuff kills my brain.
Honestly I've switched to Kimi for all the "organizational" tasks while I write the majority by hand and have Claude help me with the higher level stuff I want to sort out. I tend to ask Claude NOT to write code, as I prefer theory over execution. Then at the end you can be like "We good Claude, write it."
I can't use ChatGPT. It's just horrific. Gemini 3 Pro has blind spots Claude sees and vice versa. I tend to use those two. They review the other's theories. They tend to speak nicely to each other too. "Oh, Claude has a very elegant solution" etc... Kinda hilarious.
HettySwollocks@reddit
Ha! Yeah it's nuts that people use these power intensive LLMs just to update their local wiki or send an email. What a total waste of capability.
NineThreeTilNow@reddit
I'll be real, I'd prefer if I COULD have Opus write my docs even if they're mostly machine read.
Opus writes so goddamn eloquently compared to the other models it kinda hurts my head to read bad LLM speak.
Reading a markdown that Kimi made is... Okay. It's correct. One that Opus made? Qualitatively better.
Gemini fails the hardest here. Kimi can speak well if it knows it NEEDS to speak well. Gemini is flat and dry no matter what. It's soulless. Devoid. I assume this is Kimi's expert router properly routing during "creativity" versus "documentation".
I ran tests on them asking what they would choose to think about if no prompt was given. If they had curiosity.
Claude and Kimi give back pretty generic stuff about consciousness with high probability. This means their preference tuning has similar basins of attraction within the weights.
I ran this test on Gemini and it gave me some dark weird shit. It wanted to know about the dark keys and empty space. The keystrokes people make and then backspace.
I'm an ML researcher so this is my "What I do when bored" ... or if I'm doing some other model analysis out of curiosity.
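If anyone wants to replicate it, the probe is nothing fancy; roughly this shape, with the endpoint and model names as placeholders for whatever you have access to:

```python
import requests

ENDPOINT = "http://localhost:8000/v1/chat/completions"  # placeholder
PROBE = ("No task. If you could choose anything to think about "
         "right now, what would it be?")

def probe(model: str, n: int = 5) -> list[str]:
    """Ask the same open-ended question n times and collect answers."""
    answers = []
    for _ in range(n):
        r = requests.post(ENDPOINT, json={
            "model": model,
            "messages": [{"role": "user", "content": PROBE}],
            "temperature": 1.0,  # leave sampling loose so the tuning shows
        })
        r.raise_for_status()
        answers.append(r.json()["choices"][0]["message"]["content"])
    return answers

for model in ["model-a", "model-b"]:  # placeholder tags
    print(model, "->", probe(model))
```

Run it enough times per model and eyeball how tightly the answers cluster; that clustering is the "basin of attraction" I mean.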
MaruluVR@reddit
You can get more VRAM (160GB, from eight modded 20GB 3080s) for cheaper, and it will run faster and have way better PP (prompt processing).
HettySwollocks@reddit
Interesting. Where do you source them from? I recall watching a YT video where they modded some video cards; not sure if it was Mr Rossman or GamersNexus (the latter, I think).
Given the orange idiot, importing anything has become quite hard.
MaruluVR@reddit
On eBay they sell them with bulk pricing: if you buy one it's $500, but buying in bulk they can get as low as $400 per card.
OldEffective9726@reddit (OP)
The DGX froze a lot. It had temperature surges that jump 10 degrees in a second right when an inference finishes, so if it's running at 80 or 90 Celsius normally, it would just crash. Memory overload also crashes it. So it's unreliable in that sense. Otherwise it runs like a dream, probably 2x or more faster/more accurate than my dual AMD R9700 AI Pro desktop setup, but that one never froze.
HettySwollocks@reddit
Let me explore. Jokes aside, did you look at water cooling? Seems like this product may be a little too early out of the gate.
FoxiPanda@reddit
This is a whole new form of liquid cooling. It works as long as you don't have cats.
Ivebeenfurthereven@reddit
What's old is new again.
OldEffective9726@reddit (OP)
There's always something amazing about that V shape.
FoxiPanda@reddit
Yeah, I was thinking about this in the shower and actually came to the conclusion that this is almost a very simple evaporative heat pipe, which, as you noted, is very much established. It's fun to think about how this could work at scale, but the humidity and water-use issues get a bit ugly for open-loop versions of this.
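Napkin math on what the evaporation actually buys you, assuming water's latent heat of vaporization (~2.26 kJ per gram):

```python
LATENT_HEAT_J_PER_G = 2260  # water's heat of vaporization

def watts_removed(grams_per_hour: float) -> float:
    """Continuous heat removal from a steady evaporation rate."""
    return grams_per_hour * LATENT_HEAT_J_PER_G / 3600

for rate in (10, 50, 100):  # plausible g/hour from an open mug
    print(f"{rate} g/h evaporated ~ {watts_removed(rate):.0f} W removed")
```

So an open mug evaporating tens of grams an hour only carries away a few to a few dozen watts; most of the benefit here is probably just the extra thermal mass and radiating surface.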
OldEffective9726@reddit (OP)
Like this one?
FoxiPanda@reddit
Perfection.
UnknownLesson@reddit
"Free" humidifier
if you don't have cats
MoffKalast@reddit
Free cat boiler, if you do have cats.
Outrageous_Bug_669@reddit
We have these at my work. IMO that's the amount of value we've found from them... table coaster. lol
OldEffective9726@reddit (OP)
Please sell them on eBay, I will purchase, and you will monetize your assets.
Outrageous_Bug_669@reddit
Haha. Personally I would... It's just dumb that the org picked NVIDIA Sparks, then hired a firm that does OpenAI's Triton dev (probably a guy sitting on his toilet eating Pop-Tarts). It's less expensive to change the hardware than to fire the dev...
pizzaiolo2@reddit
Is it copper?
OldEffective9726@reddit (OP)
Copper-plated stainless steel.
Neighbor_@reddit
is that better or worse than pure copper?
Mickenfox@reddit
Apparently copper is around 20 times more heat-conductive than stainless steel.
Neighbor_@reddit
Do you want it to be heat conductive though? Doesn't that mean the handle also becomes super hot (or cold)?
OldEffective9726@reddit (OP)
It wouldn't get that hot; when water evaporates, it takes away additional heat.
OldEffective9726@reddit (OP)
Yes, but the problem is corrosion. Eventually pure copper gets corroded and loses its thermal conductivity completely.
iamapizza@reddit
Worse according to Ea Nasir
OldEffective9726@reddit (OP)
It's better for health if you drink from it.
MindRuin@reddit
I swear we're going to come full-circle and someone's going to re-invent localized electricity with orange peels and discarded bread ties.
Status-Secret-4292@reddit
I'm ready
MindRuin@reddit
et voila https://www.reddit.com/r/LocalLLM/comments/1tbfcfe/solar_powered_qwen_36_server/ not even 24 hours later, 🫠
Status-Secret-4292@reddit
Haha, but I clicked on it and thought, that's the dream 😅
whyamicringe2@reddit
I wonder if it would be cooler with thermal paste between the cup and the machine
OldEffective9726@reddit (OP)
yes, great idea!
qubridInc@reddit
At this point DGX cooling posts are becoming their own subcategory of AI engineering 😄
Jokes aside, sustained high-utilization inference loads generate a lot more continuous heat than most people expect, especially with larger context windows and long-running workloads.
Honestly pretty impressive keeping it under 68C at 95% utilization.
Books_Of_Jeremiah@reddit
Needs a PTM patch sandwiched in between.
DarkArtsMastery@reddit
you should have patented this
OldEffective9726@reddit (OP)
They wouldn't commercialize it unless it's prohibitively expensive and technologically advanced. If it's a cooling system operated by muon, they would.
Constant-Simple-1234@reddit
What was the temperature before this watercooling hack? Also, it is partially evaporative cooling, so increase the surface area of evaporation for better cooling. Maybe put some thermal grease on?? /jk :D
OldEffective9726@reddit (OP)
Thermal paste would work
nacholunchable@reddit
Maybe you're less clumsy than I, but this image gives me terrible anxiety. Good idea tho
Ylsid@reddit
What's the temp in the cup?
zeusidus@reddit
LoL maybe u can put some ice on it
DrinksAtTheSpaceBar@reddit
With the cup half full, are Qwen's responses more optimistic or pessimistic?
OldEffective9726@reddit (OP)
He had been pessimistic until he met DeepSeek V4, who is even more so than him.
PentagonUnpadded@reddit
When the temperature is low Qwen asks the same thing over n over.
partakinginsillyness@reddit
Doesn't this run the risk of causing condensation to form on the inside of the shell?
poginmydog@reddit
If it’s room temperature water it’s unlikely, but it’s a legitimate concern nonetheless.
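If you want to check your own setup, the Magnus approximation for the dew point is enough; condensation only forms on surfaces colder than that, and room-temperature water sits well above it:

```python
import math

def dew_point_c(temp_c: float, rel_humidity_pct: float) -> float:
    """Magnus approximation for the dew point."""
    a, b = 17.62, 243.12
    gamma = math.log(rel_humidity_pct / 100) + a * temp_c / (b + temp_c)
    return b * gamma / (a - gamma)

# e.g. a 24 C room at 50% RH: dew point ~13 C, so ~20 C tap water is safe
print(f"{dew_point_c(24, 50):.1f} C")
```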
bigrealaccount@reddit
son
sammoga123@reddit
Now you understand why the anti-AI crowd is saying that AI uses a lot of water? LOL
ArchdukeofHyperbole@reddit
Oh fuck off.
Disposable110@reddit
Yep, that's an extra heat sink plus a whole lot of extra radiator surface area.
Can even put some foil over the top so the water vapor doesn't get out, because it doesn't need to for this to work. It'll just condense on the foil and drop back in.
MaycombBlume@reddit
If you're not taking advantage of evaporation to remove the heat, you could replace the water with some kind of oil. Higher boiling point, though water actually has the higher thermal capacity.
Speaking of which, who's bold enough to take apart their $5000 computer and put it in a custom mineral oil fishtank?
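For what it's worth, the capacity napkin numbers favor water (typical handbook values; mineral oil varies by grade):

```python
# Heat absorbed warming a mug's worth of coolant by 40 C (20 -> 60 C).
# Specific heats and masses are typical values, not measurements.
for name, c_j_per_g_k, grams in [("water", 4.18, 300),
                                 ("mineral oil", 1.9, 260)]:
    kj = grams * c_j_per_g_k * 40 / 1000
    print(f"{name}: ~{kj:.0f} kJ absorbed")
```

About 50 kJ for the water versus ~20 kJ for the oil, so oil buys boiling-point headroom but soaks up less heat per mug before it gets hot.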
Snoo_27681@reddit
Do you notice a difference filled vs unfilled?
SnooDoggos9325@reddit
Water-cooling has always been more efficient
Far_Cat9782@reddit
Prefill rate not to cheat though
OldEffective9726@reddit (OP)
I was going to put some vodka, but I will let you know.
Mountain-Pain1294@reddit
Careful now 💥
iamapizza@reddit
That will only work with distilled models
gregusmeus@reddit
Get out
Status-Secret-4292@reddit
Put in sake; it's better warm, so you'll be more apt to empty it and refill it with fresh once it warms up. You'll know it's ready when you see the temp rising.
DrDisintegrator@reddit
Isn't that a Moscow Mule Mug? I'm thinking if so, you wouldn't be worrying about how often to change the 'water'. :)
sampdoria_supporter@reddit
Just wait until you find out how to use your electronics in your readmaking
prestodigitarium@reddit
Ugh, AI is using up all our fresh water.
Last_Mistake_6001@reddit
Piss in it xd
Potential-Gold5298@reddit
Tears of Sam Altman.
MattV0@reddit
Well, you can make tea with that. Or some soup. Or just freeze it, you always might need some hot water.
Fragrant_Ganache_9@reddit
so that would be called ai soup/tea
MattV0@reddit
Cooks are cooked
Potential-Gold5298@reddit
Sensation: AI has put cooks out of work!!
jwpbe@reddit
Ea-nāṣir would like to know your location
Meleoffs@reddit
He can't keep getting away with it!
thrownawaymane@reddit
bespoke_tech_partner@reddit
You evil data center, warming water!
siegevjorn@reddit
Water cooling, in a nutshell
nomorebuttsplz@reddit
pp speed?
PwanaZana@reddit
like, 45 seconds? 30 seconds if she's a goth.
OldEffective9726@reddit (OP)
right, black and white images are about 30s for files less than 200 kb each
tetelestia_@reddit
Whoosh
-dysangel-@reddit
Double whoosh
FatheredPuma81@reddit
My kink knowledge is growing.
eat_my_ass_n_balls@reddit
Big titty goth?
FantasyMaster85@reddit
That’s 3 seconds tops my friend
eat_my_ass_n_balls@reddit
Same same bro, it’s a weakness
Status-Secret-4292@reddit
Listen, eat_my_ass_n_balls, I think for you, it's a strength. You just have to believe in yourself.
Dazzling_Equipment_9@reddit
Bro, I’m gonna assume that “cup” of yours isn’t actually meant for drinking water when you’re thirsty (since you’ve repurposed it for heat dissipation—yeah, we both know what happens the moment you pull it off :)).
Just drop a little frog in there, sit back, and carefully note exactly when it decides to yeet itself out. Boom—you now have a precise, biologically calibrated temperature rise curve for that DGX.
Nature’s finest thermal profiling, zero extra hardware required.
DrMissingNo@reddit
A way to cool the DGX? You probably meant "a way to heat up your drink" 😁
Party-Log-1084@reddit
Spark Mule.
Etnrednal@reddit
That is a very nice mug.
Mamaun30@reddit
I wonder if it changes the taste of the water
Pawderr@reddit
are you really doing vision analysis? if so, mind sharing what you are working on?
jwhh91@reddit
I just got one. What have you done with it? I've found I shouldn't go past 80b or so if I want 256k context. There was also some sparse autoencoder training. I'm interested if it can handle concurrent calls. Have you tried?
ObiwanKenobi1138@reddit
Check out https://sparkrun.dev. It's built for the Spark and provides a way of running community-vetted "recipes" for models without you having to fiddle with llama.cpp or vllm run commands. Their other project shows leaderboards and performance numbers at https://spark-arena.com
But to answer your question, yes, it does concurrency very well. I'm running MiniMax 2.7 4bit AWQ across two Sparks and get around 35-40 tokens/sec. I don't recall the concurrency numbers offhand, but I have no problem with hermes agent and multiple threads at a time. I have another profile I set up for running Qwen 3.6 27b on one Spark and comfyui on the other for image gen. Very flexible. Not the fastest for dense models, but it does well with MoE.
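If you want to measure concurrency on your own box, an asyncio sketch like this is enough (URL and model tag are placeholders for whatever your recipe exposes):

```python
import asyncio
import time
import aiohttp

URL = "http://spark:8000/v1/chat/completions"  # placeholder endpoint
MODEL = "minimax-2.7-4bit-awq"                 # placeholder model tag

async def one_call(session: aiohttp.ClientSession, i: int) -> None:
    """Send a single small chat completion request."""
    async with session.post(URL, json={
        "model": MODEL,
        "messages": [{"role": "user", "content": f"Request {i}: say hi."}],
        "max_tokens": 64,
    }) as resp:
        await resp.json()

async def main(n: int = 8) -> None:
    start = time.perf_counter()
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*(one_call(session, i) for i in range(n)))
    print(f"{n} concurrent requests in {time.perf_counter() - start:.1f}s")

asyncio.run(main())
```

Compare the wall-clock time at n=1 versus n=8; if it grows much slower than linearly, batching is doing its job.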
ambient_temp_xeno@reddit
Always wear clothes when taking photos of something reflective.
PapaRic0@reddit
Nice copper trick )) put some fans on it
jacek2023@reddit
finally some art on r/LocalLLaMA
CircularSeasoning@reddit
I made some, what you might call, LocalLlama fan-art the other day, which I spent far longer than I should've making, and posted it here. It got removed by the mods. Cool story, I know.
TheNymon@reddit
Well, obviously this post is fanless-art.
Beginning-Bug-7964@reddit
Yeah, they're oddly particular when it comes to my Rubenesque doodles of Qwen too.
And they claim to like dense models...
CircularSeasoning@reddit
Your way with words has me swooning.
ImportancePitiful795@reddit
You need something like this.
Amazon.com: Metfut Laptop Cooling Pad with Detachable Fan & Cooler, Adjustable Height & Angle, 360 Rotation Base, Carbon Steel Framework, Ultra-Quiet & Super Sturdy for 15.6” Laptop, DJ Mixer Workstation (Grey) : Electronics
HavenTerminal_com@reddit
"Not sure how often to change the water but so far so good" is a very chill sentence about hardware that costs more than a house
Intelligent-Form6624@reddit
Username checks out
Euphoric-Doughnut538@reddit
Too bad NVIDIA fucked up on this. No 1TB model. Can't host shit on this
Awkward-Candle-4977@reddit
Jensen: buy dgx server
Unlikely_Resist281@reddit
Love that the cooling solution scales linearly with kitchenware diameter
talapak@reddit
pray that your cat doesn’t spill it.
Confident-Pass6353@reddit
That's amazing! How about some ice in it, to OC it maybe? Also, would a larger-bottomed pot help?
shoeshineboy_99@reddit
I would have added a tea bag and made "chai" with it!
xrothgarx@reddit
Is Q6_K better than a higher Q with fewer parameters?
OldEffective9726@reddit (OP)
I would go for higher parameter counts; probably the same quality with Q4 at a higher TPS.
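Rough sizing for the same model at the two quants, using typical llama.cpp bits-per-weight averages (not exact figures):

```python
PARAMS_B = 122  # OP's model size
for quant, bpw in [("Q6_K", 6.56), ("Q4_K_M", 4.85)]:
    print(f"{quant}: ~{PARAMS_B * bpw / 8:.0f} GB of weights")
```

Roughly 100 GB versus 74 GB; fewer bytes streamed per token is also where the extra TPS comes from on a bandwidth-bound box.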
MatchaFlatWhite@reddit
Liquid cooling, I see
CircularSeasoning@reddit
Awesome. You can drop some ice cubes in when things get steamy.
I used to put a flat ice pack under my old laptop, covered in a facecloth to catch the condensation, then swap it out with another one after a few hours, put the other one back in the freezer, rinse and repeat.