What do you guys with such serious setups do with LLMs at home? And if you're using them for work, wouldn't you want something more professionally built for reliability and uptime?
> wouldn't you want something more professionally built for reliability and uptime?
A lot of crypto miners can tell you that this kind of open-frame construction works fine at home, and often better than a DC-type setup, which concentrates the heat output too much.
Hey, I am a total noob to the world of LocalLLaMA, may I ask what you want to achieve with such an investment? I am asking out of pure curiosity, since seeing this makes me wonder what kind of results I can expect from running local models on my gaming PC vs this powerhouse lol
I am currently working on a project that requires both batch inference and training. I did the math and I would have burned the same amount of money in a few months of compute renting so this was the right move for me.
I have a post that touches on that https://ahmadosman.com/blog/serving-ai-from-the-basement-part-ii/
I am genuinely curious why people are building LLM clusters in their basements. Is this compute something you can sell as a service profitably, like on vast.ai? If you genuinely need an LLM to power your business, wouldn't it be better to just use an API or one of those model-as-a-service vendors like fireworks.ai?
I'm also puzzled. It seems to me people are just treating this as pc-building; everyone's excited about the process but no one is actually playing games.
I know most people here won't like to hear this, but it would be way cheaper and more flexible to just use OpenRouter and pay the API costs. If you are not training models, such a setup is just a waste of money. But if you're having fun, maybe it's worth it.
Sorry for Capitalizing on this opportunity to crack jokes.
Spoiler alert: (Referring to the AI as "Her" is the joke, for that one outlier who still doesn't get it. Don't worry. There will be more jokes to get later in this AI-generated future.)
Oh man, you may have a chance to use your "powerful" server to test out the latest Tencent Hunyuan 389B at Q4 (if it gets GGUF'd) to generate a unique & sincere explanation for her.
I see your Llama 405B badge. Are you running that? Is this around 14-15 3090s total? That's about 330 to 350GB of VRAM, right? If so, what quantization and context size are you running? For example, the fp16 I use on OpenRouter is 800+GB in size; 128k context to boot probably bumps it up to a terabyte. Just curious. Because it sounds like you still don't have enough GPUs haha :)
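For a rough sanity check on those numbers, a back-of-the-envelope sketch (405B parameters and the usual bits-per-weight figures are approximations; KV cache and per-GPU overhead are ignored):

```python
# Rough weight-footprint estimates for a 405B-parameter model.
# Sketch only: ignores KV cache, activations, and per-GPU overhead.
PARAMS = 405e9  # Llama 3.1 405B

def weights_gb(bits_per_weight: float) -> float:
    """Model weight size in GB for a given quantization width."""
    return PARAMS * bits_per_weight / 8 / 1e9

for label, bpw in [("fp16", 16), ("q8", 8), ("4.5bpw", 4.5), ("q4", 4)]:
    print(f"{label:>7}: {weights_gb(bpw):6.0f} GB")

# 14 x RTX 3090 pools 14 * 24 = 336 GB of VRAM, which is why
# ~4-4.5 bpw quants are the realistic target for 405B on this rig.
print("14x3090 VRAM:", 14 * 24, "GB")
```

With those assumptions, fp16 works out to roughly 810 GB, which lines up with the "800+GB" figure above.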
Forgive me for being thick, but what's the advantage of this? What can you pull off that someone with a local LLM + 4090, or a ChatGPT subscription, etc., can't?
Nice! But I counted 14 cards; I suggest you get 2 more for a nice power-of-two quantity (16). It would be perfect then.
But jokes aside, it is a good rig even with 14 cards, and it should be able to run any modern model, including Llama 405B. I do not know what backend you are using, but it may be a good idea to give TabbyAPI a try if you have not already. I run "./start.sh --tensor-parallel True" to start TabbyAPI with tensor parallelism enabled; it gives a noticeable performance boost with just four GPUs, so it will probably be even better with 14. Also, with plenty of VRAM to spare it is a good idea to use speculative decoding; for example, https://huggingface.co/turboderp/Llama-3.2-1B-Instruct-exl2/tree/2.5bpw could work well as a draft model for Llama 405B.
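For anyone reading along: TabbyAPI serves an OpenAI-style HTTP API, so once it's up (with tensor parallelism and a draft model configured as above) any OpenAI-compatible client can talk to it. A minimal sketch; the port, API key, and model name below are placeholders for whatever your own config reports:

```python
# Minimal sketch of querying a local TabbyAPI instance via its
# OpenAI-compatible endpoint. Host/port, key, and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5000/v1",   # wherever your TabbyAPI is listening
    api_key="your-tabby-api-key",          # whatever key you set in its config
)

resp = client.chat.completions.create(
    model="llama-3.1-405b-instruct-exl2",  # whatever model name the server reports
    messages=[{"role": "user", "content": "One paragraph: why does tensor parallelism help here?"}],
    max_tokens=256,
)
print(resp.choices[0].message.content)
```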
Well, they aren't building them anymore, neither the 3090 nor the 4090, and the big sell-off from the crypto mining boom is long past.
We're actually in a weird situation where older GPUs with lots of VRAM are possibly going to get more expensive, if any of the rumors regarding 5000-series prices are true.
It is not really the thing I am basing my decision on. I think Claude 3.5 Sonnet is great for coding, but I am also very concerned about data privacy, and I expect there will be a trend of them increasing their prices.
Actually, the cable management is my best work to date; it looks a lot cleaner without the fans (which are actually very well managed too). Check my first blogpost for cleaner pics: https://ahmadosman.com/blog/serving-ai-from-the-basement-part-i/
I laugh when I hear all those « Nvidia's greed prevents the normal unlimited growth of my VRAMed personal toy » takes, and I probably cry as much as OP's wife (you should try to batch-generate her to cherry-pick ;) ) when I see so many 3090s in a single personal computer… I feel speechless. Hopefully I can enjoy your title humor and laugh at your LocalLLaMA Guinness record.
But the resources on Earth are not infinite, and the greed/power of some tends to make prices and unwanted consequences grow as well. But sometimes it's not who we think (Nvidia?)… when mining came, no cards were left and prices began to grow. The resources to make those cards are less and less easy to find/produce, and they always have a cost (water, power, geopolitics, the farmers nearby, children mining coltan, etc.). Not trending considerations on LocalLLaMA, I imagine.
OK, now you can use Meta 405B at 15 t/s… what for? Solving climate change? War conflicts? Poverty? Inequity? Racism? Trump's education? Waifu upscaling? I know you probably have good projects (you wouldn't put that much on those expensive cards). Enjoy.
Buy yourself an expensive watch - something she will instantly recognize as a luxury item (Rolex, Omega, Cartier...). It will distract her from these hardware boxes, as they are boring AF anyway. If she criticizes your expenses, you either demonstratively sell the watch and restore the family budget, or you give it to her as a gift with love (if she likes it). Either way, all the action will be around the watch, and these GPUs will be forever forgotten.
Wait why would you spend such an insane amount of money to create this setup? I can't help but wonder if what you're doing will really benefit from it, versus spending a fraction of this money to access services instead.
Even buying them at a steep discount this is going to be expensive.
Is there any legit practical reason to do this rather than just paying for API usage? I can't imagine you need Llama 405b to run NSFW RP and even if you did it can't be moving faster than 1-2 t/s which would kill the mood.
Privacy is the commonly cited reason, but for inference only, the break-even point vs cloud services is in the 5+ year range. If you're training, however, things change a bit and the break-even point can shift down to a few months for certain things.
Using AWS Bedrock Llama3.1-70b (to compare against something that can be run on the rig), it costs $0.99 for a million output tokens (half that if using batched mode). XMasterrrr's rig probably cost over $15k. You'd need to generate 15 billion tokens of training data to reach break even. For comparison, Wikipedia is around 2.25 billion tokens. The average novel is probably around 120k tokens so you'd need to generate 125,000 novels to break even. (Assuming my math is correct.)
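Making that arithmetic explicit, using only the figures above (the rig cost and per-token price are rough assumptions):

```python
# Break-even point: assumed local rig cost vs. paying per output token in the cloud.
rig_cost_usd = 15_000        # assumed hardware cost from the comment above
price_per_m_tokens = 0.99    # quoted Bedrock Llama 3.1 70B output price per million tokens

breakeven_tokens = rig_cost_usd / price_per_m_tokens * 1e6
print(f"Break-even at ~{breakeven_tokens / 1e9:.1f} billion output tokens")

novel_tokens = 120_000       # assumed average novel length, per the comment
print(f"Roughly {breakeven_tokens / novel_tokens:,.0f} average-length novels")
```

That comes out to about 15.2 billion tokens, or on the order of 125,000 novels, matching the estimate above.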
I have 12x3090 and can fit 405B at 4.5bpw with 16k context (32k with Q4 cache). The tok/s, though, is around 6 with a draft model. With a larger quant that will drop a bit.
I might be too drunk to do math right now, but that sounds like about twice the cost of current API pricing over a period of 5 years. Not terrible for controlling your own infrastructure and guaranteed privacy, but still pretty rough.
On the other hand, that's roughly half the training data of Llama 3 in 5 years, literally made in your basement. It kind of puts things in perspective.
Hobby and privacy are big ones, but the math can work out on the cost side if you are frequently inferencing, especially with large batches. Like, if you want to use an LLM to monitor something all day, every day.
E.g. Qwen2-VL, count the squirrels you see on my security cameras -> Llama 405B, tell Rex he's a good boy and how many squirrels are outside -> TTS.
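A toy sketch of what that kind of always-on loop could look like; every model name, path, and helper below is a hypothetical placeholder, not a real API:

```python
# Hypothetical always-on monitoring loop: vision model -> big LLM -> TTS.
# ask_model() and speak() are placeholders for whatever local servers you run.
import time

def ask_model(model: str, prompt: str, image_path: str | None = None) -> str:
    # Placeholder: send the prompt (and optionally an image) to a local model
    # server and return its text reply. Swap in your own API call here.
    return "0"

def speak(text: str) -> None:
    # Placeholder: pipe the text to a local TTS engine.
    print(f"[TTS] {text}")

while True:
    count = ask_model("qwen2-vl",
                      "How many squirrels are in this frame? Reply with a number.",
                      image_path="/cameras/backyard/latest.jpg")
    message = ask_model("llama-405b",
                        f"Tell Rex he's a good boy and that there are {count} squirrels outside.")
    speak(message)
    time.sleep(60)  # check once a minute
```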
The API prices are often pretty steep. However, maybe you can find free models on OpenRouter that do what you need.
Could you use that to build your dream game? Making the agents present you each major decision and how it's implemented in the game? Then you could approve or not and keep building.
"Babe, we can't buy a new car this year. But you can use something like ChatGPT for free now. Not with an app, though. You have to use the Terminal on your computer."
ngl, if I had infinite money I'd do that: use all of them but one in a local server, wonder what to do with it, and probably end up running the biggest LLM I could find on it just for fun, and keep one for my gaming PC just to watch videos and browse Reddit (tbh, if I had infinite money it'd be 4090s, but you get the gist lol)
Just mumble something about "got it pretty cheap," and she will assume that means something like $50 each and only get a bit mad about you wasting hundreds of dollars.
Yeah, she's gonna be real disappointed with that wire management. As someone who has lived the life of an electrician, I can see why it would bother her. You gotta keep it clean and tidy.
I'm curious what kind of motherboards support that many GPUs. Are those the same as mining rigs? I'd appreciate it if anyone has some references/materials for this.
Guys, I'm wondering: what is the strategy here to make money? Putting them on vast.ai or something similar, you would need a lot of time for ROI, wouldn't you?
How to avoid moving down there? Given what you've spent so far, you can probably afford to furnish a nice little space for her down there to enjoy whatever hobbies she has.
Dead_Internet_Theory@reddit
If you can afford that, and you could just go and buy it, you have nothing to explain.
~~If she complains too much tell her you can simulate a wife who doesn't.~~
Make some local webUI thing and let her use it! Get her hooked too!
loopmotion@reddit
You don't. Just cover it and say it's a new efficient heater
mishuevosmehuelen@reddit
How many watts does that consume?
Yes.
auradragon1@reddit
If I was rich, this is what I’d do too.
iLaux@reddit
One day. One day we'll get this type of setup bro.
rustedrobot@reddit
15 years from now, power like this will likely be running on your laptop. 15 years ago it would probably have ranked in the global Top 500 supercomputers list.
Pazzeh@reddit
I would be shocked if that turns out to be true
Tzeig@reddit
There was a 6-year gap between the 690 and the 3090, and the 3090 is a little over 4 times as powerful as the 690. I don't think we will have laptops with the power of 15 x 3090 in 11 to 15 years from now. The 4090 is only 76% more powerful than the 3090 (with the same VRAM), and the upcoming 5090 will have a similar boost in performance (or lower) with only slightly more VRAM. That's a 3x performance jump (at most) in 4 years.
zyeborm@reddit
You'll probably find dedicated AI hardware instead of GPUs by then. They will have a lot more performance and lower power consumption due to architectural changes. Personally I think mixed memory and pipelined compute will be the kicker for it.
novus_nl@reddit
That's actually pretty interesting, like having a dedicated GPU for visual rendering AND an AIPU for generating/calculating AI output.
The PCIe slot probably has enough bus bandwidth left over to cater to these kinds of things, especially with PCIe 5 doubling the performance (bandwidth, transfer rate, and frequency).
zyeborm@reddit
If it fits in memory (which you would presume it does), then AI actually has quite low bandwidth demands. An LLM is literally just text in and out; you could do that at 9600 bps and be faster than most people can read (9600 bps is on the order of a thousand characters per second, far beyond any reading speed).
PeteInBrissie@reddit
Exactly what I was going to say - Apple's got their own silicon running their AI and who knows how many M2 Ultras they're packing onto each board? I also think it won't be long before somebody develops an ASIC that has a native app like Ollama. Let's hope they're a bit quieter than a mining rig if it happens :)
_noregret_@reddit
what? 690 was released in 2012 and 3090 in 2020.
Tzeig@reddit
So it was, that means the gap was 8 years and the jump in performance only 400%.
Al-Horesmi@reddit
AI becomes much more compact over time. Also, the architecture becomes more suited to AI.
__JockY__@reddit
“640k [of RAM] should be enough for anyone.” — Bill Gates, 1996
kernald31@reddit
For what it's worth, he most likely never said that. Like Marie Antoinette and "let them eat cake".
Reasonable_War_1431@reddit
she said, " let them eat brioche" if I recall Bill said 256k ought to be enough ...
kernald31@reddit
Rousseau wrote "Qu'ils mangent de la brioche" in 1767; Marie Antoinette was born in 1755. It's frankly quite unlikely that she was the princess Rousseau was talking about.
No. Bill Gates, like any other software engineer back then, would never have said anything of the sort - quite the opposite. Managing to get a computer running with an address space large enough to handle 640kB of memory was quite a feat - and a very welcome one. The quote (wrongfully) attributed to him is about 640kB, not 256.
Reasonable_War_1431@reddit
my post says 640 - not 256? It was IBM who was quoted on this. Gates merely agreed.
Perhaps the quote attributed to Marie was her recap of Rousseau's remark as a double entendre - that would exhibit the former Austrian royal's learned reading of French authors, as she was being groomed by the court's best educators to speak and read French - it was very important for the new queen to become a francophile, to show les peuples she had embraced her country and its customs as their new monarch. Birthday May 16th - same day as Jeanne d'Arc and Nero - they all certainly led extreme lives.
kernald31@reddit
Regardless of the amount - Bill Gates never said anything of the sort nor agreed. No software engineer at the time would have ever said anything of the sort when they were all struggling to enable more memory to be used.
Regarding Marie Antoinette, her repeating something she's read hardly makes it a quote by her.
Reasonable_War_1431@reddit
She had more ears listening to her mouth than eyes reading Rousseau - that would be why she was cast as the author of that quote.
soytuamigo@reddit
Yep, unlike:
> The world today has 6.8 billion people. That's headed up to about nine billion. Now, if we do a really great job on new vaccines, health care, reproductive health services, we could lower that by, perhaps, 10 or 15 percent. But there, we see an increase of about 1.3.
Which he did say.
kernald31@reddit
Yes, and? It's a logical consequence of improved health worldwide. Countries with better healthcare have fewer children, most likely because parents know they'll survive. The context was a presentation about lowering CO2 emissions, back in 2010. What's the problem with this quote?
soytuamigo@reddit
That wasn't the context, the context was overpopulation. Believe your own eyes not what "fact checkers" (paid by him) tell you 😂. Unlike cattle being walked to the slaughter you can read and listen to him yourself.
__JockY__@reddit
Yeah I know, but never let accuracy get in the way of a Reddit post!
kernald31@reddit
I guess hallucinations do make sense in r/LocalLLaMA!
Future_Brush3629@reddit
I thought he said 64K
__JockY__@reddit
The alleged quote has always been 640k, however the veracity of the quote has always been questioned.
kernald31@reddit
He said neither.
visarga@reddit
In his defense, he was thinking of Arduino
drosmi@reddit
I’m pretty sure that I was lectured in comp sci class about how there are physical limitations on how small we can make gates and connections for chips. That limit was many times larger than the current 3nm.
Eisenstein@reddit
We may hit economic limits before physical ones. After the 7nm nodes, Moore's law stopped and the transistor price did not halve. Each new node costs $20-30 billion USD to develop. If people aren't willing to pay much more money for new generations of compute and are fine with 'good enough' at whatever node we're at, then another $20 to $30 billion might not be a great bet to make.
justintime777777@reddit
We are at the very early stages of 3D stacking transistors in compute chips.
NAND (SSDs) is already stacked with hundreds of layers. Even if we can't go much smaller, we can go way denser.
thrownawaymane@reddit
Isn't part of the problem relativistic effects at those sizes? That won't go away with 3d stacks.
justintime777777@reddit
Are you thinking of quantum effects? (Like quantum tunneling, where electrons jump through gates and channels they classically shouldn't.)
Say you made a 10-layer chip with 10nm transistors: you wouldn't get the quantum tunneling you would at 1nm, but you would get transistor density equivalent to 1nm.
Stacking is complicated and does hurt thermals on the inner layers, but with 1nm-equivalent tech you could run things slower and more efficiently to compensate.
Odd-Interaction-453@reddit
It still is for Linus Torvalds, just saying, lol.
XMasterrrr@reddit (OP)
Ummmm, no.
__JockY__@reddit
Yeah it didn’t age well. I’ve got 120GB of 3090s and still wanting more…
…hence my question about how you’re powering all this! My lowly 1600W EVGA just can’t cope.
BiGEnD@reddit
It would've been no. 6, or am I reading it wrong?
Purplekeyboard@reddit
I doubt that. Moore's Law is basically dead.
justintime777777@reddit
We are at the very early stages of 3D stacking transistors in compute chips.
NAND (SSDs) is already stacked with hundreds of layers. Even if we can't go smaller, we can go way denser.
zuilserip@reddit
While we are no longer growing at the same clip as before, a quick look at the Top500 performance curve will show you that we are still growing at an exponential rate. (Note that the Y axis is logarithmic, so a straight line indicates exponential growth, even if the slope has changed.)
Now, it is true that (to paraphrase Aristotle) nature does not tolerate indefinite exponential growth, so it is a certainty that sooner or later Moore's Law must come to an end.
But that day has not yet arrived. Much like the Norwegian Blue, Moore's Law is not dead, it is just resting! :-)
Eisenstein@reddit
Moore's law is dead because it literally says 18months == twice as many transistors for the same price. That died at 7nm nodes.
TenshiS@reddit
Until the next breakthrough
Intraluminal@reddit
THIS! And very few people seem to realize that. Still, quantum computing may work...
shroddy@reddit
If Nvidia keeps its Cuda moat, in 15 years, power like this will require half as many cards, but each card will cost 3 times as much.
Eisenstein@reddit
The history of a single corporation dominating an entire sphere of computing is littered with the bones of has-beens. Silicon Graphics, 3dfx, Sun, and DEC are dust. IBM, Intel, Compaq, Xerox, and Nokia had de facto monopolies and are now competing or have given up on the market altogether. If we talk about software, it becomes hard to even come up with a list, because the field changes so fast that it's a challenge to pin down who was outcompeted and who simply abandoned the market.
Either way, the chance that Nvidia remains dominant in AI hardware and software for another 15 years is not something I would put money on, given the track record of other corporations that tried to do something similar. Word on the inside is that Jensen knows this and is extracting as much revenue as possible from the market right now, future be damned.
kalloritis@reddit
And each will require its own 1600W PSU
zuilserip@reddit
Each RTX3090 can reach 35.5TFlops (fp32), so 15 of them would get you to around 530TFlops. This should get you into the Top 500 list as recently as 2016, and get you to the top of the list in 2007, duking it out with IBM's 212,992 processor BlueGene/L monster.
15 years ago (in 2009), a single RTX3090 would get you into the list.
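Spelling out the arithmetic (peak fp32 figures, which real workloads won't sustain):

```python
# Aggregate peak fp32 throughput for a stack of RTX 3090s.
tflops_per_3090 = 35.5
for cards in (1, 8, 15):
    print(f"{cards:>2} x 3090 ~= {cards * tflops_per_3090:6.1f} TFLOPS fp32")
# 15 cards ~= 532.5 TFLOPS peak, the ~530 figure quoted above.
```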
masterlafontaine@reddit
Only if we leave silicon behind
J-IP@reddit
In 2009 the GTX 295 had 896 MB of accessible VRAM because of its cut-down memory bus, but let's call it 1GB. That's a 24x gain; even if we land at just a 10x gain in VRAM, 240GB doesn't sound too bad. :D
But Nvidia's greed will probably keep the growth as slow as possible. Even a 5x increase isn't too bad, though. Or we start seeing custom circuits that force them to start going pop pop pop with the VRAM for us.
Either way I'm eager!
Quartich@reddit
In 2009, last place on the Top 500 was around 20 TFLOPS (23.3 peak), at 501 kilowatts. Back in July 2007 that would have been 32nd on the list.
Tanvir1337@reddit
That day never comes
danishkirel@reddit
The size of a Mac mini though.
Fantastic-Juice721@reddit
If you were really rich, you would have people taking care of these setups for you...
auradragon1@reddit
The fun is putting it together. If I wanted to, I could click a button and rent some H100s.
XMasterrrr@reddit (OP)
I am just working on some very interesting things and I believe this to be the right investment at this time for me. Also, it doesn't hurt that GPUs are a hot commodity, especially given Nvidia's neglect of the end-user market. So worst case scenario I'd sell them and lose a little bit in the entire setup.
marialchemist@reddit
Yes, at this cost basis you might have been better off with 7 4090s.
StackOwOFlow@reddit
3090s have depreciated sharply though
vanisher_1@reddit
Investment in your knowledge/education or just a product as an entrepreneur? 🤔
el0_0le@reddit
For some people, being rich is playing now and crying later.
Mahrkeenerh1@reddit
at this point, is it not more beneficial to go with server gpus?
No_Afternoon_4260@reddit
Which GPU do you know of with a better VRAM/price ratio? I don't.
isitaboat@reddit
unless you need vram density per card, this is a good setup
No_Afternoon_4260@reddit
Yes that's right
Herr_Drosselmeyer@reddit
If you were really rich, you'd have a server with a bunch of H100s in the basement.
Dry_Parfait2606@reddit
Buy an AMD EPYC 9004 with 12 CCDs and the highest clock speed, plus a board with 5 x16 slots that can all do 4x4x4x4 bifurcation.
ChunkyHabeneroSalsa@reddit
This is way more machine than I've ever used and I'm an ML engineer in the computer vision space.
lblblllb@reddit
How much did you spend on this
Kids_Learning_Corner@reddit
My dream setup!!
mmeeh@reddit
Which model(s) are you running?
iLaux@reddit
LocalLlama home server final boss. The most impressive I've seen to date.
sourceholder@reddit
It's just the beginning :
https://www.danylkoweb.com/content/images/Bond-BlockchainSetup.jpg
ucefkh@reddit
What's the movie?
Bderken@reddit
Bond
Brostradamus--@reddit
, James Bond.
TenshiS@reddit
Chain
Ekkobelli@reddit
, Block Chain.
lssong99@reddit
First prompt: explain why I have a home AI rig to my wife.
XMasterrrr@reddit (OP)
Thank you! 😂 I should totally see if I can get that added as a special flair — I’ll wear it with pride until someone dethrones me!
Character_Cut2408@reddit
How much did the entire setup, including everything, cost you?
ip2368@reddit
What fans are those? As a former miner, I'd replace them with 3000rpm Noctuas.
MikeLPU@reddit
There was a guy who bought a server for 150000. :)
ucefkh@reddit
Beautiful
redbull-hater@reddit
Show me the bill please
Chemical-Wafer3133@reddit
can't imagine how much this would cost.
sergen213@reddit
Yes honey, it mines Bitcoin... Yes, yes, the one Elon Musk is promoting....
Deep_Fried_Aura@reddit
"Baby our power bill came in, but I'm not sure how the fit a Harry Potter book inside the envelope".
RegularBre@reddit
Holy Shit!! You're doing this all to serve image generation over the internet?
XMasterrrr@reddit (OP)
Hey everyone, just thought I should post this here while I am taking a break from putting it all together and contemplating my life decisions 😅
I am adding 6 more 3090s to my 8x3090 setup. I have been working on a very interesting project with LLMs and agentic workflows - I talked about it a bit in another blogpost - and realized my AI Basement Server needed some more juice...
I am probably going to write a post about this upgrade later this week, including how I got the PCIe connections to work properly, but let me know if you have any other questions to tackle in this upcoming blogpost.
I am also open to suggestions of how to avoid moving into the basement myself, so let me know :"D
BigCompetition1064@reddit
Curious how you power it?
R-Rogance@reddit
What's wrong with moving? You will be closer to your waifu.
eggs-benedryl@reddit
At least you'll be warm
Due_Town_7073@reddit
It makes the house warmer.
goj1ra@reddit
It makes the planet warmer.
marieascot@reddit
The people of Valencia want your address.
eggs-benedryl@reddit
lmao
_Fluffy_Palpitation_@reddit
Just think of the savings on the heat bill.
Rc202402@reddit
you remind me of the Linus Tech Tips swimming pool heater video
XMasterrrr@reddit (OP)
😂😂😂
Medium_Chemist_4032@reddit
4,2 kilowatts? Perhaps a sauna as a side hustle?
OrdoRidiculous@reddit
Connect the water coolers to some under floor heating.
CheatCodesOfLife@reddit
For what you're doing, do you notice a difference between BF16 and Q8/8BPW with llama 3.1?
seventhtao@reddit
What's the use case for this setup? I read a bit of the blog post but I'm just wondering what end goal you have in mind. Is there a particular software idea you are going to build with this, or is this whole project just for the sake of building and learning?
If you are looking for a possible idea, I've got something that would be excellent. More of a "for all mankind" thing and not so much a "for all the riches" thing.
LordTegucigalpa@reddit
Is this for fun or do you make money from a service you offer?
rustedrobot@reddit
> I am also open to suggestions of how to avoid moving into the basement myself, so let me know :"D
Show her posts of machines much more expensive than yours to demonstrate that it could have been much worse. XD
This makes my 12x look (slightly) tame.
What are you using to power everything? I've got 3x EVGA 1600w+ Gold PSUs for the 12 3090s and have found that any time I'm doing anything taxing I trip the protection circuitry in them. Running 3x 3090s per PSU seems to be working well so far.
Are you managing full PCIe4 speeds for all cards?
XMasterrrr@reddit (OP)
But babe, I am not as bad as the guy with 8x H100 stuck on his hand, she definitely wouldn't appreciate that 😂
On my 8x I went for 3x Super Flower 1600W Platinum. Super Flower is the manufacturer of EVGA's PSUs, and they're really good.
Now with the upgrade, I am going for 5x 1600w. And yes, managing full PCIe4 speeds for all cards, I plan on writing extensively on that in my upcoming blogpost this weekend.
rustedrobot@reddit
Sweet! Can't wait to read it. Def need to unblock a few bottlenecks in my rig.
un_passant@reddit
Nice! I like the frame: would you mind sharing some info about your rig's frame? (Where do you source the parts to attach the components to the metal frame?) I'll try to do something similar for my 8x GPU build.
AlphaEdge77@reddit
ChatGPT Web SEARCH (Ahh who am I kidding, use Google) "aluminum framing"
rustedrobot@reddit
I'll try to post a write-up in this sub some time soon.
some1else42@reddit
Not sure where you live, but I've seen someone make heated flooring with something similar back in the early GPU mining days.
daedalus1982@reddit
You may have answered it elsewhere but do you mind me asking the approximate cost per 3090 that you ended up paying?
GraybeardTheIrate@reddit
As someone who's had trouble running 3 cards on PCI-E, I'd be interested to hear what you're doing there. I'm currently looking at using one of the extra NVME slots to run a PCI-E adapter.
L0WGMAN@reddit
This is great! I started playing with agent zero that the creator posted here and GitHub a while back, I love seeing similar constructions! And the hardware!
I’m running a single tiny model running on a steam deck pretending to be a bunch of large competent models, and you’ve got a flipping data center in your basement…
weallwinoneday@reddit
When AI isnt running, will you mine crypto with this?
synth_mania@reddit
It would likely be unprofitable
El_Minadero@reddit
Put it in a R2D2 shaped trashcan
Mass2018@reddit
I built my wife her own server that she gets to use for her own LLMs. It was remarkably effective.
kryptkpr@reddit
Very interested in riser specifics, eyeing up an H12SSL build to merge my two machines
rustedrobot@reddit
FWIW, I've had luck with C-Payne risers, but for the more distant runs I should have purchased the redrivers instead of a simple riser. I'm stuck at PCIe3 instead of PCIe4 for 4 of the cards because of it. You may want to take a look at the ROMED8-2T board. I'd had the H12SSL for a minute and returned it for the other.
kryptkpr@reddit
What trouble did you run into with the H12SSL?
Four of my GPUs require ReBAR and this was the only motherboard I could find with official vendor BIOS support.
Hunting in the forums reveals there is a secret BIOS for the ASRock board which enables this? But all the links were dead and it seems kinda sketchy.
rustedrobot@reddit
Looks like as of BIOS 3.70 the ROMED8-2T has ReBAR support:
https://www.asrockrack.com/general/productdetail.asp?Model=ROMED8-2T#Download
I went with the ROMED8-2T over the H12SSL primarily because I wanted 12x GPUs and it has 7 PCIe4 x16 slots that I could bifurcate. The H12SSL only has 5 x16 slots and 2 x8 slots. The seventh slot on my rig runs a 4x NVMe card. I couldn't do all that on the H12SSL.
kryptkpr@reddit
I don't know how I missed the official rebar on this one, thanks so much!
These boards are an extra $200 but you do get the two full x16 vs the x8 on the Supermicro 🤔
Did you observe any difference with riser/redriver compatibility between the two boards? I got some cheap-ass dual width x8x8 boards on top of 15-20cm "pcie4" risers from AliExpress, not exactly premium gear over here
robogame_dev@reddit
I like the skeletal setup!
Is that covered by your regular home insurance or do you need a rider for it?
kmouratidis@reddit
Any experiments with PCIe bandwidth throttling? E.g. trying x1 instead of x16, gen 3 instead of gen 4, etc.
Gab1159@reddit
Leave some for us my friend :(
David202023@reddit
Looks amazing. 1. What do you have in mind to do with that? 2. Do you have shares in Con Edison? How much electricity is it going to need?
Zediatech@reddit
"Baby, with all this power and knowledge processing, I will be closer to understanding what it is you really want when you text me"
XMasterrrr@reddit (OP)
LMAO. No way I say that. I am trying to save my ass here man
Blunt_White_Wolf@reddit
Let me rewrite that in a more positive way:
"Baby, all this power and knowledge processing will allow me to learn to better understand your needs and make you even happier"
zyeborm@reddit
You used a llm for that didn't you 😁
Blunt_White_Wolf@reddit
LLM? I used 17 years of marriage :)
Rc202402@reddit
Here, I upvoted to 18. No no, it's ok, no need to thank me. I don't want RTX cards, you can wish me a happy marriage in return, sir :)
Blunt_White_Wolf@reddit
I do wish you a happy marriage. Someone who understands that marriage is a constant negotiation is becoming a rare thing. Take care and stay safe!
Rc202402@reddit
I wish you a happy marriage that lasts a lifetime as well :)
AKAkindofadick@reddit
Let the model explain. You'll certainly be in good standing when they take over
Zediatech@reddit
Damn, you're right, this is much better. :P
StevenSamAI@reddit
Just make sure you have a good answer when she asks "and what did you get for me?"
NobleKale@reddit
'The LLM simulates a husband who isn't a selfish shitlord, so...'
Jesus359@reddit
“She can decide dinner for the both of us!”
Jisamaniac@reddit
Your ass belongs to the LLM, lil man.
UNITYA@reddit
you are done nothing will help you ))
UltimaPathfinder@reddit
I'm a lawyer. I'll be there if you do.
More-Acadia2355@reddit
It'll choose the restaurant on date night.
TroyDoesAI@reddit
I will be better able to predict what you want to eat babe.😻
Top-Salamander-2525@reddit
Not enough GPUs for that…
TroyDoesAI@reddit
Bruh 😎 ikr right! At the end of the day the answer is always 🌮or 🍕or 🍱 but we like to guess.
Top-Salamander-2525@reddit
Aren’t those all euphemisms for what she would like you to eat?
nabokovian@reddit
You won the internet
Larimus89@reddit
lol, you remind me of the guy who fed his chats with his girlfriend into an LLM and asked it how irrational each person was... I'm guessing that guy's single now.
marieascot@reddit
The text said "The grocery store has refused our joint credit card. P.S. I am leaving you, manchild"
Drited@reddit
The right answer is always "I'm sorry".
Whoa, I just realised that Anthropic trains Claude on smart dudes' texts to their girlfriends.
GeneralComposer5885@reddit
** With the “her” being his mother 👵🏻
OrdoRidiculous@reddit
what you want for dinner*
Biomimetec@reddit
With this power I'll be able to read your mind and figure out what you actually want to eat.
__VenomSnake__@reddit
Garbage In, Garbage Out
lopahcreon@reddit
Just to be clear sweetie, I’ll be closer, I still won’t fucking know and it might all be a hallucination anyway.
brewhouse@reddit
It's hallucinations all the way down. Each side had their own pretraining, some let one fine-tune the other, the lucky ones either had compatible pretraining to begin with or come to an understanding from mutual fine-tuning.
Otherwise it's just hallucinations and slop with extra guardrails through social norms.
jfelixdev@reddit
🙄 'mute(brewhouse);'
k2ui@reddit
Bahahahahah
Own_Egg6811@reddit
🤣 woman are smart tho right 😂😂
Future_Brush3629@reddit
there goes the house mortgage
brinkjames@reddit
“It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, they try to pull the plug.” ….
Due_Ebb_3245@reddit
BROO!! Are you preparing for winter!!?? All these RTXs will keep you warm!? That's a very crazy setup! I read your blog, that was so awesome. I have one doubt, how can I input a very large prompt or context?
Like, I have recordings of my professor's class, which I turned into text via WhisperX, but I am not able to feed them to any model. Even if I do feed it, like in GPT4All, I am not getting anything useful out, not even a summary of what he taught. Nothing useful. I tried Llama 3.2 3B Instruct; it talks very well and uniquely, but it is not working as I wanted. Maybe I did something wrong, or maybe I should forget all this and make notes in class...😞🥲
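For what it's worth, one common workaround for the context limit is map-reduce summarization: summarize the transcript in chunks, then summarize the summaries. A sketch, where `summarize` stands in for whatever local model call you use (it is not a real API):

```python
# Map-reduce summarization sketch for a long lecture transcript.
# `summarize` is a placeholder for a call to whatever local model you run
# (GPT4All, a llama.cpp server, etc.).

def summarize(text: str) -> str:
    # Placeholder: ask your local LLM for a concise summary of `text`.
    raise NotImplementedError("wire this up to your local model")

def chunk(text: str, max_chars: int = 8_000) -> list[str]:
    """Split the transcript into pieces small enough for the model's context window."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def summarize_lecture(transcript: str) -> str:
    partials = [summarize(piece) for piece in chunk(transcript)]  # map: summarize each chunk
    return summarize("\n\n".join(partials))                       # reduce: summarize the summaries
```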
LifeTitle3951@reddit
Have you tried notebooklm?
Due_Ebb_3245@reddit
Never heard of it. Why, what is that? Anyway, today I found Google AI Studio, and I was blown away! I did what I wanted; idk why I didn't find it before 😞. I had heard that Gemini had 1M tokens of context, but I never understood what that meant. Now with the new update it has 2M tokens, which can fit 10-15 books, and it's Gemini 1.5 Pro for free on their server! And even after giving it ≈1M tokens, it outputs the way you want. But it has an 8k-token output limit, which you can increase, but I think you shouldn't; instead just generate new output by copying the output, pasting it in the text field, and saying "please continue", and it will say what is left. Really got my job done. Okay, I will check out NotebookLM...
LifeTitle3951@reddit
Please do. It's the buzz in the LLM community and the podcast feature is insanely good.
It basically takes notes and lets you interact with the data through the chatbox. You can have your own study session and analysis.
Due_Ebb_3245@reddit
BRO WHY DIDN'T YOU TELL ME EARLIER IT WAS WAY TOO FAST!!!
I just did a quick test and it was quick as f. But I don't know which one is better... I will need to take some time. Anyways, here is a quick comparison between Google AI Studio vs Notebook. https://rlim.com/sVl_jhxml3
LifeTitle3951@reddit
Just try the podcast feature of notebooklm. It will blow your mind to the 10th dimension.
Caffdy@reddit
Just one more lane bro, one more lane will fix it, I swear
polikles@reddit
line of what? PCI-E, train, code, coke?
MoffKalast@reddit
Yes
invisiblelemur88@reddit
Factorio?
Caffdy@reddit
No, a popular meme about how car culture drives government infrastructure "problem-solving" to keep expanding main transportation arteries instead of addressing the real issues behind traffic jams.
invisiblelemur88@reddit
Ahhhh, a Robert Moses solution, got it.
Cless_Aurion@reddit
What would... trains be in this context... Anthropic's API? lol
alasdairvfr@reddit
THATS WHY I CANT FIND CHEAP FTW3 3090S ANYWHERE!
Desperate-Ad-4308@reddit
"It was a gift." Or: "It's a work project, they gave it to me."
roderickchan@reddit
Tell her it's an investment, a mining machine 😂
UrgentlyNerdy@reddit
Load a chat app on it, and have it come up with an explanation for you!
Ok_Cryptographer4348@reddit
assert dominance
it's your money bro
it's your way or the highway
unless you live in a state like california, if then, gg
visarga@reddit
You scratched the table, u're in trouble.
Street-Coyote9075@reddit
On the bright side, you will not need to run the heat in your house this winter.
Jolly_Lie5906@reddit
bro bought every last evga rtx 3090
marialchemist@reddit
Are 14 3090s better than 7 4090s? Which one's more cost effective in your opinion? How large is your model in terms of parameters based on the rig that you have?
I'm deploying to the cloud bc I don't want to heavily invest in this type of setup yet but I'm curious in terms of training, energy costs, cpus, cooling, etc
_pwnt@reddit
just curious why people spend ~$12k USD on such setups to run local LLMs?
genuinely curious about the benefits, etc.
ladle_of_ages@reddit
I would imagine it's to run customised models that aren't curtailed by rules and guidelines. Also privacy.
throwaway_didiloseit@reddit
12k for that? Seems a bit.... stupid, right?
ladle_of_ages@reddit
"Forbidden" knowledge could be much more valuable than the cost of entry.
throwaway_didiloseit@reddit
You are not gonna get any forbidden knowledge from LLMs, that's my point
I_will_delete_myself@reddit
Bro just rent a local server if you need that many GPUS
Rokett@reddit
What is this for? Personal hobby, a business, curiosity? I see people building these but I don't get the reason behind it.
A few years back, people were mining crypto with it, I got that. I'm still confused about running local LLMs and dropping like $10k.
FuckedUpImagery@reddit
So they can write very realistic literary porn
JaizonIzRael@reddit
Let's say you have a business with a few hundred employees. Your intellectual property is what makes you money. People are using ChatGPT (Teams at least, so your input isn't used to train the model). At 150 users and $30 a month, that's about $54,000 a year. With your own setup, your data isn't used to train someone else's model and you control it (you can audit chat history, etc.).
EarthquakeBass@reddit
Uh, why not just get a GPU server and run Llama there?
SufficientLong2@reddit
Why would all these people need to use ChatGPT?
AdAdministrative5330@reddit
I don't get it. You can get a tenant on Azure that also keeps your data private.
JaizonIzRael@reddit
Private from the LLM itself. So you’d be paying for azure compute lol. You’d be better off with ChatGPT
AdAdministrative5330@reddit
Yes, Azure gives you access to GPT4 models and data privacy. I think OpenAI gives enterprise accounts data privacy.
JaizonIzRael@reddit
And that's great. But you're talking monthly costs that will add up to the price of a powerhouse server running LLMs locally within months.
AdAdministrative5330@reddit
Don't forget the cloud is elastic. You generally pay for what you actually use. It might be difficult to justify if 10K budget gives you 5 years of cloud spend vs being stuck with 10K of depreciating hardware.
Rokett@reddit
I don't think this setup could handle 150 people. Can it?
XMasterrrr@reddit (OP)
It actually can depending on the model and the context. Check out my 2nd blogpost where I go in depth about that https://ahmadosman.com/blog/serving-ai-from-the-basement-part-ii/
Rokett@reddit
I read on, I think, your first post that you use DeepSeek.
It used to be so great, but now the quality has been very bad for me. I use it for coding and all I'm getting is straight garbage.
What is your experience like? I use the API.
photosealand@reddit
To save on AI monthly subscription costs? :P (jk)
__JockY__@reddit
Power. I’m interested in the minutiae of how you’re powering this. It’s very relevant to my future decisions!
EasternMountains@reddit
Would also be curious if OP had to redo the electrical in their house to run this setup. That setup's gotta be around 6000W. I'd have to unplug my oven to power that thing.
jeremyloveslinux@reddit
~25A at 240v. Comparable to an electric dryer, oven, or EV charging. Nuts for a home computer though.
DanzakFromEurope@reddit
Amusing that I could probably run it pretty easily (almost plug and play) in Europe.
jeremyloveslinux@reddit
You’d be maxing out two normal (13A) circuits. It isn’t an insignificant amount of power.
DanzakFromEurope@reddit
We normally have 10A and 16A (for plugs) fuses in my country. So in my home all the plugs in each room are on one fuse. So I could technically use two power outlets that are like 2m from each other to run it 😅.
But yeah, it's still a lot of power and I would probably be checking it with a thermal camera even if the fuses weren't tripping.
wheres__my__towel@reddit
Worth. I’m considering running extension cords from each of my circuits to my setup
jeremyloveslinux@reddit
I’d consider running a 20A 240v circuit to where your build is (if it’s a bit more reasonable than OP’s build). Unless you’re renting, it’s going to be a lot safer.
oodelay@reddit
the power cords are tesla chargers 😂
xantham@reddit
you can run 5 on each circuit
Input power = 350 W / 0.85 ≈ 412 W per card (roughly 350 W of draw through an ~85% efficient PSU). Input amperage = 412 W / 120 V ≈ 3.43 A. Total amperage = 5 × 3.43 A ≈ 17.15 A.
You'd want to double-check your breaker rating at that load. I'd run it on 10 AWG just so your wires don't heat up.
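For anyone redoing that arithmetic with their own card count or voltage, a small sketch (the 350 W per card and 85% PSU efficiency figures are the assumptions from the comment above, not measured numbers):

```python
# Per-circuit load estimate for 3090s on a 120 V circuit.
card_watts = 350                  # assumed draw per card
psu_efficiency = 0.85             # assumed PSU efficiency
volts = 120

input_watts = card_watts / psu_efficiency       # ~412 W pulled from the wall per card
amps_per_card = input_watts / volts             # ~3.43 A per card
cards_per_circuit = 5
total_amps = cards_per_circuit * amps_per_card  # ~17.2 A

print(f"{input_watts:.0f} W -> {amps_per_card:.2f} A per card, "
      f"{total_amps:.1f} A for {cards_per_circuit} cards")
```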
xantham@reddit
he's going to need 3 circuits otherwise the breakers will pop. unless he's running it on a 60amp 220v line with 220v power supplies
Caffeine_Monster@reddit
Also, where the hell is all the PCIe bandwidth coming from? Surely more than one motherboard?
justintime777777@reddit
If it were me, I would do an Epyc board with 7 x16 slots, then use x8/x8 bifurcation risers to get x8 to each of the 14 GPUs from a single system.
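A quick sanity check of the lane budget behind that suggestion (the 7-slot Epyc board and x8-per-GPU split are the commenter's hypothetical layout, not OP's confirmed hardware):

```python
# Lane math for the hypothetical single-socket Epyc build described above.
slots = 7                     # full-length x16 slots on the board
lanes_per_slot = 16
gpus = 14
lanes_per_gpu = 8             # each x16 slot bifurcated into x8/x8 via risers

available = slots * lanes_per_slot   # 112 lanes
needed = gpus * lanes_per_gpu        # 112 lanes
print(f"available={available}, needed={needed}, fits={needed <= available}")
```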
justintime777777@reddit
Back in the day I installed (without permission lol) 2 extra dryer outlets in my apartment for Ethereum mining. This setup could probably run from a single 30A outlet.
spamzauberer@reddit
10000 hamsters in wheels.
gaganse@reddit
Or one giant human-sized one if you need the exercise. I know I sure do after reading this damn subreddit all year.
sourceholder@reddit
I hope there's triple-redundant power for the basement sump-pump & back-up unit.
saraba2weeds@reddit
Would you like one more male girlfriend?
halixness@reddit
Is it just my impression, or is that rack slightly overloaded and tilted?
XMasterrrr@reddit (OP)
Just camera angle, didn't have much space and needed to lean a bit
halixness@reddit
btw 14x RTX 3090, way to go for Llama 405B
ifdisdendat@reddit
I don’t understand. What is your use case ? Mining ?
LordCommanderKIA@reddit
New here, just want to ask why you didn't consider using workstation GPUs like the RTX 6000 for that price?
refinancemenow@reddit
Is this for a job or money making venture that isn’t crypto related? I’m honestly clueless. I have installed and used ollama a bit on my pc and it is neat but I’m way out of my league with understanding your goal here
XMasterrrr@reddit (OP)
Hey guys, I am currently sitting on the floor of my basement troubleshooting 2 GPUs not running as expected. Once done I am going to sleep for a day and then write a blogpost and share the pictures and the process with you. Stay tuned 🫡
DeltaSqueezer@reddit
Curious to see how this will all fit together. Even connecting everything will take some effort! Please update with photos!
XMasterrrr@reddit (OP)
I am currently sitting on the floor of my basement troubleshooting 2 GPUs not running as expected; once done I am gonna sleep for a day and then write a blogpost and share the pictures and the process with you guys. Stay tuned.
MeretrixDominum@reddit
Finally you have enough VRAM to get a right proper AI waifu
Caffdy@reddit
electricsheep2013@reddit
Where is this from? Want to watch unless it is just an a.i. generated image
Caffdy@reddit
It's Bocchi The Rock, but the meme is parodying a scene from Blade Runner 2049, both are worth the watch tbh
ezqu@reddit
bocchi the rock
ozspook@reddit
Runtimeracer@reddit
Damn you were faster... I was gonna write sth like "I think she'll like it if 'her' means your AI Waifu" 😄
ChengliChengbao@reddit
next step is a holographic system, we going bladerunner style
Rc202402@reddit
I think i got OP Covered
Hey u/XMasterrrr get one of these Volumetric Displays and build an AI Assistant (tell your wife it's a 3d flower vase):
https://www.voxon.co/product-page/voxon-vx2
bosbrand@reddit
Visualize the words 'electric bill'...
tailcallrecursion@reddit
Now you need to explain her to this.
liviubarbu_ro@reddit
wow, that's impressive! why do you need to run such a monster LLM locally? you can generate cooking recipes with just one of those … 😆
Grouchy_Gate_9765@reddit
What do you guys with such serious setups do with LLM at home? And if you’re using for work, wouldn’t you want something more professionally built up for reliability and uptime?
hughk@reddit
A lot of crypto miners can tell you that this kind of open construction works OK at home, and better than a DC-type setup, which may concentrate the heat output too much.
chakalakasp@reddit
Just tell her you’re an AI startup. Also, mention it on X so that you get millions of dollars in unsolicited VC money
hughk@reddit
And that is just for the power....
Useful_Hovercraft169@reddit
Those AI girlfriends never look like their profile pic
Enough-Profit-681@reddit
Who is Evga? Explain that first..
Newtonip@reddit
Tell her it's your new heating system for the basement.
hughk@reddit
Great if you live somewhere like Siberia or Northern Canada. Otherwise, you have a new Sauna.
ijustlikeelectronics@reddit
God I thought crypto mining was coming back for a sec
hughk@reddit
It is one of the sources for GPUs and components for building open rigs. More coins now are being created that are GPU antagonistic.
Larimus89@reddit
3080s ?
Man here I am struggling to justify 1x 3090 to my partner.
Curly_Grass_2296@reddit
Hey, I am a total noob to the world of LocalLLaMA, may I ask what you want to achieve with such an investment? I am asking out of pure curiosity, since seeing this makes me wonder what kind of results I can expect from running local Llama on my gaming PC vs this powerhouse lol
XMasterrrr@reddit (OP)
I am currently working on a project that requires both batch inference and training. I did the math and I would have burned the same amount of money in a few months of compute renting so this was the right move for me.
I have a post that touches on that https://ahmadosman.com/blog/serving-ai-from-the-basement-part-ii/
Curly_Grass_2296@reddit
Oh i see, thanks!
fforever@reddit
do that math in terms of the number of shoes it would be possible to buy in the future
xxvegas@reddit
I am genuinely curious why people are building LLM clusters in their basement. Is this compute something you can sell as a service profitably, like on vast.ai? If you genuinely need an LLM to power your business, wouldn't it be better to just use an API or one of those model-as-a-service vendors like fireworks.ai?
SufficientLong2@reddit
I'm also puzzled. It seems to me people are just treating this as pc-building; everyone's excited about the process but no one is actually playing games.
killver@reddit
I know most people here won't like to hear this, but you would be way better off and more flexible just using OpenRouter and paying the API costs. If you are not training models, such a setup is just a waste of money. But if you're having fun, maybe it's worth it.
TeardowntheWall1989@reddit
Then explain the electric bill later? lol
Legal-Menu-429@reddit
Just say it’s a bitcoin mining rig
Ekkobelli@reddit
Honey, I need this to shrink the kids.
IVRYN@reddit
So this is what the crypto miners are doing now
SickElmo@reddit
Plot twist: This is AI generated :D
XMasterrrr@reddit (OP)
My bank account would beg to differ 😅
sixtyeightmk2@reddit
Get a good smoke detector
wingsinvoid@reddit
What hashes do you get from that? What are you using? Claymore?
GaryMatthews-gms@reddit
Nah you don't have to explain anything mate! You can hide it here so she never finds out... :p
Glxblt76@reddit
I'm out of the loop. Why do people do this? What benefit do they get from it? Is it to use AI to mine/stake cryptos?
throwaway_didiloseit@reddit
Because people are dumb, like to buy things and get carried by the hype. I guarantee you this dude will not get even 10% of his money back from these
killerstreak976@reddit
EVGA huh? Nice
DesignToWin@reddit
No need to explain it to Her. Her already understands and approves of your passion for the relationship.
DesignToWin@reddit
Sorry for Capitalizing on this opportunity to crack jokes.
Spoiler alert: (Referring to the AI as "Her" is the joke, for that one outlier who still doesn't get it. Don't worry. There will be more jokes to get later in this AI-generated future.)
shulke@reddit
The question she should ask is why not 4090 super
lanbanger@reddit
Because millionaire vs billionaire, I guess.
polikles@reddit
yes hun/mom, I need this for my school project
depending on her sense of humor you may be allowed to live forever in the basement to chase your dreams and hobbies
Pfaeff@reddit
You need to fire her up first.
Groundbreaking_Rock9@reddit
Tell her you're mining crypto. She won't suspect that you're training models
hugthemachines@reddit
"Because I'm worth it"
fasti-au@reddit
Odd choice. Expensive way to keep things private when you can rent GPUs online cheaper
swiftninja_@reddit
What in abomination
Extension_Flounder_2@reddit
Really confused how you didn’t run out of pcie lanes ..
_sqrkl@reddit
Lol this looks like my first mining rig, except I had a giant ass industrial fan blowing on a shoe rack full of radeon hd 7970s.
gaspoweredcat@reddit
ye gods thats a beast! im more than slightly jealous
katatondzsentri@reddit
When I bought the stuff for my homelab, I just told her after assembly: "This is Pete. Pete lives with us now."
She showed a confused face and left without a word.
Websting@reddit
How do you even begin to explain something like that to a significant other?
Unfair_Trash_7280@reddit
Oh man, you may have a chance to use your "powerful" server to test out the latest Tencent Hunyuan 389B at Q4 (if it gets GGUF'd) to generate a unique & sincere explanation for her
CadeOCarimbo@reddit
Just... Why?
DK305007@reddit
Can it run cyberpunk?
godev123@reddit
I see your llama 405b badge. Are you running that? Is this round about 14-15 3090s total? That’s about ~330 to 350GB right? If so, what quantization and context size are you running? For example the fp16 I use on openrouter is 800+GB in size. 128k context to boot probably bumps it up to a terabyte. Just curious. Because it sounds like you don’t have enough GPUs still haha :)
persona0@reddit
What's the power bill look like with this monstrosity running?
XMasterrrr@reddit (OP)
Since it's helping with heating the house this winter I won't bother checking that 😬😅
persona0@reddit
Lol, multitasking like that
ssjumper@reddit
Are you millionaire? Damn
Itchy_elbow@reddit
Honey I need help… I have no explanation
IcezN@reddit
You need to explain it to me, too. What are you doing with this beast?
vulcan4d@reddit
Don't kid us, she is long gone.
MrHistoricalHamster@reddit
Forgive me for being thick, but what’s the advantage of this? What can you pull off that someone with a local llm + 4090 or a chatgpt subscription etc can’t?
MrHistoricalHamster@reddit
Just tell her it’s a crypto mining rig and it turns electricity into cash, then pray to god she doesn’t try work out the numbers XD.
Lissanro@reddit
Nice! But I counted 14 cards, so I suggest you get 2 more for a nice power-of-two quantity (16). It would be perfect then.
But jokes aside, it is a good rig even with 14 cards, and it should be able to run any modern model including Llama 405B. I do not know what backend you are using, but it may be a good idea to give TabbyAPI a try if you have not already. I run "./start.sh --tensor-parallel True" to start TabbyAPI with tensor parallelism enabled; it gives a noticeable performance boost with just four GPUs, so it will probably be even better with 14. Also, with plenty of VRAM to spare it is a good idea to use speculative decoding; for example, https://huggingface.co/turboderp/Llama-3.2-1B-Instruct-exl2/tree/2.5bpw could work well as a draft model for Llama 405B.
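If anyone wants to talk to a TabbyAPI instance started that way, a minimal client sketch might look like the following. The port, the API key placeholder, and the model name are all assumptions (check your own config.yml and api_tokens.yml); this is not OP's confirmed setup, just an illustration of hitting an OpenAI-compatible local endpoint:

```python
# Minimal sketch of querying a local TabbyAPI server through its OpenAI-compatible
# endpoint. Port 5000, the API key placeholder, and the model name are assumptions;
# adjust them to match your own TabbyAPI configuration.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5000/v1",      # assumed default TabbyAPI address/port
    api_key="YOUR_TABBY_API_KEY",             # placeholder, see api_tokens.yml
)

response = client.chat.completions.create(
    model="Llama-3.1-405B-Instruct-exl2",     # hypothetical model name for illustration
    messages=[{"role": "user", "content": "Explain tensor parallelism in one paragraph."}],
    max_tokens=200,
)
print(response.choices[0].message.content)
```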
_-101010-_@reddit
I believe he mentioned using tools like vLLM and Aphrodite, which support tensor parallelism, enabling effective utilization of multiple GPUs.
Lissanro@reddit
Yes, but with tensor parallelism combined with speculative decoding it should be even faster.
Hearcharted@reddit
This is madness 🤔
Kraken1010@reddit
Mention that you’ll save a lot on the heating bill!
Cless_Aurion@reddit
Not just her, but your kid when they become college-aged.
Latter_Lime_9964@reddit
If you have that kind of money, nope, no explanation is needed.
identicalBadger@reddit
What are you doing with all this gpu power?
therightjon@reddit
This reminds me of mining altcoins back in 2016-2018. It was so thrilling.
ortegaalfredo@reddit
Tell her that the god-emperor Llama-405B forced you to do it.
son_et_lumiere@reddit
"Hey, honey! Guess what? You know that new car that you've been eyeing recently? Well, guess what I got!.... no not that."
_-101010-_@reddit
"Hey honey, I spent the basement remodel money on building an AI cluster". The bare stud look will help with airflow!
ethertype@reddit
Where do you buy 3090s in bulk in late 2024?
sedition666@reddit
I had noticed a massive dip in availability as well. People hoarding them before the 5000 series drop maybe?
Caffeine_Monster@reddit
Well, they aren't building them anymore, neither 3090s nor 4090s, and the big offload from the crypto boom is long past.
We're actually in a weird situation where older GPUs with lots of vram are possibly going to get more expensive if any of the rumors regarding the 5000 prices are true.
PraxisOG@reddit
This guy is the dip in availability
ZoraandDeluca@reddit
Microcenter
ethertype@reddit
6 months ago, yes. Not today.
DeltaSqueezer@reddit
Yeah. I was wondering this too. And what sort of prices do you get. Where I am, new 3090s cost almost the same as new 4090s!
LtCommanderDatum@reddit
If you have money, they will come.
Quartich@reddit
I see 4 count $780 USD free shipping, used, on ebay
Vikare_Mandzukic@reddit
Approximate price?
for comparison, pls
CheapCrystalFarts@reddit
OP has MONEY money.
takuarc@reddit
Tell her this is for heating
IdeaAlly@reddit
Thanks to your gigantic purchase, new tech that puts these in the dust are coming out in just 2 weeks at half the price!! 😆 /usedToHappen
XMasterrrr@reddit (OP)
???
IdeaAlly@reddit
nevermind lol its a joke dude, new shit at half the price coming out after making a huge purchase... never happened to you?
Disastrous_Tomato715@reddit
Do you find it competitive with Claude 3.5 sonnet?
XMasterrrr@reddit (OP)
It is not really the thing I am basing my decision on. I think Claude 3.5 Sonnet is great for coding, but I am also very concerned about data privacy and I expect they'll keep increasing their prices.
I am currently working on a project that requires both batch inference and training. I have a post that touches on that https://ahmadosman.com/blog/serving-ai-from-the-basement-part-ii/
Disastrous_Tomato715@reddit
Awesome. Thanks!
Status_Contest39@reddit
Don't tell her or you're dead.
XMasterrrr@reddit (OP)
Unfortunately a financial institution has already done that deed 😬
bladecg@reddit
This makes me miss my mining rigs 😭
AppropriateYam249@reddit
I hope you guys can make it work ! (You and your electricity bill)
XMasterrrr@reddit (OP)
Thank you kind sir
realsteakbouncer@reddit
Admittedly I'm kind of daft, but I thought using multiple GPUs for llms was basically pointless because each GPU has to hold the full LLM anyway.
XMasterrrr@reddit (OP)
Not accurate at all, you might want to check my blog, I have some good writings on these topics: https://www.ahmadosman.com
coreyman2000@reddit
Why not just get a couple h200?
XMasterrrr@reddit (OP)
Way more expensive
HG21Reaper@reddit
Tell her you're trying to run Doom on it
XMasterrrr@reddit (OP)
I am going to have the best AI based Minecraft server out there babe 😂
LanguageLoose157@reddit
Man, getting insane crypto farm vibe here.
XMasterrrr@reddit (OP)
Check out my blog posts if you're interested in confirming it is not that: https://ahmadosman.com
de4dee@reddit
"we will co author a romance movie and use multimodals to actually generate it and we will watch it together"
(don't try this at home)
XMasterrrr@reddit (OP)
Babe, just wait until you see the system prompt I have written for the scenes, I am using your favorite colors 😃
vTuanpham@reddit
straight to jail
XMasterrrr@reddit (OP)
😂😂😂
Safe_Ad_2587@reddit
Wait, how will you make money with this?
XMasterrrr@reddit (OP)
I am currently working on a project that requires both batch inference and training. I did the math and I would have burned the same amount of money in a few months of compute renting so this was the right move for me.
tokyoagi@reddit
Maybe what you should explain is that cable management.
XMasterrrr@reddit (OP)
Actually the cable management is my best work to date; it looks a lot cleaner without the fans (which are actually very well managed too). Check my first blogpost for cleaner pics: https://ahmadosman.com/blog/serving-ai-from-the-basement-part-i/
Working_Berry9307@reddit
Where do you guys get the money, I can't even comprehend it
chickenofthewoods@reddit
Crypto, bro, bitchcoins and eitherorium.
Hipcatjack@reddit
Being single.
SecuredStealth@reddit
Selling drugs bro
Low88M@reddit
I laugh when I hear all those « Nvidia's greed prevents the normal unlimited growth of my VRAMed personal toy » takes, and I probably cry as much as OP's wife (you should try to batch generate her to cherry pick ;) ) when I see so many 3090s in a single personal computer… I feel speechless. Hopefully I can enjoy your title's humor and laugh at your LocalLLaMA Guinness record. But the resources on Earth are not infinite, and the greed/power of some tends to make prices and unwanted consequences grow as well. But sometimes it's not who we think (Nvidia?)… when mining came, no cards were left and prices began to grow. The resources to make those cards are less and less easy to find/produce and they always have a cost (water, power, geopolitics, farmers nearby, children mining coltan, etc.). Not trending considerations on LocalLLaMA, I imagine. OK, now you can use Meta 405B at 15 t/s… what for? Solve climate change? War conflicts? Poverty? Inequity? Racism? Trump's education? Waifu upscaling? I know you probably have good projects (you wouldn't put that much on those expensive cards). Enjoy.
Doomtrain86@reddit
So… like what do you use it for? I love the setup and the engineering feat in it, but what’s a workflow you use it for? I’m genuinely curious!
MismatchedAglet@reddit
No _you_ don't. It can explain itself now.
RadSwag21@reddit
Can you really get more LLM power than Claude 3.5 or 4o out of these?
Like apart from privacy and security, are there any other benefits to running your own rig? Walk me through it like I'm an idiot. Which ... I am.
YordanTU@reddit
Buy yourself an expensive watch, something she will instantly recognize as a luxury item (Rolex, Omega, Cartier...). It will distract her from these hardware boxes, as they are boring AF anyway. If she criticizes your expenses, you either demonstratively sell the watch and restore the family budget, or you give it to her as a gift with love (if she likes it). Either way, all the action will be around the watch, and these GPUs will be forever forgotten.
stonediggity@reddit
Absolute beast
Kep0a@reddit
How big of a model can you run with this setup?
unknownpoltroon@reddit
I don't know who "Her" is, but to quote my dad: "Oh, yeah, this is gonna come up in the divorce."
orbitranger@reddit
Just tell her you are a crypto bro and it will one day pay off
AutomaticDriver5882@reddit
What motherboard is going to handle all this?
amitbahree@reddit
How much does this cost?
constPxl@reddit
gamers then: grrr those pesky cryptominers!!
gamers now: grrr those pesky localllmers!!
maz_net_au@reddit
I hope this is to power the reddit shitposting bot that someone posted here a month or 2 ago. Imagine the mess it could make!
ParaboloidalCrest@reddit
"Her" being an artificial character you spawn by those cards, will sure understand.
FaceDeer@reddit
And if she doesn't, just edit her context.
SariGazoz@reddit
impressive, but why?
Confident-Ant-8972@reddit
Sweet, now you don't need to pay for that $20/mo subscription!
MathmoKiwi@reddit
Just think of the savings! Every month!
opi098514@reddit
Ooooohhhh so this is why I can’t find any 3090s.
Joe__H@reddit
Ooops. Somebody's gonna be in trouble.
SGAShepp@reddit
Don't worry, you won't have to for much longer.
cosmic_timing@reddit
I need you to explain it to me :D
munderbunny@reddit
Wait why would you spend such an insane amount of money to create this setup? I can't help but wonder if what you're doing will really benefit from it, versus spending a fraction of this money to access services instead.
Ancient-Carry-4796@reddit
Me, trying to snag a 24GB GPU to run a super slow LLM locally but finding it too pricey:
👁️👄👁️
JosephLouthan-@reddit
Okay now you got me. Is there a comprehensive guide to building local llama servers (restricted by space, time, budget or no limits)?
LostNtranslation_@reddit
Halloween decorations I purchased at huge discount due to it being Nov! Congrats!!!!!
yautja_cetanu@reddit
So cool!!
MinecraftPlayer_1@reddit
she tensorflow on my gpu till i overheat
eggs-benedryl@reddit
No need, I'll take them off your hands.
goj1ra@reddit
I'll even pay for the shipping.
LevianMcBirdo@reddit
Her as in your own Scarlett Johansson LLM?
Ok_Combination_6881@reddit
what cpu are you using for this??
tamereen@reddit
Especially if it runs all night in the bedroom :)
fqye@reddit
Winter is coming. This is state of art home heating system.
ArtifartX@reddit
Makes my 5x GPU server (fitting all within a tower case) look like a little baby.
a_beautiful_rhind@reddit
Well.. she's an AI, right? So if you just start a new chat, she won't remember.
ares0027@reddit
My thought process;
(This has nothing to do with OP. This is just how my stupid "brain" worked and I wanted to share. Not accusing anyone of anything.)
Rndmdvlpr@reddit
Are we talking about your wife or the beast of an AI girlfriend you made?
Illustrious-Lake2603@reddit
Daang. I cant even get one @___@
ThePloppist@reddit
Even buying them at a steep discount this is going to be expensive.
Is there any legit practical reason to do this rather than just paying for API usage? I can't imagine you need Llama 405b to run NSFW RP and even if you did it can't be moving faster than 1-2 t/s which would kill the mood.
rustedrobot@reddit
Privacy is the commonly cited reason, but for inference only the break-even price vs cloud services is in the 5+ year range. If you're training however, things change a bit and the break even point can shift down to a few months for certain things.
kremlinhelpdesk@reddit
What if you're nonstop churning out synthetic training data?
rustedrobot@reddit
Using AWS Bedrock Llama3.1-70b (to compare against something that can be run on the rig), it costs $0.99 for a million output tokens (half that if using batched mode). XMasterrrr's rig probably cost over $15k. You'd need to generate 15 billion tokens of training data to reach break even. For comparison, Wikipedia is around 2.25 billion tokens. The average novel is probably around 120k tokens so you'd need to generate 125,000 novels to break even. (Assuming my math is correct.)
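Spelling that arithmetic out so anyone can plug in their own numbers (the $15k rig cost and the $0.99 per million output tokens are the rough estimates from the comment above; electricity is ignored):

```python
# Break-even estimate: local rig cost vs paying per output token on a hosted API.
# All inputs are rough estimates from this thread, not verified prices.
rig_cost_usd = 15_000
price_per_million_tokens = 0.99        # AWS Bedrock Llama 3.1 70B output tokens, per the comment

breakeven_tokens = rig_cost_usd / price_per_million_tokens * 1_000_000
novel_tokens = 120_000                 # assumed average novel length, per the comment

print(f"Break-even at ~{breakeven_tokens / 1e9:.1f}B tokens "
      f"(~{breakeven_tokens / novel_tokens:,.0f} novels)")
# -> roughly 15.2B tokens, ~126,000 novels
```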
kremlinhelpdesk@reddit
At 8bpw, 405b seems like it would fit, though. Probably not with sufficient context for decent batching, but 6bpw might be viable.
rustedrobot@reddit
I have 12x3090 and can fit 405b@4.5bpw w/16k context (32k Q4 cache) The tok/s though is around 6 with a draft model. With a larger quant that will drop a bit.
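Rough VRAM math for why ~4.5 bpw is about the ceiling on 12 cards (a sketch that ignores per-GPU overhead, activation buffers, and the draft model itself, so treat the leftover figure as optimistic):

```python
# Rough estimate of weight memory for Llama 405B at a given exl2 bitrate.
params = 405e9
bpw = 4.5                                  # bits per weight
weight_gb = params * bpw / 8 / 1e9         # ~228 GB of weights
vram_gb = 12 * 24                          # 12x RTX 3090 = 288 GB total

print(f"weights ~{weight_gb:.0f} GB of {vram_gb} GB, "
      f"~{vram_gb - weight_gb:.0f} GB left for KV cache, draft model, and overhead")
```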
kremlinhelpdesk@reddit
I might be too drunk to do math right now, but that sounds like about twice the cost of current API pricing over a period of 5 years. Not terrible for controlling your own infrastructure and guaranteed privacy, but still pretty rough.
On the other hand, that's roughly half the training data of llama3 in 5 years, literally made in your basement. It kind of puts things in perspective.
Pedalnomica@reddit
Hobby, and privacy are big ones, but the math can work out on the cost side if you are frequently inferencing, especially with large batches. Like, if you want to use an LLM to monitor something all day every day.
E.g. Qwen2-VL, count the squirrels you see on my security cameras -> Llama 405B, tell Rex he's a good boy and how many squirrels are outside -> TTS
The API prices are often pretty steep. However, maybe you can find free models on OpenRouter that do what you need.
Select-Career-2947@reddit
Probably they're running a business that utilises them for R&D, or for customer data that needs to be kept private
EconomyPrior5809@reddit
yep, grinding through tens of thousands of legal documents, etc.
weallwinoneday@reddit
Whats going on here
Darkz0r@reddit
That's amazing.
Could you use that to build your dream game? Have the agents present you with each major decision and how it would be implemented in the game? Then you could approve it or not and keep building.
That's what I would do hehe.
icystew@reddit
“Winter is coming and we were spending too much on heat so I figured I’d do the smart thing”
trollsmurf@reddit
"I'll run an LLM that will provide you with advice. Mostly hallucinated, but still. It will also enhance our electricity bill."
No_Goat_5701@reddit
Must be nice to be rich
Frizzoux@reddit
Bro is a millionaire
Hubrex@reddit
"But dear, all of this hardware to run an AI locally will be obsolete in less than a year. It's worth it, you'll see!"
Would be the tactless and suicidal way.
"This hobby is cheaper than that speedboat I was saving for."
Will save your nuts. Just don't tell her the rig will suck more juice than running an oven 24/7. And it'll heat your home this winter!
Comms@reddit
That one eBay seller: Cha-ching!
Pleasant-PolarBear@reddit
Then explain to the power company why your house is pulling 8,000 more watts suddenly
incjr@reddit
but can it run Crysis?
SecuredStealth@reddit
Can it “create” Crysis?
martinerous@reddit
It can hallucinate Crysis.
squareOfTwo@reddit
That's the new IQ120 question. Very good.
GoodKarma70@reddit
"It's for science"
Herr_Drosselmeyer@reddit
You're so screwed but at least you'll have a really high quality digital waifu to console you and calculate the alimony payments. ;)
khidot@reddit
Yikes, looks a bit dusty!
AgTheGeek@reddit
Oh damn! You guys have figured out how to run multiple GPUs for models?! Damn I’ve been away a minute and things get better so fast!
Anyone got a good tutorial on running multiple GPUs? I don’t have the best but they’re not terrible 🤣
goatchild@reddit
gg
tspwd@reddit
"Babe, we can't buy a new car this year. But you can use something like ChatGPT for free now. Not with an app, though. You have to use the Terminal on your computer."
gcubed@reddit
We're coming into winter and this is a great way to cut down our heating bill.
QuantumTyping33@reddit
holy rich
laveshnk@reddit
Look how they massacred my boy...
*sad EVGA noises*
fahadirshadbutt@reddit
Needed it for homework
ZoobleBat@reddit
Have the llm do it.
LatestLurkingHandle@reddit
Move to Dubai and get solar
Armym@reddit
As someone who also owns a similar abomination: you win, sir. How can you even connect this to one motherboard?
FuriousBugger@reddit
Don’t overthink it. Keep it short and sweet. Like something you could fit on a tombstone.
iamlazyboy@reddit
ngl, if I had infinite money, I'd do that: put all of them but one in a local server (wondering what I'd even do with it, probably ending up running the biggest LLM model I could find on it just for the fun), and keep one for my gaming PC just to watch videos and browse Reddit (but tbh, if I had infinite money, it'd be 4090s, but you get the gist lol)
kremlinhelpdesk@reddit
Infinite money, and a 4090 is still too expensive to game on.
Raywuo@reddit
So baby, the mitochondria is the powerhouse of the cell ...
Pristine_Swimming_16@reddit
hey honey, you can run nsfw now.
kremlinhelpdesk@reddit
Life goals.
roz303@reddit
At least it isn't crypto mining?
throwaway_didiloseit@reddit
You can repurpose them for that as soon as the AI bubble bursts
roz303@reddit
Sad but true lmao
proxiiiiiiiiii@reddit
Just write a good prompt and explaining it to her will be really easy
throwaway_didiloseit@reddit
Someone is gonna regret this in less than a year.
Bchi1994@reddit
I don’t understand this… if you can’t pool 3090 VRAM, what is the point?
Upset-Ad-8704@reddit
What is your use case for this? Genuinely curious to see whether I should start drafting my explanations.
denyicz@reddit
Bro. If you are not rich af, not planning to make money from it, and not a researcher, you wasted your money.
yoshiK@reddit
Just mumble something about "got it pretty cheap," she will assume that means something like $50 each and get only a bit mad about you wasting hundreds of dollars.
e79683074@reddit
And if you ever get past this part, you'll also have a lot of explaining to do when the first electrical bill comes
On-The-Red-Team@reddit
I hope you have a solar power setup for commercial infrastructure use. Otherwise, your power bill is going to be more than some people's house payments.
ali0une@reddit
Maybe you could just ask your LLM?
Psychological-One-6@reddit
Just build a boat shed, and put it in there, and say you bought a boat.
VitorCallis@reddit
cool, but can it run crysis?
Slimxshadyx@reddit
This is pretty awesome, can I ask what it is you are using it for? I know language models, but what specifically?
TroyDoesAI@reddit
I mean did you explain that they are EVGA?
TroyDoesAI@reddit
Yeah, she's gonna be real disappointed with that wire management. As someone who has lived the life of an electrician, I can see why it would bother her. You gotta keep it clean and tidy.
Massive_Robot_Cactus@reddit
This reminds me of the computer in the movie Pi.
ambient_temp_xeno@reddit
Demon Seed.
Over-Dragonfruit5939@reddit
“Baby, I’m uploading my consciousness to the cloud.”
Capable-Reaction8155@reddit
Is this personal or are you part of a company GD
Synyster328@reddit
I hope that thing is anchored so it doesn't take off when the fans start going.
gaganse@reddit
You don't. Just throw a sheet over it. The power bill, though…
dacash1@reddit
better call Saul
Theverybest92@reddit
Just say it's to run a holographic instance of your favorite Pr0n model =D.
Natural-Fan9969@reddit
Ask whatever model you are using to give you a nice explanation to give her.
Blind_Dreamer_Ash@reddit
Good luck
vd853@reddit
That's like $30k right?
trisul-108@reddit
It doesn't matter what you say, she will never hear a single word, just the fans.
Express-Dig-5715@reddit
Explain the power bill, not the hardware. Tell her you bought it for 500 bucks; the power bill will be the good one.
LtCommanderDatum@reddit
"When I win the lottery, I won't say anything, but there will be signs."
falls asleep on a pile of 3090s
Low-Ad4807@reddit
I'm curious what kind of motherboards support that many GPUs. Are those the same as mining rigs? I'd appreciate it if anyone has some references/materials for this
AdDizzy8160@reddit
... why not, let her explain it to her?
badabimbadabum2@reddit
That thing is her.
Den32680@reddit
Glad someone found a use for all the antiquated eth mining rigs
norsurfit@reddit
"As an AI language model, I am afraid I am not allowed to answer that question, honey!"
Roubbes@reddit
Ask Llama 405B how to explain it to her.
DigThatData@reddit
That pile of boxes in the corner looks like a fire hazard
Plane_Ad9568@reddit
Anyone making money off these?
ThisWillPass@reddit
Babe, think of all the deals this can get us to resell once I get it going.
griff_the_unholy@reddit
Honestly, at this point I wouldn't even try.
junior600@reddit
And then there’s me, who would be happy just to have even one of those. :>
UnusualK19@reddit
Why do you need it?
MrTurboSlut@reddit
i have so many questions. what are you using this for?
volschin@reddit
You will have it warm in winter. 😂😅
mlon_eusk-_-@reddit
Pov day one after winning lottery
hurrdurrmeh@reddit
I am SO HARD rn.
AdamLevy@reddit
Now you can ask LLaMA to do all explanations on your behalf
two5309@reddit
What motherboard are you using? I see in your post that it was a 7 slot, are you splitting the lanes for the new ones?
dhrumil-@reddit
Bro i just want one I'll be happy lol
Level-Acid@reddit
Tell her you need to talk to someone
nefarkederki@reddit
Guys, I'm wondering: what is the strategy here to make money? Putting them on vast.ai or something similar, you would need a lot of time for ROI, wouldn't you?
CantankerousOrder@reddit
How to avoid moving down there? Given what you've spent so far, you can probably afford to furnish a nice little space for her down there to enjoy whatever hobbies she has.
thisoilguy@reddit
Winter is coming, need a new heater 🤣
son_et_lumiere@reddit
"well, the furnace went out, and the repair man said it'd be $15k to install a new furnace. So, I thought why not just handle two things at once"
devious_204@reddit
Why are you asking us when you have one hell of a crazy llm rig right there
iamthewhatt@reddit
Oh this? Its uh... crypto. Yeah, crypto. wipes away sweat
SupplyChainNext@reddit
No you don’t
rishiarora@reddit
Wow. Congrats man
TamSchnow@reddit
„This is just a fancy space heater which can also do other stuff“
SirPizzaTheThird@reddit
Would be fun to see a demo of the output and the use case in action.