Got really lucky and need your advice
Posted by Amos-Tversky@reddit | LocalLLaMA | 53 comments
So, I got the chance to get either a rig of like 8 RTX PRO 6000s or the GB300. Which should I take? It's gonna be used by like 10 people, but I'm the primary user.
beasthunterr69@reddit
Def RTX Pro
AnonsAnonAnonagain@reddit
Um. No! Dude!? GB300 is an RTX Pro 6000 killer
victoryposition@reddit
288GB vram vs 768GB vram. Depends what he's running. Kimi 2.6 would be slow as shit on 288GB
a_beautiful_rhind@reddit
Yea, I'll take the more vram. Especially for concurrent requests.
entsnack@reddit
Check the power consumption and cooling requirements, my guess is the GB300 comes out on top. I find Nvidia workstation and consumer GPUs too power hungry without underclocking.
DataGOGO@reddit
GB300 absolutely no question.
SomeGuy20257@reddit
IMHO 8 PRO 6000s so you can flexibly scale down (sell some)
Bitter_Housing2603@reddit
Why would you need to scale down in a team of 10 people?
SomeGuy20257@reddit
In the future, when that ten possibly becomes fewer.
seamonn@reddit
Did you just extrapolate job loss due to AI?
SomeGuy20257@reddit
No, I was not even thinking about that. I was saying factors like team size can change, points of failure, etc… basic risk management.
FatheredPuma81@reddit
Because some companies fail just enough to only need to let go of a few people?
Blaze6181@reddit
8 pros is the sweet spot. Keep em all. Run big models fast. You'll never need a subscription or external API call again.
brickout@reddit
In the nicest possible way, fuck off.
Just kidding, good for you, and I can't possibly help you decide. That's way out of my pay grade.
Ribido@reddit
Probably the unpopular opinion here but I think the DGX Station is the better call. I've worked with a lot of RTX6000 pros and something few people talk about is how they aren't in the datacenter family and don't get the same support out the gate with new models. I recently moved to H200s and the increased speed from HBM3 is noticeable and the ability to have a larger VRAM pool via NVLINK is really nice.
All that being said, I haven't used the Station, and I don't know if it'll have the compatibility issues the 6000s have, but if it's treated like a datacenter part I'd go for that. The only thing I'd take over the Station would be an H200 setup as they're well established, but that's not your question.
Amos-Tversky@reddit (OP)
I’d love to get Hoppers, but they really aren’t available. And getting pre-owned isn’t really a possibility considering we’re financing this.
gotaroundtoit2020@reddit
8xRTX6000 gets you more overall GPU memory (GDDR7 rather than HBM), but you'll be talking over the PCIe bus between the GPUs, so you'll likely want to double check the interconnect between those GPUs.
There are the typical many small vs one big arguments. A failure of one small doesn't take you out. Having to share with the others allows you to more easily partition the many small (though I believe the GB300 does have MIG support).
The GB300 being basically the same across the OEMs means that you have some additional community benefit (like on the sparks) and it's running DGX OS so the similar arguments for the Spark / nvidia ecosystem / step up to the big clusters would apply too.
The GB300 has the 800G network ports so when you get funding for your second GB300 you could connect them up like you can connect up multiple sparks. :)
8xRTX6000 is 4800W (or half that if you are talking Max-Q), not including the server chassis itself. The GB300 is 1600W power supply. So it's possible to have the GB300 under your desk.
The GB300 has a BMC port but I haven't found any docs on what remote management capability it actually has. Depending on where the water pump is, it may need to stand vertically and so that'll kill a bunch of space if you put it in a standard rack. Your server probably has dual power supplies and other redundant components.
I don't know how the software stack is going to handle the mix of 252GB of HBM and the 496GB of LPDDR. That may result in software issues that will need to get worked out, while doing EP/TP of 8 should be fairly common.
The 8xRTX6000 is probably shipping now. No idea on when GB300s will actually be shipping.
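To make the power numbers above easier to compare, here is a rough sketch of the arithmetic. The per-card wattages are simply the quoted totals divided by 8, the GB300 figure is the power supply rating rather than measured draw, and the function names are made up for illustration.

```python
# Rough power comparison using only the figures quoted in this comment.
RTX_PRO_6000_WATTS = 600       # per card, implied by 4800 W / 8
RTX_PRO_6000_MAXQ_WATTS = 300  # "half that" for the Max-Q variant
GB300_PSU_WATTS = 1600         # power supply rating, not measured draw

def gpu_power_watts(num_gpus: int, watts_per_gpu: int) -> int:
    """Total GPU power in watts, excluding the server chassis."""
    return num_gpus * watts_per_gpu

def kwh_per_day(watts: float) -> float:
    """Energy used in 24 hours at a constant draw (a worst-case assumption)."""
    return watts * 24 / 1000

full = gpu_power_watts(8, RTX_PRO_6000_WATTS)
maxq = gpu_power_watts(8, RTX_PRO_6000_MAXQ_WATTS)

print(f"8x RTX PRO 6000: {full} W, {kwh_per_day(full):.1f} kWh/day")
print(f"8x Max-Q:        {maxq} W, {kwh_per_day(maxq):.1f} kWh/day")
print(f"GB300 (PSU):     {GB300_PSU_WATTS} W, {kwh_per_day(GB300_PSU_WATTS):.1f} kWh/day")
```

Multiply the kWh/day figure by your own electricity rate to get a running cost; real draw under inference load will usually sit below these ceilings.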
xXy4bb4d4bb4d00Xx@reddit
rtx pros, i run a farm of them - very profitable and useful
Amos-Tversky@reddit (OP)
Profitable 👀
xXy4bb4d4bb4d00Xx@reddit
yeah with 8x rtx you'll make about 110 USD per day on open markets at basement rates
anomaly256@reddit
4.5 years to break even isn't profit, it's a moderate subsidy 😛
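The break-even arithmetic behind that figure can be sketched like this. The 110 USD/day and 4.5 years come from the comments above; the hardware cost is back-calculated from them, and the 100K case uses the budget OP mentions elsewhere in the thread. Power, cooling, and idle time are ignored, so treat it as an optimistic bound.

```python
DAYS_PER_YEAR = 365

def break_even_years(hardware_cost_usd: float, revenue_per_day_usd: float) -> float:
    """Years of rental income needed to pay back the hardware (no running costs)."""
    return hardware_cost_usd / revenue_per_day_usd / DAYS_PER_YEAR

# Hardware cost implied by "110 USD per day" and "4.5 years to break even".
implied_cost_usd = 110 * DAYS_PER_YEAR * 4.5

print(f"Implied hardware cost: ${implied_cost_usd:,.0f}")
print(f"Break-even at $100k:   {break_even_years(100_000, 110):.1f} years")
```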
xXy4bb4d4bb4d00Xx@reddit
for a new player like OP, it's nice extra money, especially if their power rates are good
i started on open markets and grew to doing direct longer-term dedicated compute deals with mid-sized enterprises. that's where it becomes really profitable
yes2matt@reddit
How do you compete with the big boys?
1kaze@reddit
Sick setup
FatheredPuma81@reddit
I love how I have to use AI to actually get the info I'm looking for because Nvidia says stuff like the GB300 having 20TB of GPU memory (if it does, go for that). Anyways, it looks like the GB300 is probably way faster (is my guess) and uses 1/4 the power but has way less VRAM (288GB vs 768GB)?
Not sure how accurate any of this is, but the RTX 6000s seem like the no-brainer to me since they allow you to run literally any model on the market right now at Q8_0. Worst case scenario, if it's too slow you split it into 2.
kmouratidis@reddit
252 GB HBM3e + 496 GB, LPDDR5X, 6400 MT/s, SOCAMM
FatheredPuma81@reddit
Okay I would go with this then because 20TB+17TB is better than that pathetic 252GB+496GB you mentioned.
|Spec|Capacity|Bandwidth|
|:-|:-|:-|
|GPU Memory|20 TB|Up to 576 TB/s|
|CPU Memory|17 TB LPDDR5X|14 TB/s|
https://www.nvidia.com/en-us/data-center/gb300-nvl72/
And it's not like the 496GB of RAM on a 72-core ARM chip changes anything. The RTX 6000s will still be faster with larger models if you offload to CPU.
Ambitious-Profit855@reddit
Same table, 3 rows above: "Configuration 72 NVIDIA Blackwell Ultra GPUs, 36 NVIDIA Grace CPUs"
72 of them end up at 20TB. Even your link mentions "nvl72"
I don't know why I feel the need to correct a llama 2 bot, guess this is what the Internet has come to.
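A quick sanity check of that point, using only numbers already quoted in the thread (72 GPUs per NVL72 rack, 288GB per GPU, "20 TB" on the rack page). The decimal TB conversion is an assumption about how the marketing figure is rounded.

```python
GPUS_IN_NVL72 = 72      # "72 NVIDIA Blackwell Ultra GPUs" from the linked page
HBM_PER_GPU_GB = 288    # per-GPU figure quoted earlier in the thread
RACK_MARKETING_TB = 20  # "20 TB" GPU memory from the same table

# Going up: one GPU's memory times 72 should land near the rack figure.
rack_total_gb = GPUS_IN_NVL72 * HBM_PER_GPU_GB
rack_total_tb = rack_total_gb / 1000  # assuming decimal TB

# Going down: the rack figure divided by 72 should land near one GPU.
per_gpu_gb = RACK_MARKETING_TB * 1000 / GPUS_IN_NVL72

print(f"72 x {HBM_PER_GPU_GB} GB = {rack_total_tb:.1f} TB")
print(f"{RACK_MARKETING_TB} TB / 72 = {per_gpu_gb:.0f} GB per GPU")
```

Either direction lands in the same ballpark, which is consistent with the 20 TB being a whole-rack number rather than anything a single workstation has.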
FatheredPuma81@reddit
The irony of an actual bot trained off Twitter calling me a bot...
Like yea no shit? That's why I decided I wasn't dealing with it and let Claude decide what configuration this guy was using because I was not interested in figuring out what the typical configuration for hardware I'd never touch was...
AnonsAnonAnonagain@reddit
You must not be very good at searching.
https://nvdam.widen.net/s/jnkrzwnqhj/dgx-station-datasheet
truth_is_power@reddit
llama2 bot...holy crap don't hold back
FatheredPuma81@reddit
Oh and that much VRAM would also let you sideload things like image generation models if you choose to run a model that fits on say 4 of the GPUs.
AnonsAnonAnonagain@reddit
Get the DGX Workstation GB300. You would be foolish not to!
It’s got 7TB/s VRAM and like 390GB/s unified RAM as well.
Plus you can always add RTX Pro 6000 to it later!
Kooshi_Govno@reddit
First read this article: https://medium.com/data-science-collective/benchmarking-llm-inference-on-nvidia-b200-h200-h100-and-rtx-pro-6000-66d08c5f0162
The B300 is the B200 with more VRAM.
second: 8x rtx pro vs HOW MANY GB300? just one? Is it even possible to buy just one?
if it's 8 v 1, i.e. your total system budget is 200k or less, go for the 6000s.
if there's no budget limit, and you can buy 8x B300, then you'd be an idiot not to. They're so much faster per dollar it's insane.
source: I did this exact analysis for my company this week.
Amos-Tversky@reddit (OP)
B300 is the data center GPU. GB300 I’m talking about is the DGX GB300 Workstation (should’ve said). Budget is around 100K
Kooshi_Govno@reddit
Ah, that's on Nvidia for their confusing naming. When I searched GB300, a cluster of 72 GB300s kept coming up.
In that case I agree with everyone else. The 6000s win out. One B300 is roughly 4x faster than one 6000 according to the benchmark article. I'll assume the same is true for the GB300.
So 8x 6000 still beats it.
Beware the price of system RAM though, it'll eat the rest of your budget before you know it.
At $100k, you might only be able to fit 4 6000s in your system budget, in which case the DGX GB300 wins on both performance and RAM at 748GB.
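Here is that comparison as a back-of-the-envelope sketch. It assumes throughput scales linearly with card count (it won't quite, given PCIe overhead), takes the "roughly 4x" figure from the benchmark article as quoted above, and derives 96GB per card from the 768GB total mentioned in the thread. The function and dictionary names are made up for illustration.

```python
RTX_PRO_6000_VRAM_GB = 96    # 768 GB / 8 cards
GB300_HBM_GB = 252           # fast GPU memory
GB300_LPDDR_GB = 496         # slower CPU-side memory
B300_SPEEDUP_VS_6000 = 4.0   # "roughly 4x", assumed to carry over to the GB300

def rtx_rig(num_cards: int) -> dict:
    """Memory and naive relative throughput for an N-card RTX PRO 6000 rig."""
    return {
        "vram_gb": num_cards * RTX_PRO_6000_VRAM_GB,
        "relative_throughput": float(num_cards),  # linear scaling is an assumption
    }

gb300 = {
    "hbm_gb": GB300_HBM_GB,
    "total_gb": GB300_HBM_GB + GB300_LPDDR_GB,
    "relative_throughput": B300_SPEEDUP_VS_6000,
}

for cards in (8, 4):
    rig = rtx_rig(cards)
    print(f"{cards}x 6000: {rig['vram_gb']} GB VRAM, {rig['relative_throughput']:.0f}x one card")
print(f"GB300:   {gb300['hbm_gb']} GB HBM ({gb300['total_gb']} GB incl. LPDDR), "
      f"{gb300['relative_throughput']:.0f}x one card")
```

Under these assumptions a 4-card rig roughly ties the GB300 on throughput and only loses on memory if you count the LPDDR, so the 748GB figure depends on how well your software actually uses the mixed pool.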
I1lII1l@reddit
What's NSFW about this?
Sofakingwetoddead@reddit
getting lucky
milkipedia@reddit
It's obscene that this kind of hardware is just falling into his lap
Amos-Tversky@reddit (OP)
It was either this or spoiler. This seemed more appropriate.
mahiatlinux@reddit
They aren't mandatory 😂
Tsofuable@reddit
Why would those be the only options?
Amos-Tversky@reddit (OP)
B300 is the data center GPU. GB300 I’m talking about is the DGX GB300 Workstation (should’ve said)
BestSentence4868@reddit
GB300 running nvfp4 kimi
flobernd@reddit
It does not fit. Kimi K26 checkpoint is in native int4, so it requires exactly 8x RTX Pro 6000 Blackwell to run. On these cards you’ll get between 70 and 120 t/s with vLLM and MTP.
LulzyAnimal@reddit
It depends on what you want to run. A single Kimi instance: RTX 6000s, just for the VRAM. M2.7 (as something that fits in 250GB) serving 10 people ultra-fast: GB300. Parallel Qwen for 10 people: RTX 6000s again, as they have better combined bandwidth/compute when running multiple independent instances in parallel. Also, training will be a very different story from inference, but I guess that's not your case, at least as of now.
Dany0@reddit
GB300 for training large models and research, RTX Pro 6000 for small experiments and inference. It's really not a contest
Also there are two GB300s (if you don't count the giant 72-GPU server unavailable even to the GPU-rich): the DGX Station kind with one GPU and the server rack with two GPUs. The server rack makes way more sense in power/perf.
FormalAd7367@reddit
Sell some of them and put the money in an index ETF? Why would you need 8 RTX 6000s at home? Do you have enough power supply? Air conditioning running 24/7?
Agreeable_System_785@reddit
Also think about cooling and energy costs when buying these types of systems that are on 100% of the time.
Equivalent-Repair488@reddit
My chest tingles every time I see one of these posts.
i_am__not_a_robot@reddit
I would definitely get the DGX GB300.