Mismatch GPU worth it?
Posted by Ill_Ad_4604@reddit | LocalLLaMA | 8 comments
I have an RTX 8000, an RTX 4000 Ada, and a half dozen or so P2200s. Would it be worth using them together in a cluster, or would the P2200s bottleneck everything, so that I'd be better off using the cards independently for different workloads that fit on each card?
a_beautiful_rhind@reddit
Depends on the backend. Probably the 4000 and the 8000 together are worth it. The Pascal cards are only 30GB all together. Perhaps put them in a separate machine, because they need a different driver and torch build. Or possibly use them to run TTS/STT, embedding, or image support models. How expensive is your electricity?
Ill_Ad_4604@reddit (OP)
I don't pay for the electricity at that location
a_beautiful_rhind@reddit
Then go all out.
the-supreme-mugwump@reddit
You can run a smaller model on each, have your prompt go to each model and vote on the best reply, or route prompts by complexity or specialty to certain models: send coding prompts here, send image prompts there, etc.
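The "route by specialty" idea above can be sketched in a few lines. This is a hypothetical illustration: the model endpoints, URLs, and keyword lists are made-up placeholders, not real servers, and a production router would use a small classifier model rather than keyword matching.

```python
# Hypothetical prompt router: pick a backend per GPU by crude keyword
# matching. All hostnames/ports below are placeholder assumptions.

ROUTES = {
    "code": "http://gpu-8000:8001/v1",    # big coder model on the RTX 8000
    "image": "http://gpu-4000:8002/v1",   # vision model on the RTX 4000 Ada
    "chat": "http://gpu-p2200:8003/v1",   # small chat model on a P2200
}

KEYWORDS = {
    "code": ("code", "python", "function", "bug", "compile"),
    "image": ("image", "picture", "photo", "diagram"),
}

def route(prompt: str) -> str:
    """Pick a backend by keyword; default to the small chat model."""
    lowered = prompt.lower()
    for specialty, words in KEYWORDS.items():
        if any(w in lowered for w in words):
            return ROUTES[specialty]
    return ROUTES["chat"]

print(route("Fix this Python function"))  # coder backend
print(route("Describe this image"))       # vision backend
print(route("Hello there"))               # default chat backend
```

The "vote on best reply" variant would instead send the prompt to every backend and pick among the answers, e.g. by majority agreement or a judge model.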
Ill_Ad_4604@reddit (OP)
So maybe open claw with a smaller model to route, and use each card as a sub-agent?
the-supreme-mugwump@reddit
Yes. You should also check your PCIe lane speeds. If you have a few full-speed slots, you can have the 8000 and 4000 work on a bigger model together and hand off smaller things to the P2200s.
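One way to check the lane speeds mentioned above, assuming the NVIDIA driver is installed (`nvidia-smi` ships with it, and these are standard query fields):

```shell
# Show current PCIe generation and link width for each GPU
nvidia-smi --query-gpu=name,pcie.link.gen.current,pcie.link.width.current \
           --format=csv
```

Note the *current* link can idle at a lower generation than the slot's maximum; `pcie.link.gen.max` and `pcie.link.width.max` show the ceiling.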
Edenar@reddit
The P2200s (Pascal, 5GB at 200GB/s) aren't worth using, IMO. The RTX 4000 Ada is the best choice since it's the most recent; maybe you can use it together with the older RTX 8000 with the right framework (Vulkan?), but just don't try to add the 5GB cards, IMO, as they will probably cripple the whole thing. But as others suggested, maybe you can run smaller specialized models on them.
tmvr@reddit
How do you want to connect them? Into the same board the 8000 and 4000 are in? You do get some more VRAM, even if it's only 5GB per card, and the bandwidth is 200GB/s, so probably better than what your system RAM gives you. But you already have 48GB + 20GB from the big cards. I mean, is there something where that 68GB of VRAM is just barely limiting you? If yes, add 5-10-15GB with the P2200s, but those would probably be better for running smaller models, up to 4B, or maybe a quantized 9B on two of them.
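The sizing argument in the comment above (a 4B model on one 5GB card, a quantized 9B across two) follows a common rule of thumb: weights take roughly params × bits-per-weight / 8 bytes, plus headroom for KV cache and activations. A minimal sketch, where the 20% overhead figure is a loose assumption rather than a measured number:

```python
def fits_in_vram(params_b: float, bits_per_weight: float, vram_gb: float,
                 overhead_frac: float = 0.2) -> bool:
    """Rough check: weight size plus ~20% headroom for KV cache/activations.

    params_b is the parameter count in billions; the overhead fraction is
    an assumed ballpark, not a measured figure.
    """
    weight_gb = params_b * bits_per_weight / 8  # e.g. 4B at 8-bit ~= 4 GB
    return weight_gb * (1 + overhead_frac) <= vram_gb

# One P2200 (5 GB): 4B model at 8-bit -> ~4.8 GB needed, just fits
print(fits_in_vram(4, 8, 5))    # True
# Two P2200s (10 GB): 9B model quantized to ~5 bits -> ~6.75 GB, fits
print(fits_in_vram(9, 5, 10))   # True
# 9B at full 16-bit on one P2200 -> no chance
print(fits_in_vram(9, 16, 5))   # False
```

In practice context length matters a lot: a long context can blow the KV cache well past 20%, so treat this as a first-pass filter only.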