Group Buys for Shared Compute or Model Hosting? Is this a thing?

Posted by JustinPooDough@reddit | LocalLLaMA

I've been using GLM 5.1 a lot lately, and I love this model. However, I don't love sending all my requests to China. I'm not freaking out about it, but it's not ideal. Ideally, I wouldn't send my data to any provider at all.

Given the current cost and availability of cloud compute, it looks to me like someone could theoretically orchestrate a "Group Buy" to rent something like a cluster of 8x H100s, maybe 16x. Unless Gemini has failed me, this would be enough to host GLM 5.1 at FP8.

My questions are:

  1. Is anyone doing this - or has anyone tried to do this?

  2. If you wanted to bring costs down to say 50 bucks a month per user, how many users would you need?

  3. Would the hardware support this at a reasonable t/s?
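
For question 2, the break-even math is simple enough to sketch. This is a rough back-of-the-envelope calculation, not a real quote: the $2.00 per GPU-hour rate is a placeholder I made up for illustration, `users_needed` is a hypothetical helper name, and it assumes the cluster is rented 24/7 with no overhead for storage, bandwidth, or whoever administers it.

```python
import math

HOURS_PER_MONTH = 730  # average month: 8760 hours per year / 12


def users_needed(gpu_count, usd_per_gpu_hour, usd_per_user_month,
                 hours=HOURS_PER_MONTH):
    """Minimum number of subscribers needed to cover the monthly rental bill."""
    monthly_cost = gpu_count * usd_per_gpu_hour * hours
    return math.ceil(monthly_cost / usd_per_user_month)


# ASSUMPTION: $2.00 per GPU-hour is a placeholder, not a real price.
print(users_needed(8, 2.00, 50))   # 8 GPUs  -> 234 users
print(users_needed(16, 2.00, 50))  # 16 GPUs -> 468 users
```

Swap in an actual hourly rate to get a real number. Whether that many users could share one cluster at a usable t/s is a separate question that this sketch does not answer.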

Genuinely curious, and I would personally be interested in such a deal. I imagine you would want to auto-ban open-claw users or anyone clearly abusing the API, or at least segregate non-coding use cases into a separate group on separate hardware. Thoughts?