Recommendation for Production Hardware for Inference and Fine-Tuning

Posted by Whyme-__-@reddit | LocalLLaMA

Hi guys, I am trying to get a mini AI rig on which I can run 2-3 20B models fine-tuned on proprietary data, and then ship it to customers as a startup.

There are 3 goals I need to achieve with this machine:

  1. Fine-tuning and RL on the machine.
  2. Inference via vLLM on larger workloads using our front-end software, which is dockerized.
  3. Ease of deployment: I want to load up my software, connect it to the LLMs on the machine and ship it to customers to deploy in their environment. Completely private.
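
A rough sizing sketch for these goals, assuming the only cost counted is the model weights: memory is roughly parameter count times bytes per parameter. The function names, the precision table, and the memory budgets below are illustrative assumptions, not vendor figures or measurements. KV cache, activations, and the gradients and optimizer state needed for fine-tuning all come on top of this.

```python
# Back-of-the-envelope weight memory for serving several models on one box.
# All names and numbers here are illustrative assumptions, not measurements.

# Approximate storage cost per parameter at common precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gib(params_billion: float, dtype: str) -> float:
    """GiB needed just to hold the weights (no KV cache, no activations)."""
    return params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 2**30


def fits(n_models: int, params_billion: float, dtype: str, budget_gib: float) -> bool:
    """True if the weights of n_models identical models fit in budget_gib."""
    return n_models * weight_gib(params_billion, dtype) <= budget_gib


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        per_model = weight_gib(20, dtype)
        print(f"{dtype}: {per_model:.1f} GiB per 20B model, "
              f"{3 * per_model:.1f} GiB for three")
```

Calling `fits(3, 20, "int8", budget_gib)` with the usable memory of whichever machine is being considered gives a quick first filter; whatever headroom remains has to cover KV cache for vLLM and any fine-tuning overhead.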

My options are:

  1. DGX Spark
  2. GMKtec AI Mini PC (Ryzen AI Max+)
  3. Anything else you recommend, but I don’t want to build a tower PC and mess around with the form factor.

What challenges might I run into with options 1 and 2 in accomplishing my goals?

Any help with this would be greatly appreciated. Thank you.