Using my Mac Mini M4 as an LLM server—Looking for recommendations

Posted by cockpit_dandruff@reddit | LocalLLaMA | View on Reddit | 7 comments

I’m looking to set up my Mac Mini M4 (24 GB RAM) as an LLM server. It’s my main desktop, but I want to also use it to run language models locally. I’ve been playing around with the OpenAI API, and ideally I want something that:

• Uses the OpenAI API endpoint (so it’s compatible with existing OpenAI API calls and can act as a drop-in replacement)

• Supports API key authentication. Everything will run on my local network, but I still want API keys so my projects are written the same way they would be against the real OpenAI API.

• Is easy to use or has excellent documentation.

• Can start at boot, so the service is always accessible.
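For context, here's a sketch of the kind of call I'd want to work unchanged against whatever server I end up with. The host, port, key, and model name are all placeholders; I'm only assuming the server exposes the standard OpenAI-style `/v1/chat/completions` route with `Bearer` auth:

```python
import json
import urllib.request

# Placeholder values -- any OpenAI-compatible server should accept this shape.
BASE_URL = "http://192.168.1.50:8080/v1"  # Mac Mini on the LAN (example address)
API_KEY = "sk-local-example"              # key issued by the local server

payload = {
    "model": "llama-3.1-8b-instruct",     # whatever model the server exposes
    "messages": [{"role": "user", "content": "Hello from my LAN!"}],
}

request = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",  # standard OpenAI-style auth header
    },
)
# urllib.request.urlopen(request) would send it once the server is running.
```

If that snippet works by just swapping `BASE_URL`, the server counts as a drop-in replacement for my purposes.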

I have been looking into LocalAI, but the documentation is poor and I simply couldn't get it to run.
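For the start-at-boot requirement, I assume whatever I pick can be wrapped in a launchd job, roughly like this (the label, binary path, and flags below are placeholders, not any specific server's CLI):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <!-- Placeholder label; convention is reverse-DNS -->
    <key>Label</key>
    <string>com.example.llm-server</string>
    <!-- Placeholder binary and arguments -->
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/bin/llm-server</string>
        <string>--port</string>
        <string>8080</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
</dict>
</plist>
```

As I understand it, a plist under `~/Library/LaunchAgents` starts at login, while a LaunchDaemon under `/Library/LaunchDaemons` starts at boot proper, so I'd probably want the latter. Corrections welcome if a given server has a cleaner built-in way to do this.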

I’d appreciate any pointers, recommendations, or examples of setups people are using on macOS for this.

Thanks in advance!