Ollama 0.19 with MLX is the real deal
Posted by PracticlySpeaking@reddit | LocalLLaMA | 3 comments
So it only runs their special Qwen3.5-35b-a3b-NVFP4 model. But it rips: on a *32GB* Mac Studio with a *binned M1 Max* (24 GPU cores), it returns ~64 tk/sec for moderate-sized prompts.
> Ollama is now powered by MLX on Apple Silicon in preview · Ollama Blog
> [https://ollama.com/blog/mlx](https://ollama.com/blog/mlx)
That was while also running hermes-agent, a bunch of Chrome and Safari tabs, terminal, Activity Monitor and some other editors and utilities.
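For anyone who wants to check a tk/sec figure like this on their own machine, here is a rough sketch. It assumes you already have the JSON from a non-streaming call to Ollama's `/api/generate` endpoint, which reports `eval_count` (tokens generated) and `eval_duration` (nanoseconds spent generating). The helper name `tokens_per_second` and the sample numbers are made up for illustration; they are not measurements from this post.

```python
def tokens_per_second(response: dict) -> float:
    """Generation throughput from a non-streaming Ollama /api/generate response."""
    eval_count = response["eval_count"]            # tokens generated
    eval_duration_ns = response["eval_duration"]   # generation time in nanoseconds
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    # Convert nanoseconds to seconds, then divide tokens by seconds.
    return eval_count / (eval_duration_ns / 1e9)


# Illustrative values only (not real benchmark data): 640 tokens in 10 seconds.
sample = {"eval_count": 640, "eval_duration": 10_000_000_000}
print(f"{tokens_per_second(sample):.1f} tk/sec")
```

This only counts generation time. Prompt processing is reported separately in `prompt_eval_count` and `prompt_eval_duration`, so a number computed this way will not reflect time to first token.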
3 Comments
chibop1@reddit
PracticlySpeaking@reddit (OP)
AurumDaemonHD@reddit