Ollama 0.19 with MLX is the real deal

Posted by PracticlySpeaking | LocalLLaMA

So it only runs their special Qwen3.5-35b-a3b-NVFP4 model. But it rips: on a *32GB* Mac Studio with a *binned M1 Max* (24 GPU cores), it returns ~64 tk/sec for moderately sized prompts.

> Ollama is now powered by MLX on Apple Silicon in preview · Ollama Blog - [https://ollama.com/blog/mlx](https://ollama.com/blog/mlx)

That speed was with hermes-agent, a bunch of Chrome and Safari tabs, Terminal, Activity Monitor, and some other editors and utilities also running.
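
For anyone who wants to check a tk/sec number like this on their own machine, here is a rough sketch (not necessarily how the figure above was measured). It assumes you already have the final JSON response from Ollama's `/api/generate` or `/api/chat` endpoint, which reports `eval_count` (tokens generated) and `eval_duration` (in nanoseconds). The helper name `tokens_per_second` and the sample values are made up for illustration.

```python
def tokens_per_second(response: dict) -> float:
    """Generation speed from the final Ollama API response (hypothetical helper)."""
    eval_count = response["eval_count"]           # number of tokens generated
    eval_duration_ns = response["eval_duration"]  # generation time in nanoseconds
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    return eval_count / (eval_duration_ns / 1e9)


# Illustrative values only, not real benchmark output.
sample = {"eval_count": 640, "eval_duration": 10_000_000_000}
print(f"{tokens_per_second(sample):.1f} tk/sec")
```

This covers generation speed only; prompt processing is reported separately in `prompt_eval_count` and `prompt_eval_duration`. Running `ollama run <model> --verbose` should print similar timing stats without any code.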