Qwen 3.6 35b a3b Q4 vs qwen 3.6 27b q6, on m5 pro 64gb

Posted by skyyyy007@reddit | LocalLLaMA | View on Reddit | 28 comments

Tried to test the two versions of models in my own m5 pro 64, curated the results on claude, not an expert so settings/config might not be the best. do share what results or improvements that can be attempted. test prompts were generated in claude for testing purposes.

Qwen3.6 35B A3B vs 27B UD — M5 Pro 64GB benchmark

Hardware: MacBook Pro M5 Pro 18-core · 64GB unified memory · LM Studio · MLX runtime · thinking OFF (/no_think) · 128K context

Specs

35B A3B MLX 4bit 27B UD MLX 6bit
Model size \~21.7GB \~30.5GB
Architecture MoE — 3B active/token Dense — 27B active/token
RAM at 128K ctx \~27GB \~38GB

Speed

Test 35B A3B 27B UD
800 token test \~72 tok/s · 11s \~9 tok/s · 32s
1200 token test \~70 tok/s · 16s \~9 tok/s · 70s
Advantage 8x faster baseline

Intelligence — 4-task coding benchmark

Task 35B A3B 27B UD
Auth hook (useRequireAuth) 9.5/10 — typed, mounted cleanup 8/10 — used any, no cleanup
Conflict resolution (500ms rules) 10/10 10/10
Delete account (ordered ops) 10/10 10/10
Bug identification (syncBatch) 10/10 — found 3 bugs + improvements 7/10 — found 1 bug
Overall 9.8/10 8.75/10

Test prompt: 4 coding tasks · max_tokens 1200 · temp 0.6 · /no_think system prompt

Verdict: 35B A3B wins on both speed and quality for coding tasks on 64GB Apple Silicon. 27B is slower (8x) and didn't demonstrate the reasoning depth advantage expected from a dense model on these tasks.

wanted to have some number/references when i was looking for mac to get, hopefully this helps someone out there.