Qwen3.5-27B Q4 Quantization Comparison

Posted by TitwitMuffbiscuit@reddit | LocalLLaMA

This is a Q4 quantization sweep across all major community GGUF quants of Qwen3.5-27B (those available as of 03/03/2026), comparing mean KLD against the BF16 baseline across different quantizers and recipes.

The goal is to give people a data-driven basis for picking a file rather than just grabbing whatever is available.

KLD (KL Divergence): "Faithfulness." It shows how much the quantized model's probability distribution drifts from a baseline (the probability distribution produced by the original weights). Lower = closer.

# KLD Results: Custom Chat Dataset

Evaluated on `titwitMuffbiscuit-v03-full.txt`: chat-wrapped corpus (Qwen3.5 ChatML format), 2502 blocks, 47 chunks at context 4096.

Content: science & engineering, medicine, philosophy, history, finance, culture, multilingual content and code snippets.

[Plot: kld\_plot\_Qwen3.5-27B]

# Wikitext2 + Custom Dataset Comparison

Evaluated on `wikitext2_test.txt`, 72 chunks at context 4096 (plain text).

The dumbbell plot shows both datasets side by side: solid circle = chat corpus (primary), semi-transparent diamond = wikitext2 (secondary).
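For readers who want the KLD metric used in both evaluations spelled out, here is a minimal sketch of a mean per-token KL divergence between a baseline and a quantized model. It is an illustration only, not llama.cpp's implementation: the function names (`softmax`, `kl_divergence`, `mean_kld`) and the toy logits are assumptions made up for this example.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; p is the baseline distribution, q the quantized one."""
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0.0  # terms with p_i = 0 contribute nothing
    )

def mean_kld(baseline_logits, quant_logits):
    """Average the per-token KLD over all token positions."""
    klds = [
        kl_divergence(softmax(b), softmax(q))
        for b, q in zip(baseline_logits, quant_logits)
    ]
    return sum(klds) / len(klds)

# Toy data (hypothetical): 2 token positions, vocabulary of 3.
baseline = [[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]]
quantized = [[1.9, 1.1, 0.1], [0.4, 0.6, 2.9]]
print(f"mean KLD = {mean_kld(baseline, quantized):.6f}")
```

A value of 0 means the quantized model reproduces the baseline distribution exactly at every position; the further the distributions drift apart, the larger the mean.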
[Plot: dumbbell\_Qwen3.5-27B]

*lmstudio-community and mradermacher standard Q4\_K\_M are identical files, hence the stacking/blending visible on the dumbbell plot.*

# Sorted by KLD: Custom Dataset

*lmstudio-community Q4\_K\_M excluded: identical file to mradermacher Q4\_K\_M.*

|Rank|Quantization|Size (GiB)|PPL|KLD|
|:-|:-|:-|:-|:-|
|1|unsloth\_Qwen3.5-27B-UD-Q4\_K\_XL|16.411|5.8901|0.005087|
|2|bartowski\_Qwen3.5-27B-Q4\_K\_M|15.952|5.8882|0.005633|
|3|unsloth\_Qwen3.5-27B-Q4\_K\_M|15.591|5.8948|0.006193|
|4|ubergarm\_Qwen3.5-27B-smol-IQ4\_NL|15.415|5.9026|0.006371|
|5|mradermacher\_Qwen3.5-27B.i1-Q4\_K\_M|15.404|5.9059|0.006469|
|6|bartowski\_Qwen3.5-27B-Q4\_K\_S|14.985|5.8984|0.006720|
|7|bartowski\_Qwen3.5-27B-IQ4\_XS|14.130|5.9017|0.007062|
|8|bartowski\_Qwen3.5-27B-IQ4\_NL|14.851|5.9091|0.007233|
|9|unsloth\_Qwen3.5-27B-Q4\_K\_S|14.686|5.9083|0.007449|
|10|unsloth\_Qwen3.5-27B-IQ4\_NL|14.610|5.9147|0.007461|
|11|mradermacher\_Qwen3.5-27B.i1-IQ4\_XS|13.680|5.9129|0.007569|
|12|unsloth\_Qwen3.5-27B-IQ4\_XS|13.949|5.9179|0.007677|
|13|mradermacher\_Qwen3.5-27B.i1-Q4\_K\_S|14.499|5.9209|0.007937|
|14|mradermacher\_Qwen3.5-27B.Q4\_K\_M|15.404|5.9028|0.009201|
|15|mradermacher\_Qwen3.5-27B.IQ4\_XS|13.784|5.9342|0.011463|
|16|steampunque\_Qwen3.5-27B.Q4\_K\_H|14.864|5.9050|0.012091|
|17|mradermacher\_Qwen3.5-27B.Q4\_K\_S|14.499|5.9293|0.012364|

# Most Efficient Quantization: Custom Dataset

Efficiency Score: √(Normalized Size² + Normalized KLD²), lower is better.

|Rank|Quantization|Size (GiB)|KLD|Eff. Score|
|:-|:-|:-|:-|:-|
|1|bartowski\_Qwen3.5-27B-IQ4\_XS|14.130|0.007062|0.317506|
|2|mradermacher\_Qwen3.5-27B.i1-IQ4\_XS|13.680|0.007569|0.341075|
|3|unsloth\_Qwen3.5-27B-IQ4\_XS|13.949|0.007677|0.369294|
|4|unsloth\_Qwen3.5-27B-IQ4\_NL|14.610|0.007461|0.471585|
|5|unsloth\_Qwen3.5-27B-Q4\_K\_S|14.686|0.007449|0.490965|
|6|mradermacher\_Qwen3.5-27B.i1-Q4\_K\_S|14.499|0.007937|0.493275|
|7|bartowski\_Qwen3.5-27B-IQ4\_NL|14.851|0.007233|0.520404|
|8|bartowski\_Qwen3.5-27B-Q4\_K\_S|14.985|0.006720|0.527916|
|9|mradermacher\_Qwen3.5-27B.i1-Q4\_K\_M|15.404|0.006469|0.659219|
|10|ubergarm\_Qwen3.5-27B-smol-IQ4\_NL|15.415|0.006371|0.659346|
|11|unsloth\_Qwen3.5-27B-Q4\_K\_M|15.591|0.006193|0.716059|
|12|bartowski\_Qwen3.5-27B-Q4\_K\_M|15.952|0.005633|0.835306|
|13|mradermacher\_Qwen3.5-27B.Q4\_K\_M|15.404|0.009201|0.847417|
|14|mradermacher\_Qwen3.5-27B.IQ4\_XS|13.784|0.011463|0.877012|
|15|unsloth\_Qwen3.5-27B-UD-Q4\_K\_XL|16.411|0.005087|1.000000|
|16|mradermacher\_Qwen3.5-27B.Q4\_K\_S|14.499|0.012364|1.043999|
|17|steampunque\_Qwen3.5-27B.Q4\_K\_H|14.864|0.012091|1.055620|

**Hardware:** i3-12100F, 64GB DDR4-3200, RTX 3060 12GB

**Evaluation tool:** llama.cpp (mainline) version: 8189 (4d828bd1a)
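The post does not say how size and KLD are normalized for the Efficiency Score. As a sketch, assuming min-max normalization over the 17 ranked files (each column rescaled to the 0 to 1 range), the score is the distance from the ideal point of smallest size and lowest KLD. That assumption appears to reproduce the published values, for instance 1.000000 for the largest file with the lowest KLD. The helper names `min_max` and `efficiency_scores` are made up for this example, and only four rows are included; they were chosen because they contain the size and KLD extremes of the full table, so the normalization matches.

```python
import math

def min_max(values):
    """Rescale values to [0, 1] using the column minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def efficiency_scores(rows):
    """rows: list of (name, size_gib, kld) tuples. Returns {name: score}."""
    norm_sizes = min_max([size for _, size, _ in rows])
    norm_klds = min_max([kld for _, _, kld in rows])
    # Euclidean distance from the ideal point (smallest size, lowest KLD).
    return {
        name: math.hypot(s, k)
        for (name, _, _), s, k in zip(rows, norm_sizes, norm_klds)
    }

# Subset of the table above; it includes the min/max of both columns.
rows = [
    ("unsloth_Qwen3.5-27B-UD-Q4_K_XL", 16.411, 0.005087),
    ("bartowski_Qwen3.5-27B-IQ4_XS", 14.130, 0.007062),
    ("mradermacher_Qwen3.5-27B.i1-IQ4_XS", 13.680, 0.007569),
    ("mradermacher_Qwen3.5-27B.Q4_K_S", 14.499, 0.012364),
]

scores = efficiency_scores(rows)
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name:40s} {score:.6f}")
```

Because both axes are weighted equally, a file that is small but mediocre on KLD can outrank a larger file with the best KLD, which is why the IQ4\_XS quants lead this table while UD-Q4\_K\_XL sits near the bottom.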