Difference between Qwen 3.6 27B quants for vLLM

Posted by Blues520@reddit | LocalLLaMA

Hi guys, I am trying to understand the difference between these quants for running on dual 3090s.

First there is the official FP8: https://huggingface.co/Qwen/Qwen3.6-27B-FP8

Then I see this 6-bit AWQ: https://huggingface.co/QuantTrio/Qwen3.6-27B-AWQ-6Bit

And I see cyankiwi also has a quant up: https://huggingface.co/cyankiwi/Qwen3.6-27B-AWQ-BF16-INT4

They are all similar sizes, so I'm unsure which to select. What is BF16-INT4, and will it run faster on Ampere but be less accurate than FP8?
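
For a rough sanity check on sizes, here is a back-of-envelope sketch of the ideal weight footprint per bit width. It assumes roughly 27e9 parameters (taken from the "27B" name) and 24 GB per 3090, and it ignores quantization scales, any layers left in higher precision, and the KV cache. Actual repo sizes can therefore differ from these numbers. The `weight_gb` helper is hypothetical and only for illustration.

```python
# Back-of-envelope weight footprint; all inputs are assumptions, not measured repo sizes.
PARAMS = 27e9        # assumed parameter count implied by the "27B" name
VRAM_GB = 2 * 24     # two RTX 3090s at 24 GB each


def weight_gb(bits_per_weight: float, params: float = PARAMS) -> float:
    """Ideal weight size in GB (1 GB = 1e9 bytes) if every weight used this many bits."""
    return params * bits_per_weight / 8 / 1e9


# Compare the nominal bit widths of the three quants mentioned above.
for name, bits in [("FP8", 8), ("6-bit AWQ", 6), ("INT4", 4)]:
    size = weight_gb(bits)
    headroom = VRAM_GB - size  # what would remain for KV cache and runtime overhead
    print(f"{name:10s} ~{size:.2f} GB weights, ~{headroom:.2f} GB headroom")
```

If the real downloads are much closer in size than this suggests, that would hint that some quants keep part of the model in a higher precision such as BF16.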