Is a heavily quantised Qwen3 235b any better than Qwen3 32b?

Posted by Secure_Reflection409 | LocalLLaMA

I've come to the conclusion that Qwen's 235b at around Q2K is, perhaps unsurprisingly, not better than Qwen3 32b at Q4KL, but I still wonder about Q3. Gemma2 27b at Q3KS used to be awesome, for example. Perhaps Qwen's 235b at Q3 will be amazing? Amazing enough to warrant 10 t/s?

I'm in the process of pulling together a mishmash of RAM from the cupboard to go from 96GB to 128GB, which should let me test Q3... if it'll POST.
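For anyone wanting to sanity-check the 96GB vs 128GB question, here is a rough back-of-the-envelope sketch. It only estimates weight size as parameters times bits per weight. The bits-per-weight figures and the 8GB headroom for KV cache and the OS are my own ballpark assumptions, not measured file sizes, so real GGUF files may differ noticeably.

```python
# Rough estimate of quantised weight size: params * bits_per_weight / 8.
# All bits-per-weight values below are ASSUMED ballpark figures for
# illustration; check the actual file size of the quant you download.

def estimate_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantised weights in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         ram_gb: float, headroom_gb: float = 8.0) -> bool:
    """True if weights plus an assumed headroom (KV cache, OS) fit in ram_gb."""
    return estimate_weight_gb(params_billion, bits_per_weight) + headroom_gb <= ram_gb


# Hypothetical scenarios: (label, params in billions, assumed bits per weight)
SCENARIOS = [
    ("235b @ ~Q2", 235, 2.6),
    ("235b @ ~Q3", 235, 3.5),
    ("32b  @ ~Q4", 32, 4.9),
]

for label, params_b, bpw in SCENARIOS:
    size = estimate_weight_gb(params_b, bpw)
    print(f"{label}: ~{size:.0f} GB weights, "
          f"fits in 96GB: {fits(params_b, bpw, 96)}, "
          f"fits in 128GB: {fits(params_b, bpw, 128)}")
```

Under these assumptions a Q3 of the 235b is too big for 96GB but fits in 128GB, which is consistent with needing the extra RAM. Any VRAM you can offload to would add to the budget.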

Is anyone already running the Q3? Is it better for code / design work than the current 32b GOAT?