Is using Q8 a waste of resources?

Posted by Spiderboyz1@reddit | LocalLLaMA

I can run G4 31B Q8 XL with 75k context and Qwen's 27B and 35B Q8 XL with 145k context, but I'm wondering if I'm wasting GB of SSD and VRAM.

Is it worth switching to Q6 K to save disk space and gain a little more T/s and more context? Or does intelligence deteriorate significantly, as measured by KL divergence ("KLD")?
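For a rough sense of scale, here is a minimal sketch of the two numbers in question: the approximate disk/VRAM saving and what a "KLD" figure measures. It assumes the plain llama.cpp block formats Q8_0 (8.5 bits per weight) and Q6_K (6.5625 bits per weight) applied to every tensor. The "XL" variants probably mix precisions per tensor, so real file sizes will differ. The 31e9 parameter count and the example probabilities are made up for illustration.

```python
import math

# Nominal bits per weight for two llama.cpp block formats.
# Q8_0: 32 int8 weights + one fp16 scale = 34 bytes per 32 weights.
# Q6_K: 210 bytes per 256-weight super-block.
BITS_PER_WEIGHT = {"Q8_0": 8.5, "Q6_K": 6.5625}


def approx_size_gb(n_params: float, quant: str) -> float:
    """Rough size in GB (10^9 bytes) if every tensor used `quant`."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


def kl_divergence(p: list[float], q: list[float]) -> float:
    """KL(p || q) in nats for two distributions over the same tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    n = 31e9  # hypothetical parameter count for a "31B" model
    q8 = approx_size_gb(n, "Q8_0")
    q6 = approx_size_gb(n, "Q6_K")
    print(f"Q8_0 ~{q8:.1f} GB, Q6_K ~{q6:.1f} GB, saved ~{q8 - q6:.1f} GB")

    # Made-up next-token probabilities, for illustration only.
    reference = [0.70, 0.20, 0.10]  # e.g. the higher-precision model
    quantized = [0.65, 0.23, 0.12]  # e.g. the lower-bit quant
    print(f"KLD = {kl_divergence(reference, quantized):.4f} nats")
```

A KLD close to zero means the quant's next-token distribution stays close to the reference model's. Reported KLD numbers are usually an average of this quantity over many token positions in a test text.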

Is vision affected by using Q6?

Is Q6 K XL much better than the normal Q6 K?