Have you guys seen this? Beats TurboQuant by 18%

Posted by OmarBessa@reddit | LocalLLaMA

https://github.com/Dynamis-Labs/spectralquant

Basically, they discard 97% of the KV cache key vectors after figuring out which ones carry the most signal.
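
To make the idea concrete, here is a toy sketch of "score the key vectors, keep the top 3%, drop the rest". This is not the repo's code. The scoring function (`signal_score`, a plain squared L2 norm), the helper `prune_keys`, and the `keep_fraction` parameter are all made up for illustration, and how SpectralQuant actually measures signal may be quite different, so check the repository for the real method.

```python
import random


def signal_score(vec):
    # Hypothetical stand-in for "signal": squared L2 norm of the key vector.
    # The real project may use a completely different criterion.
    return sum(x * x for x in vec)


def prune_keys(keys, keep_fraction=0.03):
    """Return the sorted indices of the highest-scoring key vectors.

    keep_fraction=0.03 mirrors the "discard 97%" figure from the post.
    """
    n_keep = max(1, round(len(keys) * keep_fraction))
    # Rank all positions by score, highest first.
    ranked = sorted(range(len(keys)),
                    key=lambda i: signal_score(keys[i]),
                    reverse=True)
    # Restore original token order for the survivors.
    return sorted(ranked[:n_keep])


# Toy "KV cache": 200 random key vectors of dimension 8.
random.seed(0)
keys = [[random.gauss(0, 1) for _ in range(8)] for _ in range(200)]

kept = prune_keys(keys)
print(len(kept), "of", len(keys), "key vectors kept")
```

In a real cache you would also have to decide what happens to the value vectors at the dropped positions, which this sketch ignores.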