Is Mistral-3.5-Medium-128B broken in Llama CPP?

Posted by EmPips@reddit | LocalLLaMA | View on Reddit | 7 comments

Trying some of Bartowski's Q4 quants, using Vulkan with the latest main branch as of a few hours ago.

The model is coherent - but incredibly weak. I've tried a few sampling settings and toggled reasoning on and off. It lacks the knowledge depth that Magistral Small handled decently, and its code-task outputs fail to run, let alone get anywhere that would register on SWE-Bench.

Wondering if anyone has put more time in, tried vLLM, or tried other quants of this model and had a better experience?