Hey, has anyone here used Qwen3.5-27B-NVFP4-GGUF with llama.cpp yet?

Posted by mossy_troll_84@reddit | LocalLLaMA | View on Reddit | 26 comments

Hey!

I was wondering if any of you have used Qwen3.5-27B-NVFP4-GGUF on an RTX 5090 with llama.cpp? I downloaded and tested Freenixi/AxionML-Qwen3.5-27B-NVFP4-GGUF today and it's quite impressive (quality of answers, and definitely better in non-English languages). Also, what was your speed on llama.cpp? Just asking out of curiosity. Please share your experience. Thanks!
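If you want to compare numbers, here's a minimal sketch of how you could measure throughput with the `llama-bench` tool that ships with llama.cpp — the model filename below is just a placeholder, so point it at wherever you saved your GGUF:

```shell
# llama-bench reports prompt-processing and token-generation speed in tokens/s.
# -ngl 99 offloads all layers to the GPU; -p/-n set prompt and generation lengths.
# The model path is an assumption -- substitute your actual file.
./llama-bench -m ./AxionML-Qwen3.5-27B-NVFP4.gguf -ngl 99 -p 512 -n 128
```

That way everyone's numbers are from the same benchmark settings rather than ad-hoc chat sessions.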