Low performance on 7900 XTX with Qwen3.6 35B A3B

Posted by soyalemujica@reddit | LocalLLaMA

When I first set up my PC, I got 92 t/s with Qwen3.6 35B A3B. Now, for some reason, it won't get past 30 t/s no matter what settings I use, with either ROCm or Vulkan.

.\llama-server.exe --model ../models/Qwen3.6-35B-A3B-UD-Q5_K_M.gguf -ctv q8_0 -ctk q8_0 --temp 0.6 --top-p 0.95 --top-k 20 --min-p 0.0 --presence-penalty 0.0 --repeat-penalty 1.0

GPU usage is 100% and power draw is around 250 W.

Using Qwen3.5 27B Q4_K_M:

.\llama-server.exe --model ../models/Qwen3.5-27B.Q4_K_M.gguf -ctv q8_0 -ctk q8_0 --temp 0.6 --top-p 0.95 --top-k 20 --min-p 0.0 --presence-penalty 0.0 --repeat-penalty 1.0 -fa on -fit on -c 100000

With that one I can't get above 29 t/s, which sounds reasonable, I guess.