Bad output quality from qwen3.6-27b with hipfire on Strix Halo

Posted by sterby92@reddit | LocalLLaMA

Hi, I'm running the default qwen3.6-27b with dflash on the latest hipfire on Strix Halo (ROCm 7.2). It works and gives decently fast performance (I guess), but the output quality is really subpar. It barely manages a tool call in Open WebUI and even gets today's date wrong, although today's date is in the system prompt. I'm not sure whether I'm doing something wrong, or whether this is expected and we just have to wait for better support and better quants.

  run 1/5 pp 102 tok/s | TTFT 196 ms | decode 34.9 tok/s (128 tok)
  run 2/5 pp 102 tok/s | TTFT 196 ms | decode 34.9 tok/s (128 tok)
  run 3/5 pp 103 tok/s | TTFT 194 ms | decode 34.7 tok/s (128 tok)
  run 4/5 pp 103 tok/s | TTFT 195 ms | decode 34.7 tok/s (128 tok)
  run 5/5 pp 102 tok/s | TTFT 196 ms | decode 34.9 tok/s (128 tok)

  Prefill    tok/s      mean      min      max    stdev     ms
  ────────────────────────────────────────────────────────────────
  pp128               165.2    164.9    165.4      0.2   775.0
  pp512               270.9    270.5    271.2      0.2   1890.3

                       mean      min      max    stdev
  ──────────────────────────────────────────────────────────
  Prefill  tok/s      102.3    101.8    102.9      0.4   (user prompt, 20 tok)
  TTFT     ms         195.5    194.4    196.4      0.7
  Decode   tok/s       34.8     34.7     34.9      0.1
  Wall     tok/s       33.1     33.0     33.1      0.0

  Decode ms/tok: 28.72
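
For anyone cross-checking the numbers above: the reported figures look internally consistent, so the benchmark itself seems fine and the problem is about output quality, not speed. A minimal sketch of the arithmetic in Python follows. The helper names are my own and are not part of hipfire, and the inputs are the rounded means from the output above, so the results only match the reported values to within rounding.

```python
def ms_per_token(tok_per_s: float) -> float:
    """Convert a throughput in tokens/second into milliseconds per token."""
    return 1000.0 / tok_per_s


def elapsed_ms(tokens: int, tok_per_s: float) -> float:
    """Time in milliseconds to process `tokens` at the given throughput."""
    return tokens / tok_per_s * 1000.0


# Rounded mean values copied from the benchmark output above.
decode_ms = ms_per_token(34.8)       # compare with "Decode ms/tok: 28.72"
ttft_ms = elapsed_ms(20, 102.3)      # compare with "TTFT ms 195.5" (20-token prompt)
pp128_ms = elapsed_ms(128, 165.2)    # compare with the pp128 "ms" column, 775.0
pp512_ms = elapsed_ms(512, 270.9)    # compare with the pp512 "ms" column, 1890.3

for name, value in [
    ("decode ms/tok", decode_ms),
    ("TTFT ms", ttft_ms),
    ("pp128 ms", pp128_ms),
    ("pp512 ms", pp512_ms),
]:
    print(f"{name}: {value:.1f}")
```

The low prefill rate on the 20-token user prompt (about 102 tok/s, versus 271 tok/s at pp512) is what you would expect if a short prompt cannot amortize fixed per-request overhead, though that is an assumption on my part and not something the output states.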