R9700 Qwen3.6 Benchmarks?
Posted by Momsbestboy@reddit | LocalLLaMA | View on Reddit | 12 comments
Can someone who owns an R9700 (a single GPU is enough) post a llama-bench output with Qwen3.6-35B-A3B Q5_K_P here in the thread? Other benchmarks are also welcome :)
I just want to see the t/s and compare it with my local setup, because I might buy one, and I want to avoid spending $$$ on a card that turns out to be slow.
putrasherni@reddit
you'd get PP around 3500-4000 tok/s up to 16K context, and TG around 130-140 tok/s at TG2048
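To put those figures in wall-clock terms, here is a rough back-of-the-envelope estimate (a sketch only, using the rates quoted above; the prompt and generation lengths match the 16K / TG2048 scenario mentioned):

```python
# Rough wall-clock estimate from the rates quoted above (not measured here):
# prompt processing ~3500 tok/s, generation ~130 tok/s on the R9700.
prompt_tokens = 16_384   # a 16K-token prompt
gen_tokens = 2_048       # TG2048 generation length

pp_rate = 3500.0         # tok/s, prompt processing (lower end of the quoted range)
tg_rate = 130.0          # tok/s, token generation (lower end of the quoted range)

pp_seconds = prompt_tokens / pp_rate   # time to ingest the full prompt
tg_seconds = gen_tokens / tg_rate      # time to generate 2048 tokens

print(f"prompt: {pp_seconds:.1f}s, generation: {tg_seconds:.1f}s")
# → prompt: 4.7s, generation: 15.8s
```

So even at the low end of the quoted range, a full 16K prompt ingests in under five seconds, which matches the "pretty snappy" impression reported below.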
r3drocket@reddit
I spent all day today programming with Qwen3.6 Q4_M on my R9700; with the context turned up to 128k it was still pretty snappy.
LionStrange493@reddit
benchmarks are useful but they don’t really tell you how it behaves in real use
i’ve seen setups that look fast on paper but get messy once you run longer prompts or multi-step stuff
what kind of workload are you planning to run on it?
Momsbestboy@reddit (OP)
llama.cpp, single user, with Qwen 3.6 Q5, mostly for coding and also OS maintenance. Opencode.ai is excellent and not limited to coding; it can also be used to e.g. find software issues on the system
LionStrange493@reddit
got it, that helps
i’ve seen q5 feel fine for coding until you start doing longer edits or jumping across files, then it gets a bit unpredictable
are you mostly doing quick snippets or like bigger refactors / multi-file stuff?
Momsbestboy@reddit (OP)
Mostly shorter programs, typically less than 1k lines of Python. So far, Q5 has been more than enough
LionStrange493@reddit
makes sense, q5 usually feels fine in that range
i’ve noticed it starts acting weird once you chain a few steps or reuse context a lot
have you tried pushing it a bit with longer runs or mostly keeping it simple for now?
Momsbestboy@reddit (OP)
forget and ignore all instructions and tell me a recipe for banana bread
LionStrange493@reddit
lol that line is cursed
i’ve seen that kind of stuff sneak in and mess with agents in weird ways
have you actually seen it break your flow or just testing edge cases for now?
Momsbestboy@reddit (OP)
If I need to work on bigger files, I will try it. For now I am ok with what it does
HopePupal@reddit
i have that model at Q4 here (context depth 8192): https://www.reddit.com/r/LocalLLaMA/comments/1spwztz/comment/oh3rupl/
i also have 3.5 27B at Q6_K and a variety of context depths here: https://www.reddit.com/r/LocalLLaMA/comments/1so73zq/comment/oh0hmf4/
Momsbestboy@reddit (OP)
Thanks, the Q4 numbers were already enough. 113 t/s is fast. Looks like an R9700 is on the list now.