Agentic coding with Qwen 3.6: Q6_K at 125k context vs Q5_K_XL at 200k context

Posted by ComfyUser48@reddit | LocalLLaMA

What would you choose if you were in my shoes? How viable is 125k context for agentic coding, really? Is "compact" really good enough, or would you go with Q6_K 125k?

I am getting around 165-170 tok/sec with either config on my 5090.