3060 12gb

Posted by Mother_Desk6385@reddit | LocalLLaMA | 7 comments

What's the best open model quant that runs on a 3060 12GB for agentic coding, and which closed model is it equivalent to in speed and response time?

sampdoria_supporter@reddit

I gotta ask - do you actually run this? What's it like?

sine120@reddit

If you have enough system RAM, Qwen3.6-35B is pretty much your only decent option.

Total-Interview8697@reddit

Qwen3.5-9B at Q5, all on GPU. Use this for: fast, reliable agent loops where tool calls actually fire. Drawback: it's a smaller model, so it knows less and makes more mistakes on harder tasks.

Qwen3-Coder-30B-A3B at Q4 with expert offload (32GB+ RAM; 16GB may work but will be laggy). Use it for: the best reasoning you can squeeze out of the card. Drawback: it drags with big context, so real agentic loops feel slow. (Qwen3.6-35B-A3B is newer and a bit smarter, but it's bigger, so it needs even more offload and has slower prefill on 12GB.)

So: the small one for speed, and the 30B for quality.
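
For anyone wondering why the 9B fits entirely on the card while the 30B needs offload, here's a rough back-of-the-envelope sketch. The bits-per-weight numbers (about 5.5 for Q5, about 4.5 for Q4) and the 2 GiB reserve for KV cache and overhead are assumptions for illustration, not measured values, and `weight_gib` / `fits_on_gpu` are just hypothetical helper names.

```python
# Rough weight-size estimate: parameters * bits-per-weight / 8 bytes.
# All bits-per-weight values below are ASSUMED approximations, not exact
# figures for any specific quant file.
GIB = 1024 ** 3


def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / GIB


def fits_on_gpu(params_billion: float, bits_per_weight: float,
                vram_gib: float = 12.0, reserve_gib: float = 2.0) -> bool:
    """True if the weights fit in VRAM after holding back reserve_gib.

    reserve_gib is a guessed allowance for KV cache, activations, and
    runtime overhead; it grows with context length in practice.
    """
    return weight_gib(params_billion, bits_per_weight) <= vram_gib - reserve_gib


if __name__ == "__main__":
    candidates = [
        ("9B dense at ~Q5", 9, 5.5),   # assumed ~5.5 bits per weight
        ("30B MoE at ~Q4", 30, 4.5),   # assumed ~4.5 bits per weight
    ]
    for name, params, bpw in candidates:
        size = weight_gib(params, bpw)
        verdict = "fits on GPU" if fits_on_gpu(params, bpw) else "needs offload to system RAM"
        print(f"{name}: ~{size:.1f} GiB of weights, {verdict}")
```

Under these assumptions the 9B leaves several GiB free for context, while the 30B's weights alone are larger than the card, which is why the expert layers end up in system RAM.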
