3060 12gb
Posted by Mother_Desk6385@reddit | LocalLLaMA | View on Reddit | 7 comments
What's the best open model and quant that runs on a 3060 12GB for agentic coding, and which closed model is it equivalent to in speed and time?
Comfortable_Ebb7015@reddit
Qwen 3.6 35b a3b IQ4_NL
sampdoria_supporter@reddit
I gotta ask - do you actually run this? What's it like?
sine120@reddit
If you have enough system RAM, 3.6-35B is pretty much your only decent option.
Total-Interview8697@reddit
Qwen3.5-9B at Q5, all on GPU. Use this for: fast, reliable agent loops where tool calls actually fire. Drawback: it's a smaller model, so it knows less and makes more mistakes on harder tasks.
Qwen3-Coder-30B-A3B at Q4 with expert offload (32GB+ RAM; 16GB may work but will be laggy). Use it for: the best reasoning you can squeeze out of the card. Drawback: it drags with big context, so real agentic loops feel slow. (Qwen3.6-35B-A3B is newer and a bit smarter, but it's bigger, so it needs even more offload and has slower prefill on 12GB.)
So: the small one for speed, the 30B for quality.
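For anyone wondering why the 9B fits entirely on the card while the 30B needs offload, here is a rough back-of-the-envelope sketch. The bits-per-weight figures (about 5.5 for a Q5-class quant, about 4.5 for a Q4-class quant) and the 2 GB reserve for KV cache and runtime overhead are assumptions for illustration, not measured values; real GGUF files vary by quant variant.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_on_gpu(params_billion: float, bits_per_weight: float,
                vram_gb: float = 12.0, reserve_gb: float = 2.0) -> bool:
    """True if the weights fit in VRAM after reserving room for KV cache etc.

    reserve_gb is an assumed placeholder; long contexts need more.
    """
    return weight_gb(params_billion, bits_per_weight) <= vram_gb - reserve_gb


if __name__ == "__main__":
    # Assumed bits per weight: ~5.5 for Q5-class, ~4.5 for Q4-class quants.
    for name, params, bpw in [("9B @ Q5", 9, 5.5),
                              ("30B @ Q4", 30, 4.5),
                              ("35B @ Q4", 35, 4.5)]:
        size = weight_gb(params, bpw)
        print(f"{name}: ~{size:.1f} GB, fits on 12 GB card: {fits_on_gpu(params, bpw)}")
```

Under these assumptions the 9B comes in around 6 GB, while the 30B and 35B land well above 12 GB, which is why they rely on pushing expert weights out to system RAM.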
suesing@reddit
Coding models are way overhyped. Narrow models are interesting af and there are so many options to try.
cibernox@reddit
Possibly some MoE model that doesn’t fit in the GPU and spills to system RAM, but since it’s a MoE you still get 40ish tokens/s. Qwen3.6 35B is all the rage in that space, but IMO it’s not good enough for coding beyond simple tasks.
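A rough way to see why a MoE stays usable even when it spills to system RAM: decoding is usually limited by memory bandwidth, so an upper bound on tokens/s is bandwidth divided by the bytes read per token, and a MoE only reads its active parameters. The sketch below assumes about 3B active parameters (the "A3B" in the model name), about 4.5 bits per weight, and a hypothetical 60 GB/s of system RAM bandwidth. It ignores prefill, KV cache reads, and the split between GPU and CPU, so treat it as a ceiling estimate rather than a benchmark.

```python
def decode_tokens_per_s(active_params_billion: float, bits_per_weight: float,
                        bandwidth_gb_s: float) -> float:
    """Bandwidth-bound upper estimate of decode speed in tokens per second."""
    gb_read_per_token = active_params_billion * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


if __name__ == "__main__":
    bandwidth = 60.0  # GB/s, hypothetical system RAM bandwidth
    moe = decode_tokens_per_s(3, 4.5, bandwidth)     # ~3B active params (assumed)
    dense = decode_tokens_per_s(35, 4.5, bandwidth)  # dense model of the same total size
    print(f"MoE, 3B active: ~{moe:.0f} tok/s; dense 35B: ~{dense:.0f} tok/s")
```

With those assumed numbers the MoE lands in the tens of tokens per second while a dense model of the same total size would manage only a few.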
310dweller@reddit
Following!