Best config for Qwen3.6?
Posted by CatSweaty4883@reddit | LocalLLaMA | View on Reddit | 21 comments
With all the high praise for the model all around, I also want to try it on my own. I have an rtx3060 12gb vram and 16gb system ram. How may I load the 27b model in my system? Or is it even possible? Tasks I want to do are: coding, some visual reasoning and agentic tasks.
Sharp_Classroom9686@reddit
just go with 35b MOE 32K Context , Q4K, and use a good Agentic Tool like Forge. Dont use OpenCode. maybe you can get 25/30tks
redblood252@reddit
How is forge better than opencode? I’m not arguing I’m wondering. I have little knowledge on agentic tools
Sharp_Classroom9686@reddit
In OpenCode, a single task typically consumes at least 25k tokens of context when using prompt-based workflows. The same tends to happen with ClaudeCode.
With Forge, however, you can achieve similar results while using only around 5–7k tokens of context.
If you’re running a local model on limited hardware (e.g., 8GB or 16GB), this difference in how context is handled becomes a game changer.
redblood252@reddit
I do have qwen 27b iq3 on 16gb vram. So it sounds good. I use superpowers and subagents with opencode. Is there something equivalent you would recommend?
Sharp_Classroom9686@reddit
Forge has native Claude Code plugin support — drop the plugin in .forge/plugins/ or symlink it from \~/.claude/plugins/ and it shows up under /plugins. Honest caveat: only gstack has been tested end-to-end so far, but I’ll try superpowers today and report back.
Subagents are first-class. Built-in registry (explorer, reviewer, tester, debug, summarizer, refactorer, docs, commit, builder) plus whatever your plugins ship. spawn_subagents fans out in parallel — goroutines + semaphore, configurable concurrency. Explore mode is built around it for read-only analysis.
redblood252@reddit
I used opencode for coding. If you say forge has natively reviewer/tester/debug/refactorer/docs that's most of what I was. Will I need to do anything specific to have these work? Or are these built in plugins already using well curated prompts?
Sharp_Classroom9686@reddit
Just use /agent name prompt -- give it a try. I'm hungry for feedback
redblood252@reddit
Getting the same error here: https://github.com/tailcallhq/forgecode/pull/3255
But with llama.cpp
Sharp_Classroom9686@reddit
mb. https://github.com/defexnicolas/forge
redblood252@reddit
Doesn't work properly using llama.cpp I get this error: Assistant message must contain either 'content' or 'tool_calls'.
If there is already something called forge why did you call your project forge as well? I've seen at least 3 different projects called 'forge'
Sharp_Classroom9686@reddit
what do you want to test i can do the run for you i has qwen3.6 27b
redblood252@reddit
It’s coding related tasks. Off the top of my head: - full documentation of a component (overview/setup/configuration/architecture/troubleshooting) - review code for correctness - refactor to get rid of dead code/useless code - simplify the codebase - curate tests (LLMs tend to make hundreds of useless tests but forget logic related tests which are more critical)
AvidCyclist250@reddit
Similar with Hermes. 20k
Sharp_Classroom9686@reddit
Not at all. Hermes is too big , but with forge you get a basic claw , for basic stuff.
Mordimer86@reddit
I'd go with 35B MoE as well, something like this:
This one takes around 10GB in VRAM for me.
Jester14@reddit
Just use
-fitps5cfw@reddit
You don't.
Your best best is the 35b MoE, which can run at acceptable speeds at q4, but not 27b, no.
CatSweaty4883@reddit (OP)
Is 3.5 9B the best I get from qwen family of models? :((
Jester14@reddit
Someone was kind enough to respond to your half-assed post and literally gave you the "best" recommendation, and you come back and ask if something else is "best"? Are you retarded?
ps5cfw@reddit
I would pick 3.6 35b any day of the week over 9b
mr_Owner@reddit
https://www.reddit.com/r/LocalLLaMA/s/OpmIz5X9Mt