Use Qwen3.6 the right way -> send it to the pi coding agent and forget
Posted by Willing-Toe1942@reddit | LocalLLaMA | 79 comments

Just a reminder: the harness you use (your LLM client and interface, basically) can make a huge difference. It's way more important than people think. I've been using pi.dev for over 2 months and oooh boy, Qwen3.6 suddenly became a monster.
My local machine + pi + Exa web search + agent-browser extension: this setup can solve 80% of all my use cases, which for now are:
- coding (python / rust / c++)
- anything requiring maintenance / administration on my machines (mainly Linux)
- web research: Qwen3.6 35B with Exa web search is a monster and can 100% replace Perplexity for me, even giving better results (only sacrificing some time as a side effect)
Complex planning tasks I delegate to kimi2.6; coding itself is handled by Qwen3.6.
At the end of the day: use your Qwen3.6 with the pi coding agent and forget 😃
horribleGuy3115@reddit
What does your GPU setup look like? A 120k context window with my 3090 feels unusable for coding work in Pi.
Protopia@reddit
Even though Pi starts with a very small system prompt context, you still need to manage down your context size with judicious use of MCP servers and context optimisers.
JuniorDeveloper73@reddit
opencode with planner works better
Varmez@reddit
I kept having OpenCode just stop with no explanation, and had worse looping.
In Pi, I had it make an extension out of my process, standards, and requirements docs, and it's working great.
CommonPurpose1969@reddit
Qwen 35B & 27B will loop regardless of the harness. Even with Claude Code.
Kodix@reddit
Extremely rare for 35B to loop in Hermes with a temperature setting of 1.
Willing-Toe1942@reddit (OP)
Never had a single loop with unsloth UD-Q4.
CommonPurpose1969@reddit
Would you please share your settings? Model quantization?
Varmez@reddit
Yeah, I still get it sometimes, typically past ~60% of the context window. Running a bigger window and compacting more often alleviates it a bit.
riceinmybelly@reddit
Hermes does that for me: it's using 35B for light tasks and 27B for planning, and I have Claude Code for Anthropic models and pi for my z.ai and opencode subscriptions.
wasnt_in_the_hot_tub@reddit
I find that opencode is not as context-efficient as pi, at least for my workflow. It might be the LSP integration
Mamaun30@reddit
What's planner?
sagiroth@reddit
What's that?
Pineapple_King@reddit
I agree
grabber4321@reddit
I'm using OpenCode, haven't tried Pi yet.
The problem I have with Qwen3.6: it stops randomly (around 80-90k context), and I have to say "keep going", and then it comes back and keeps doing the task.
Anybody figured out how to solve this?
CommonPurpose1969@reddit
pi.dev is a bit YOLO. It won't ask for any permission, by design. O_o
Cupakov@reddit
yeah, it's intended to be run sandboxed; i usually use bubblewrap with it (rough sketch below)
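something like this, roughly -- the binds are just an example layout i'd expect to work, adjust for your distro and project:

```
# rough bubblewrap wrapper: project dir writable, everything else read-only;
# --share-net kept so the agent can still reach the model API
bwrap \
  --ro-bind /usr /usr \
  --ro-bind /etc /etc \
  --symlink usr/bin /bin \
  --symlink usr/lib /lib \
  --proc /proc --dev /dev --tmpfs /tmp \
  --bind "$HOME/projects/myapp" "$HOME/projects/myapp" \
  --unshare-all --share-net \
  --die-with-parent \
  pi
```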
CommonPurpose1969@reddit
It is not about the havoc it can wreak on the main system, which, of course, is an issue too. One misunderstanding, and it goes on and does its thing, changing the source code, and the user is left with the changes to revert manually or again with the LLM, hoping it reverts it properly.
Karyo_Ten@reddit
Agentic LLMs are trained to ship ~~slop~~ code unfortunately so they do use git commit and git checkout.
It's the harness, or at least the system prompt, that needs to add limits, or a git-guardrail extension that prevents git commit.
But at least Pi lets people tune it to their usage. There are low-value glue or data-extraction tasks where I don't mind giving the LLM free rein.
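A crude version that doesn't even need harness support is a pre-commit hook that refuses everything while the agent owns the repo (hypothetical hook, purely as an illustration):

```
#!/bin/sh
# .git/hooks/pre-commit -- blanket guardrail: refuse every commit
# while an agent session owns the repo; delete the hook to commit again
echo "pre-commit: commits are blocked by the agent guardrail" >&2
exit 1
```

(Needs chmod +x .git/hooks/pre-commit. And note the agent can still run git commit --no-verify, so it's a speed bump, not a sandbox.)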
Ok-Measurement-1575@reddit
I dunno why this isn't the default on everything, tbh.
There isn't even a launch arg on opencode to enable it, which is very short-sighted, IMHO.
Willing-Toe1942@reddit (OP)
Real men accept their fate. You fire the agent and forget :D
DerDave@reddit
Which Qwen3.6?
Willing-Toe1942@reddit (OP)
35B-A3B (the MoE version)
EbbNorth7735@reddit
Wait until you get to try 27B or 122B models
Karyo_Ten@reddit
Qwen3.6-122B? It's out?
No-Upstairs-4031@reddit
I agree; I use the gemma4-26b with a custom-designed Pi harness. It works much more smoothly and is easier to control than OpenClaw, Claude Code, and other harnesses.
MoodDelicious3920@reddit
Which is the best harness: codex, forgecode, opencode, or a simple custom-made harness with basic access to web tools and code execution?
Willing-Toe1942@reddit (OP)
Yep. The difference is huge and it surprised me.
bonobomaster@reddit
For everyone who now wants to try (some) Pi, know that Pi could serve you a slice of sudo rm -rf in a heartbeat!
The standard Pi agent has ZERO command filtering or sandboxing by default!
If Pi decides it's time for cake, then cake will be served!
Okay, enough with the Pi puns already!
Cupakov@reddit
there's an optional extension with minimal guardrails that ships with pi
bonobomaster@reddit
Yeah, I know.
But most likely there will be some people like me, who install first and ask questions later. ;)
Luckily, I caught my "little" oversight pretty early on, as, to my surprise, Pi searched the whole C drive for a specific directory instead of being confined to its working directory, or at least the user directory.
Just putting it out here...
Paradigmind@reddit
What was your mistake? I'm asking because I'd like to avoid it. :D
Steus_au@reddit
little-coder has all the plugins you need for pi
bonobomaster@reddit
I just checked out the repo. That looks quite interesting! Thx!
gladfelter@reddit
Can you point to the specific extension NPM packages that you're using, specifically the exa web search and agent browser extensions? There are many. These are the most popular that match your descriptions:
I've found the quality and ease-of-use of pi extensions varies dramatically, so I'm very interested in hearing exactly what has worked for you, since guessing will most likely result in frustration.
mantafloppy@reddit
This seems to be the version timestamp; the repo is about a month old if you go on GitHub or npmjs.
gladfelter@reddit
yeah, I think you're right. I suppose absolute age is not a sign of notability and authority in this space. Regular maintenance probably is a better signal, so the UI focuses on that. I'm getting too old, I guess.
mantafloppy@reddit
Did you try any of them?
gladfelter@reddit
pi-web-access is solid. pi-agent-browser isn't bad. It installed its dependency on first run, at least.
mantafloppy@reddit
I'll check it out.
Agents being able to find info on the web themselves is so useful.
gladfelter@reddit
I'm at work rn; I will later if I don't hear from OP. I've installed two other pi web search packages, and one of the two sucked so hard: it grabbed a random Gemini model that I didn't have quota for, and pi.dev won't let you prune the available models when you connect a provider, so I eventually had to modify the extension's source code. I don't normally work in TypeScript and npm, so I had a bit of learning to do.
UnWiseSageVibe@reddit
Not related, but I set up a self-hosted Firecrawl for web searches and fetching; it works well.
https://docs.firecrawl.dev/contributing/self-host
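If anyone wants to try it, the flow from the docs is basically clone + docker compose. I'm going from memory, so check the linked page for the current env keys:

```
# roughly the self-host flow; the .env location/keys may differ, see the docs
git clone https://github.com/mendableai/firecrawl
cd firecrawl
cp apps/api/.env.example .env   # configure keys/ports here
docker compose up -d
```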
philmarcracken@reddit
I tried pi and it was OK. I find late to be similar, and I like its out-of-the-box experience, mostly. It needed one tweak to its tool abilities with PowerShell; that was it.
The stage it has between planning and then subagent spawning is fantastic. Snapshot beforehand and off it goes. I'm thinking it might even accept building a Mermaid diagram to save even more context when comprehending larger codebases.
Comfortable-Crew-919@reddit
Qwen3.6 35B with gsd-2 (built on top of pi) has been great for planning and coding. Running on an M4 Pro 64GB via oMLX with the recommended Qwen settings for coding and 128k context.
Naz6uL@reddit
I'm currently using oMLX + opencode, but I'll give this one a try.
Ok-Measurement-1575@reddit
llama-server with its built-in web server and locally hosted MCP = ChatGPT at home.
I have no doubt that, with enough time, I could MCP all the things that make GPT/Claude appear intelligent.
It's just kinda magical watching your various tools fire and getting straight-up SOTA results at home for peanuts.
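The nice part is there's nothing to set up on the UI side: one command and you get a chat page in the browser (the model path here is just a placeholder for whatever you have locally):

```
# API + built-in web UI on the same port; open http://localhost:8080/
llama-server -m ./qwen3.6-35b-a3b-ud-q4.gguf \
  --host 0.0.0.0 --port 8080 \
  --ctx-size 131072 --jinja
```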
Mennas11@reddit
I have been using Aider with this model (Qwen3.6 35B-A3B Q4). It's been pretty good, but mostly just doing refactoring and some small functions. I only have a Mac Pro M2 with 32GB of RAM, so it's a little slow for bigger things, like extracting some functions to a new class and file, but pretty usable.
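For what it's worth, wiring Aider to a local OpenAI-compatible server is just a couple of flags; the model alias is whatever your server reports, mine below is a guess:

```
# point Aider at a local OpenAI-compatible endpoint
aider --openai-api-base http://127.0.0.1:8080/v1 \
      --openai-api-key dummy \
      --model openai/qwen3.6-35b-a3b
```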
AvidCyclist250@reddit
I use Nous Hermes and 27B. 16GB VRAM, 80k context, coz I like my DE.
buttplugs4life4me@reddit
Just use little-coder. If you came from OpenCode and sometimes had the issue that Qwen would run into a "soft" loop, i.e. just try and try and not find any solution, then little-coder is a night-and-day difference.
Plus, unless you just do "allow all" for commands, I had to babysit and write config A LOT for OpenCode, whereas Little-Coder is fine.
Willing-Toe1942@reddit (OP)
I tried Little-Coder vs pi on a somewhat complex code-modification benchmark, and pi wins by a big margin and needs less steering.
buttplugs4life4me@reddit
Little-Coder is just pi with some extra extensions for small model steering, so that would be a little weird
BannedGoNext@reddit
pi.dev beats everything hard for qwen 3.6 in speed, and matches other harnesses for accuracy.
buttplugs4life4me@reddit
Little-Coder is Pi with some extra extensions for small model steering
dondiegorivera@reddit
What context size do you use?
Willing-Toe1942@reddit (OP)
200k each, and -np 3, which means I can spawn up to 3 parallel coding sessions (llama-server splits the total --ctx-size across the parallel slots, hence the 600k total in my config).
sdfgeoff@reddit
My experience is that pi was pretty bad compared to claude code and hermes - all with Qwen3.6 27B running at the same settings.
What makes you say pi was better?
bromatofiel@reddit
Curious to know how you run qwen with CC
sdfgeoff@reddit
https://unsloth.ai/docs/basics/claude-code
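The short version of that page, as I remember it. The env vars are real Claude Code ones, but treat the rest as a sketch; it needs a llama.cpp build with the Anthropic-style /v1/messages endpoint, and the model filename is a placeholder:

```
# serve the model locally
llama-server -m ./qwen3.6-27b-q4.gguf --port 8080 --jinja

# point Claude Code at the local server instead of Anthropic
export ANTHROPIC_BASE_URL=http://127.0.0.1:8080
export ANTHROPIC_AUTH_TOKEN=local-dummy
claude
```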
SawToothKernel@reddit
What strategies are you using with pi? As I understand it, it's pretty bare bones at the outset.
Cupakov@reddit
I use pi as well, and besides specifying the setups I prefer to develop in in the system prompt and adding Matt Pocock's /grill-me skill, not much is needed, IMO. I experimented a bit with persistent-memory stuff, but it doesn't seem that useful, to be honest; or at least I couldn't get it to be useful.
epicfilemcnulty@reddit
not the OP, but I have a pretty similar setup -- my own minimal coding agent (pretty much the same as Pi but in Lua) -- and it turns out you don't need that much. The harness has 4 basic tools (read, write, edit, bash), and I have written two skills: idea shaper and coding planner, and a bunch of custom commands using them, like /plan this, /review that, and that's basically it. Works like a charm.
SawToothKernel@reddit
That's good to hear, thanks.
eikenberry@reddit
Why is it better? Without some examination of why, there's no reason to believe this is anything more than it fitting your habits/workflow better, rather than it being better in general.
Skystunt@reddit
never heard of pi before, will give it a try
Ha_Deal_5079@reddit
pi setup is key, fr. Agent config management gets messy fast, and skillsgate handles that, if u haven't seen it: https://github.com/skillsgate/skillsgate
e9n-dev@reddit
Looks complicated. I just ask my Pi to install it itself, or symlink it into the project if I made it myself.
Pineapple_King@reddit
I find opencode way more structured and successful. People have pointed out some downsides to opencode too, mainly that it's slower. But I strongly prefer opencode's structured approach, and I have a very high success rate with it. Not sure why people insist on pi.
Still_Flower5350@reddit
I think it's mainly due to Pi being easier for fire-and-forget workloads, while OpenCode shines with a more interactive approach.
Pineapple_King@reddit
Interesting point!
rm-rf-rm@reddit
How are you running web search + the LLM?
tempedbyfate@reddit
I'm trying to optimize my setup with Qwen 3.6 27B with Pi as my harness. If you don't mind, could you share more details about your setup, please?
Are you running Qwen 3.6 using llama.cpp/llama-server or vLLM (for MTP)? What args do you use for these? Do you have thinking on or off? Are you using a custom Jinja template? There are some threads about issues with tool calling with the default template. Thanks in advance.
Willing-Toe1942@reddit (OP)
I'm using llama.cpp (with llama-swap), but in your case definitely go for vLLM and get MTP enabled; that should be way faster. Here's my config if you want to try llama.cpp (configured for 3 parallel requests). Model: unsloth Qwen3.6-35B-A3B-GGUF (UD Q4 XL):
```
--port ${PORT} --host 0.0.0.0 \
--flash-attn on --no-mmap --jinja \
--temp 0.7 --top-p 0.95 --top-k 20 --min-p 0.00 \
--presence-penalty 1.5 \
--ctx-size 600000 \
--cont-batching -np 3 -b 4096 -ub 2048 \
--chat-template-kwargs '{"preserve_thinking": true}' \
--image-min-tokens 300 --image-max-tokens 512
```
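For vLLM I'd start from something like this, but the model ID and the speculative/MTP settings are guesses on my side; check the vLLM docs for your build:

```
# rough vLLM equivalent; the MTP method name depends on the model family,
# so verify it against your vLLM version before trusting this
vllm serve Qwen/Qwen3.6-35B-A3B \
  --max-model-len 200000 \
  --speculative-config '{"method": "mtp", "num_speculative_tokens": 2}'
```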
CornerLimits@reddit
The cool thing about pi is that you can configure it easily with skills and extensions. Anyway, going from the llama.cpp web chat to pi is just… wow.
Apart_Boat9666@reddit
True, it's great. I'm using it without thinking mode and it's still very usable. Sure, I don't trust it with full freedom or vague queries, but on one-shot problems it's very good, and 36 tps is very usable.
Southern_Sun_2106@reddit
Can you please explain the Dagestan connection?
Willing-Toe1942@reddit (OP)
It's a funny meme: if you want your boy to transform into an MMA fighter and be a real man, send him to Dagestan and forget (YouTube).
gtek_engineer66@reddit
It's an MMA thing. Khabib beat Conor.
solarkraft@reddit
What did you use before? How does it compare to the other harnesses?
Willing-Toe1942@reddit (OP)
I tried basically everything: opencode / cline / kilo, etc.
Nothing comes close to pi. It's light and makes Qwen3.6 truly shine.
I also did a benchmark on backend modification, and nothing passed except pi as the harness.