Use Qwen3.6 the right way -> send it to the pi coding agent and forget
Posted by Willing-Toe1942@reddit | LocalLLaMA | 79 comments

Just a reminder: the harness you use (your LLM client and interface, basically) can make a huge difference. It's way more important than people think. I've been using pi.dev for over 2 months and oooh boy, Qwen3.6 suddenly became a monster.
My local machine + pi + Exa web search + agent-browser extension: this setup can solve 80% of all my use cases, which for now are:
- coding (python / rust / c++)
- anything requiring maintenance / administration on my machines (mainly Linux)
- web research: Qwen3.6 35B with Exa web search is a monster and can 100% replace Perplexity for me, even giving better results (only sacrificing some time as a side effect)
Complex planning tasks I delegate to kimi2.6; coding itself is handled by Qwen3.6.
At the end of the day: use your Qwen3.6 with the pi coding agent and forget 😃
horribleGuy3115@reddit
What does your GPU setup look like? A 120k context window with my 3090 feels unusable for coding work in Pi.
Protopia@reddit
Even though Pi starts with a very small system prompt context, you still need to manage down your context size with judicious use of MCP servers and context optimisers.
JuniorDeveloper73@reddit
opencode with planner works better
Varmez@reddit
I kept having OpenCode just stop with no explanation, and had worse looping.
In Pi, I had it make an extension out of my process, standards, and requirements docs, and it's working great.
CommonPurpose1969@reddit
Qwen 35B & 27B will loop regardless of the harness. Even with Claude Code.
Kodix@reddit
Extremely rare for 35B to loop in Hermes with a temperature setting of 1.
Willing-Toe1942@reddit (OP)
Never had a single loop with unsloth UD-Q4.
CommonPurpose1969@reddit
Would you please share your settings? Model quantization?
Varmez@reddit
Yeah, I still get it sometimes, typically past ~60% of the context window. Running a bigger window and compacting more often alleviates it a bit.
riceinmybelly@reddit
Hermes does that for me: it's using 35B for light tasks and 27B for planning, and I have Claude Code for Anthropic models and pi for my z.ai and opencode subscriptions.
wasnt_in_the_hot_tub@reddit
I find that opencode is not as context-efficient as pi, at least for my workflow. It might be the LSP integration
Mamaun30@reddit
What's planner?
sagiroth@reddit
What's that?
Pineapple_King@reddit
I agree
grabber4321@reddit
I'm using OpenCode, haven't tried Pi yet.
The problem I have with Qwen3.6: it stops randomly (around 80-90k context), and I have to say "keep going", and then it comes back and keeps doing the task.
Anybody figured out how to solve this?
CommonPurpose1969@reddit
pi.dev is a bit YOLO. It won't ask for any permission, by design. O_o
Cupakov@reddit
yeah, it's intended to be run sandboxed; i usually use bubblewrap with it (rough sketch below)
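something like this, roughly -- the binds are just an example layout i'd expect to work, adjust for your distro and project:

```
# rough bubblewrap wrapper: project dir writable, everything else read-only;
# --share-net kept so the agent can still reach the model API
bwrap \
  --ro-bind /usr /usr \
  --ro-bind /etc /etc \
  --symlink usr/bin /bin \
  --symlink usr/lib /lib \
  --proc /proc --dev /dev --tmpfs /tmp \
  --bind "$HOME/projects/myapp" "$HOME/projects/myapp" \
  --unshare-all --share-net \
  --die-with-parent \
  pi
```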
CommonPurpose1969@reddit
It is not about the havoc it can wreak on the main system, which, of course, is an issue too. One misunderstanding, and it goes on and does its thing, changing the source code, and the user is left with the changes to revert manually or again with the LLM, hoping it reverts it properly.
Karyo_Ten@reddit
Agentic LLMs are trained to ship ~~slop~~ code unfortunately so they do use git commit and git checkout.
It's the harness, or at least the system prompt, that needs to add limits, or a git-guardrail extension that prevents git commit.
But at least Pi lets people tune it to their usage. There are low-value glue or data-extraction tasks where I don't mind giving the LLM free rein.
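A crude version that doesn't even need harness support is a pre-commit hook that refuses everything while the agent owns the repo (hypothetical hook, purely as an illustration):

```
#!/bin/sh
# .git/hooks/pre-commit -- blanket guardrail: refuse every commit
# while an agent session owns the repo; delete the hook to commit again
echo "pre-commit: commits are blocked by the agent guardrail" >&2
exit 1
```

(Needs chmod +x .git/hooks/pre-commit. And note the agent can still run git commit --no-verify, so it's a speed bump, not a sandbox.)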
Ok-Measurement-1575@reddit
I dunno why this isn't the default on everything, tbh.
There isn't even a launch arg on opencode to enable it, which is very short-sighted, IMHO.
Willing-Toe1942@reddit (OP)
Real men accept their fate. You fire the agent and forget :D
DerDave@reddit
Which Qwen3.6?
Willing-Toe1942@reddit (OP)
35B-A3B (the MoE version)
EbbNorth7735@reddit
Wait until you get to try 27B or 122B models
Karyo_Ten@reddit
Qwen3.6-122B? It's out?
No-Upstairs-4031@reddit
I agree; I use the gemma4-26b with a custom-designed Pi harness. It works much more smoothly and is easier to control than OpenClaw, Claude Code, and other harnesses.
MoodDelicious3920@reddit
Which is the best harness: codex, forgecode, opencode, or a simple custom-made harness with basic access to web tools and code execution?
Willing-Toe1942@reddit (OP)
Yep. The difference is huge and it surprised me.
bonobomaster@reddit
For everyone who now wants to try (some) Pi, know that Pi could serve you a slice of sudo rm -rf in a heartbeat!
The standard Pi agent has ZERO command filtering or sandboxing by default!
If Pi decides it's time for cake, then cake will be served!
Okay, enough with the Pi puns already!
Cupakov@reddit
there's an optional extension with minimal guardrails that ships with pi
bonobomaster@reddit
Yeah, I know.
But most likely there will be some people like me, who install first and ask questions later. ;)
Luckily, I caught my "little" oversight pretty early on, as, to my surprise, Pi searched the whole C drive for a specific directory instead of being confined to its working directory, or at least the user directory.
Just putting it out here...
Paradigmind@reddit
What was your mistake? I'm asking because I'd like to avoid it. :D
Steus_au@reddit
little-coder has all the plugins you need for pi
bonobomaster@reddit
I just checked out the repo. That looks quite interesting! Thx!
gladfelter@reddit
Can you point to the specific extension NPM packages that you're using, specifically the exa web search and agent browser extensions? There are many. These are the most popular that match your descriptions:
I've found the quality and ease-of-use of pi extensions varies dramatically, so I'm very interested in hearing exactly what has worked for you, since guessing will most likely result in frustration.
mantafloppy@reddit
This seems to be the version timestamp; the repo is about a month old if you go on GitHub or npmjs.
gladfelter@reddit
yeah, I think you're right. I suppose absolute age is not a sign of notability and authority in this space. Regular maintenance probably is a better signal, so the UI focuses on that. I'm getting too old, I guess.
mantafloppy@reddit
Did you try any of them?
gladfelter@reddit
pi-web-access is solid. pi-agent-browser isn't bad. It installed its dependency on first run, at least.
mantafloppy@reddit
I'll check it out.
Agents being able to find info on the web themselves is so useful.
gladfelter@reddit
I'm at work rn; I will later if I don't hear from OP. I've installed two other pi web search packages, and one of the two sucked so hard: it grabbed a random Gemini model that I didn't have quota for, and pi.dev won't let you prune the available models when you connect a provider, so I eventually had to modify the extension's source code. I don't normally work in TypeScript and npm, so I had a bit of learning to do.
UnWiseSageVibe@reddit
Not related, but I set up a self-hosted Firecrawl for web searches and fetching; it works well.
https://docs.firecrawl.dev/contributing/self-host
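If anyone wants to try it, the flow from the docs is basically clone + docker compose. I'm going from memory, so check the linked page for the current env keys:

```
# roughly the self-host flow; the .env location/keys may differ, see the docs
git clone https://github.com/mendableai/firecrawl
cd firecrawl
cp apps/api/.env.example .env   # configure keys/ports here
docker compose up -d
```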
philmarcracken@reddit
I tried pi and it was OK. I find late to be similar, and I like its out-of-the-box experience, mostly. It needed one tweak to its tool abilities with PowerShell; that was it.
The stage it has between planning and then subagent spawning is fantastic. Snapshot beforehand and off it goes. I'm thinking it might even accept building a Mermaid diagram to save even more context when comprehending larger codebases.
Comfortable-Crew-919@reddit
Qwen3.6 35B with gsd-2 (built on top of pi) has been great for planning and coding. Running on an M4 Pro 64GB via oMLX with the recommended Qwen settings for coding and 128k context.
Naz6uL@reddit
I'm currently using oMLX + opencode, but I'll give this one a try.
Ok-Measurement-1575@reddit
llama-server with its built-in web server and locally hosted MCP = ChatGPT at home.
I have no doubt that, with enough time, I could MCP all the things that make GPT/Claude appear intelligent.
It's just kinda magical watching your various tools fire and getting straight-up SOTA results at home for peanuts.
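The nice part is there's nothing to set up on the UI side: one command and you get a chat page in the browser (the model path here is just a placeholder for whatever you have locally):

```
# API + built-in web UI on the same port; open http://localhost:8080/
llama-server -m ./qwen3.6-35b-a3b-ud-q4.gguf \
  --host 0.0.0.0 --port 8080 \
  --ctx-size 131072 --jinja
```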
Mennas11@reddit
I have been using Aider with this model (Qwen3.6 35B-A3B Q4). It's been pretty good, but mostly just doing refactoring and some small functions. I only have a Mac Pro M2 with 32GB of RAM, so it's a little slow for bigger things, like extracting some functions to a new class and file, but pretty usable.
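For what it's worth, wiring Aider to a local OpenAI-compatible server is just a couple of flags; the model alias is whatever your server reports, mine below is a guess:

```
# point Aider at a local OpenAI-compatible endpoint
aider --openai-api-base http://127.0.0.1:8080/v1 \
      --openai-api-key dummy \
      --model openai/qwen3.6-35b-a3b
```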
AvidCyclist250@reddit
I use Nous Hermes and 27B. 16GB VRAM, 80k context, coz I like my DE.
buttplugs4life4me@reddit
Just use little-coder. If you came from OpenCode and sometimes had the issue that Qwen would run into a "soft" loop, i.e. just try and try and not find any solution, then little-coder is a night-and-day difference.
Plus, unless you just do "allow all" for commands, I had to babysit and write config A LOT for OpenCode, whereas Little-Coder is fine.
Willing-Toe1942@reddit (OP)
I tried Little-Coder vs pi on a somewhat complex code-modification benchmark, and pi wins by a big margin and needs less steering.
buttplugs4life4me@reddit
Little-Coder is just pi with some extra extensions for small model steering, so that would be a little weird
BannedGoNext@reddit
pi.dev beats everything hard for qwen 3.6 in speed, and matches other harnesses for accuracy.
buttplugs4life4me@reddit
Little-Coder is Pi with some extra extensions for small model steering
dondiegorivera@reddit
What context size do you use?
Willing-Toe1942@reddit (OP)
200k each, and -np 3, which means I can spawn up to 3 parallel coding sessions (llama-server splits the total --ctx-size across the parallel slots, hence the 600k total in my config).
sdfgeoff@reddit
My experience is that pi was pretty bad compared to claude code and hermes - all with Qwen3.6 27B running at the same settings.
What makes you say pi was better?
bromatofiel@reddit
Curious to know how you run qwen with CC
sdfgeoff@reddit
https://unsloth.ai/docs/basics/claude-code
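The short version of that page, as I remember it. The env vars are real Claude Code ones, but treat the rest as a sketch; it needs a llama.cpp build with the Anthropic-style /v1/messages endpoint, and the model filename is a placeholder:

```
# serve the model locally
llama-server -m ./qwen3.6-27b-q4.gguf --port 8080 --jinja

# point Claude Code at the local server instead of Anthropic
export ANTHROPIC_BASE_URL=http://127.0.0.1:8080
export ANTHROPIC_AUTH_TOKEN=local-dummy
claude
```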
SawToothKernel@reddit
What strategies are you using with pi? As I understand it, it's pretty bare bones at the outset.
Cupakov@reddit
I use pi as well, and besides specifying the setups I prefer to develop in in the system prompt and adding Matt Pocock's /grill-me skill, not much is needed, IMO. I experimented a bit with persistent-memory stuff, but it doesn't seem that useful, to be honest; or at least I couldn't get it to be useful.
epicfilemcnulty@reddit
not the OP, but I have a pretty similar setup -- my own minimal coding agent (pretty much the same as Pi but in Lua) -- and it turns out you don't need that much. The harness has 4 basic tools (read, write, edit, bash), and I have written two skills: idea shaper and coding planner, and a bunch of custom commands using them, like /plan this, /review that, and that's basically it. Works like a charm.
SawToothKernel@reddit
That's good to hear, thanks.
eikenberry@reddit
Why is it better? Without some examination of why, there's no reason to believe this is anything more than it fitting your habits/workflow better, rather than it being better in general.
Skystunt@reddit
never heard of pi before, will give it a try
Ha_Deal_5079@reddit
pi setup is key, fr. Agent config management gets messy fast, and skillsgate handles that, if u haven't seen it: https://github.com/skillsgate/skillsgate
e9n-dev@reddit
Looks complicated. I just ask my Pi to install it itself, or symlink it into the project if I made it myself.
Pineapple_King@reddit
I find opencode way more structured and successful. People have pointed out some downsides to opencode too, mainly that it's slower. But I strongly prefer opencode's structured approach, and I have a very high success rate with it. Not sure why people insist on pi.
Still_Flower5350@reddit
I think it's mainly due to Pi being easier for fire-and-forget workloads, while OpenCode shines with a more interactive approach.
Pineapple_King@reddit
Interesting point!
rm-rf-rm@reddit
How are you running web search + the LLM?
tempedbyfate@reddit
I'm trying to optimize my setup with Qwen 3.6 27B with Pi as my harness. If you don't mind, could you share more details about your setup, please?
Are you running Qwen 3.6 using llama.cpp/llama-server or vLLM (for MTP)? What args do you use for these? Do you have thinking on or off? Are you using a custom Jinja template? There are some threads about issues with tool calling with the default template. Thanks in advance.
Willing-Toe1942@reddit (OP)
I'm using llama.cpp (with llama-swap), but in your case definitely go for vLLM and get MTP enabled; that should be way faster. Here's my config if you want to try llama.cpp (configured for 3 parallel requests). Model: unsloth Qwen3.6-35B-A3B-GGUF (UD Q4 XL):
```
--port ${PORT} --host 0.0.0.0 \
--flash-attn on --no-mmap --jinja \
--temp 0.7 --top-p 0.95 --top-k 20 --min-p 0.00 \
--presence-penalty 1.5 \
--ctx-size 600000 \
--cont-batching -np 3 -b 4096 -ub 2048 \
--chat-template-kwargs '{"preserve_thinking": true}' \
--image-min-tokens 300 --image-max-tokens 512
```
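For vLLM I'd start from something like this, but the model ID and the speculative/MTP settings are guesses on my side; check the vLLM docs for your build:

```
# rough vLLM equivalent; the MTP method name depends on the model family,
# so verify it against your vLLM version before trusting this
vllm serve Qwen/Qwen3.6-35B-A3B \
  --max-model-len 200000 \
  --speculative-config '{"method": "mtp", "num_speculative_tokens": 2}'
```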
CornerLimits@reddit
The cool thing about pi is that you can configure it easily with skills and extensions. Anyway, going from the llama.cpp web chat to pi is just… wow.
Apart_Boat9666@reddit
True, it's great. I'm using it without thinking mode and it's still very usable. Sure, I don't trust it with full freedom or vague queries, but on one-shot problems it's very good, and 36 tps is very usable.
Southern_Sun_2106@reddit
Can you please explain the Dagestan connection?
Willing-Toe1942@reddit (OP)
It's a funny meme: if you want your boy to transform into an MMA fighter and be a real man, send him to Dagestan and forget (YouTube).
gtek_engineer66@reddit
It's an MMA thing. Khabib beat Conor.
solarkraft@reddit
What did you use before? How does it compare to the other harnesses?
Willing-Toe1942@reddit (OP)
I tried basically everything: opencode / cline / kilo, etc.
Nothing comes close to pi. It's light and makes Qwen3.6 truly shine.
I also did a benchmark on backend modification, and nothing passed except pi as the harness.