Qwen3.6 can code
Posted by Purple-Programmer-7@reddit | LocalLLaMA | 33 comments
Got my 5th error on OpenAI models tonight and said “fuck it, let’s see how Qwen3.6-27b can do”.
Linked it up in opencode. Asked it to do some Svelte 5.
Perfect result.
N=1, and it took longer than the paid APIs would… the next 12 months will be quite interesting
gestapov@reddit
I'm sorry, I'm just a beginner in local LLMs, but is opencode the same as OpenClaw? A local agent?
Purple-Programmer-7@reddit (OP)
OpenCode is a cli coding harness like ClaudeCode.
Different from OpenClaw which, to your point, is an autonomous agent.
Intelligent_Ice_113@reddit
that's how the Chinese are slowly killing OpenAI 🤫🥰
ranting80@reddit
GPT 5.4 is great for planning what you want to do. It thinks a lot better than Opus or Sonnet and critiques Claude a lot with stunning results. I'm definitely going to cancel my Claude sub and try Kimi 2.6 for comparison. Opus 4.7 is a massive let down. I love frontier models for planning. But now, with Qwen 3.6... all my coding is going local.
szansky@reddit
What GPU are u using?
Fabulous_Fact_606@reddit
The Chinese need to start building something like an RTX 3090 with 1000GB of VRAM and sell it cheap. Nvidia is cooked.
EggDroppedSoup@reddit
OpenAI isn't open at all
SnooPaintings8639@reddit
Can't tell if trolling or...
Dany0@reddit
Yes it is! It stands for Open your wallets. And Insides
kmp11@reddit
The next twelve months may see local models shrink 50-90% if researchers can get technology like 1.58-bit models and TurboQuant to work.
exaknight21@reddit
I feel like LLMs went from 1T to 500B to 300B, then 200B to 100B, then 70B, now 27B, all within what I can safely say feels like yesterday. So I think by the end of 2026 we'll have agentic 4B models doing dank stuff.
Can’t wait
3oclockam@reddit
What agent are people using for this? Anyone using Hermes for coding with qwen?
ranting80@reddit
Opencode works like butter.
Kodix@reddit
I'm using hermes a *little* bit for coding. With Qwen3.6-35B. Not directly - it's not what I really want from it - but earlier I made it autonomously fix the out of date OpenViking plugin and it just.. did that, fully, with almost no input from me. And now the OpenViking memory finally works properly.
So yeah, it's capable *enough*.
ranting80@reddit
I just bought a Spark because of this. Models that fit inside this VRAM window and can code everything I need used to be a dream. Qwen 3.6 122b is the model I want to run on it when/if it comes out. Then I can pretty much leave the internet behind.
tuvok86@reddit
noob here: I'm testing locally on a 4090 and can't get opencode to do as well as pi; apparently it's using a lot more tokens because it's sending the thinking blocks back to the backend, but that's not needed?
SnooPaintings8639@reddit
I'm not sure, but I think pi also sends all the tokens (including thinking) back to the API. The difference is that pi keeps the history as-is, while opencode does some 'magic optimization' on the context. That means opencode breaks the cache quite often, causing the entire prompt to be reprocessed again and again, which is slow.
OpenCode is great for external APIs, but I don't think it's the best fit for local inference.
This is very much an opinion, and I'd be happy if someone explained to me what I got wrong here.
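The prefix-cache point is easy to illustrate: a backend can only reuse cached KV state for the longest unchanged token prefix, so any edit early in the context forces recomputation of everything after it. A toy sketch (the token IDs are made up, and this is just the matching logic, not a real cache):

```python
def shared_prefix_len(old_tokens, new_tokens):
    """Length of the common prefix, i.e. how much KV cache is reusable."""
    n = 0
    for a, b in zip(old_tokens, new_tokens):
        if a != b:
            break
        n += 1
    return n

history = [101, 7, 7, 9, 42, 13]

# Append-only history (pi-style): the whole old prompt is a cache hit.
appended = history + [55, 56]
print(shared_prefix_len(history, appended))  # 6 -> only 2 new tokens to prefill

# Rewritten history (context 'optimization'): the cache hit stops at the edit.
compacted = [101, 99] + history[3:] + [55, 56]
print(shared_prefix_len(history, compacted))  # 1 -> almost everything reprocessed
```

So even a small rewrite near the start of the context turns the next turn into a near-full prefill, which on local hardware is where all the time goes.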
GroundbreakingMall54@reddit
yeah, the KV cache is a memory monster. fp8 helps but you still sacrifice context for VRAM. either batch smaller or just accept the limit tbh
Maximum-Wishbone5616@reddit
27b fits comfortably even on a mini AI rig like 2x 5090 (FP8/KV16/262k context uses something around 60GB VRAM)
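For anyone wondering where numbers like that come from: the KV cache scales as 2 (K and V) × layers × KV heads × head dim × context length × bytes per value, on top of the weights themselves. A rough sketch with made-up architecture numbers (the real model config may differ, so treat the result as a ballpark, not the actual 60GB figure):

```python
def kv_cache_bytes(layers, kv_heads, head_dim, context_len, bytes_per_value):
    """Rough KV cache size: one K and one V tensor per layer, per token."""
    return 2 * layers * kv_heads * head_dim * context_len * bytes_per_value

# Assumed (not actual) 27B-class config: 48 layers, 8 KV heads (GQA),
# head_dim 128, 16-bit KV values, 262144-token context.
gib = kv_cache_bytes(48, 8, 128, 262144, 2) / 2**30
print(f"{gib:.0f} GiB")  # 48 GiB for the cache alone, before weights
```

That's why GQA (fewer KV heads) and KV quantization matter so much more than weight precision once you push the context out to 262k.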
Potential-Leg-639@reddit
What the hell are you talking about? "mini AI rig like 2x 5090"
Glittering-Call8746@reddit
Cheap electricity bill, having 2 5090s running at all times. 😁
SummarizedAnu@reddit
Bro thinks a cheap mini rig is even one 5090.
Most people here are running an RTX 3060 or below. The only ones with 2x 4090s, 2x 5090s, etc. are the few people who are either just rich or are the ones making all these quant/abliterated models for us.
met_MY_verse@reddit
My 16GB RX580 agrees.
SummarizedAnu@reddit
I used 16 GB of RAM and no GPU for 10 years, until 5 months ago.
dkarlovi@reddit
LegitimateCopy7@reddit
Perfect proof
Purple-Programmer-7@reddit (OP)
Not worth my time. If you don’t trust, all good.
LegacyRemaster@reddit
I was completing a merge between 2 scripts and Claude gave me this error. I started Qwen 3.6 27b q8 ---> it corrected and fixed the script, and it found some bugs that Claude had added. I asked Gemini Pro to evaluate the Qwen result and it said 100% OK. Today I'm also evaluating it with Minimax 2.7 Q4 local, and it works very well... just to better understand which workflow to use for validation, whether 100% local or hybrid. Note: the error is clear: they tell you to use the API with ClaudeCode or VSCode and not chat. True. But LM Studio with Qwen's long context on an RTX 6000 96GB did the job "only" using chat.
CalligrapherFar7833@reddit
You asked Gemini to check, but it's trash; ask GPT/Codex to check to actually surface issues.
LegacyRemaster@reddit
surface? code issues.
CalligrapherFar7833@reddit
"Surface issues" means surfacing code issues.
GroundbreakingMall54@reddit
yeah, 120k feels tight, but that's just how fp8 vLLM works. the KV cache chews through VRAM fast. either drop batch size or bite the bullet and use less context
cr0wburn@reddit
It has 260k context! What are you on about!