Is qwen3 coder next still relevant with qwen3.5 release for agentic coding?
Posted by ROS_SDN@reddit | LocalLLaMA | View on Reddit | 30 comments
Basically the title. I know it will depend on your quant, but with 48gb of vram inbound, I'm curious about the community's opinion before I get the chance to vibe check.
I see a lot of people saying 35b / 27b is better, and I'm curious what a more focused discussion on this brings up.
PromptInjection_@reddit
At least for me it is not relevant any longer.
Qwen 3.5 is clearly superior.
Far-Low-4705@reddit
For me it is either qwen 3.5 35b at Q8 or qwen 3 coder next 80b at Q4
Both run at 50 T/s, but I have no idea which is better.
Rn I’m leaning towards 3.5, it has (toggle-able) reasoning which is a big plus imo, and vision. Only downside is it’s 2x smaller.
ROS_SDN@reddit (OP)
How do you toggle the reasoning without a llama server reset? Any idea?
Far-Low-4705@reddit
this is what i have for llama-swap for 27b
idk if u use llama-swap, but it lets u swap models easily and define a config file for swapping models
basically route "thinking" prompts to the `qwen3.5-27b-thinking:Q4_0` id and instruct prompts to the `qwen3.5-27b-instruct:Q4_0` id.
Also you can use --reasoning on/off flag for llama-server instead of setting chat-template-kwargs
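The routing described above could look roughly like this. A sketch only, not the commenter's actual config: the model paths are made up, and the `enable_thinking` template kwarg is an assumption based on how llama.cpp's `--chat-template-kwargs` flag feeds JSON into Qwen-style chat templates.

```yaml
# llama-swap config sketch (hypothetical model paths).
# Each model id maps to its own llama-server command; llama-swap
# starts and stops servers as requests name different ids, and
# ${PORT} is substituted by llama-swap at launch.
models:
  "qwen3.5-27b-thinking:Q4_0":
    cmd: >
      llama-server --port ${PORT}
      -m /models/qwen3.5-27b-Q4_0.gguf
      --jinja --chat-template-kwargs '{"enable_thinking": true}'
  "qwen3.5-27b-instruct:Q4_0":
    cmd: >
      llama-server --port ${PORT}
      -m /models/qwen3.5-27b-Q4_0.gguf
      --jinja --chat-template-kwargs '{"enable_thinking": false}'
```

Your client then just sets the `model` field in the OpenAI-style request to one of those ids, and llama-swap swaps in the matching server, so no manual restart needed.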
ionizing@reddit
Agreed. I find 122B superior to Coder Next in my framework, so much so that I simply deleted the model rather than keep it around for potential use.
RedParaglider@reddit
You must be one of those mythical 256bros :)
mr_zerolith@reddit
Senior developer here.
I don't find 122b to be particularly good; GPT OSS 120b is much faster and equivalent or a little better in quality.
Qwen Next Coder was super unimpressive.
JsThiago5@reddit
there is a gpt oss puzzle that is 88b and seems to have the same performance as 120b
mr_zerolith@reddit
but does it have the quality..
my_name_isnt_clever@reddit
What? I swapped out GPT-OSS for Qwen 3.5 122b and it kicks its ass at everything I've tried. It's not even close on tool call reliability.
mr_zerolith@reddit
ah, my assessment doesn't include tool usage at all, just code generation from a prompt
my_name_isnt_clever@reddit
Well yeah, it's double the active parameters so it was always going to be slower. I'm not surprised you prefer gpt-oss for direct chats, but qwen 3.5 is an agentic workhorse.
mr_zerolith@reddit
Good to know.
I dig Seed OSS 197B because i get ~20% better speed than Qwen3.5 133b and it seems good with agentic and is awesome at coding.
For coding purposes i just couldn't tell the difference between the two ~120b models
madtopo@reddit
Do you use GPT-OSS 120B with a harness? If so, which one? I found its speed impressive but the tool calling was underwhelming
ForsookComparison@reddit
I'd rather wait for a good output from 27B than quickly get slop from Qwen3.5-Next.
And heck, sometimes Qwen3.5-35B gets the best of both worlds if your task is simple enough.
Far-Low-4705@reddit
Idk, I feel like having the ability to iterate quickly and tell the model where it went wrong in 5min is better than getting it right first try but waiting 15min for it.
Just my opinion, I’d rather be in the loop and iterate faster. I see it as an assisting tool, not as a fully automated coder. And I feel like iterating is often better
ForsookComparison@reddit
If you stick with that mindset you'll never move past the "coding assistant" phase
Far-Low-4705@reddit
i dont want to have to debug, and learn thousands of lines of buggy code that i did not write.
if you are "vibe coding" then yeah sure, but if you're doing real work, that is absolutely not feasible.
ForsookComparison@reddit
That's fine and has plenty of merit. I just do not think it's a winning strategy in the long-term unless your goal is to gather everyone else's dust on your clothes.
I'm not even disagreeing with you, I just have a family to feed lol
Far-Low-4705@reddit
sure, i just prefer to get more work done rather than less.
that is just my preference, but do whatever you want to do
KURD_1_STAN@reddit
True, i like to use free claude more than free gpt, but we're not talking about such models. 27b or 397b, both are gonna make mistakes even for simple things
ForsookComparison@reddit
Compared to Opus and Sonnet? Yes definitely
Compared to Qwen3-Next/Coder? It's way more self-sufficient on its own without hand holding
asfbrz96@reddit
Gemma 31b is pretty solid
stormy1one@reddit
Serious question - what are you using it for? What has been your experience in comparison to Qwen models? I never had much luck with any Google models - they work great with small snippets, but trip over themselves with larger code bases
asfbrz96@reddit
I'm using it as my coding agent. I have GPT-5.4 as an orchestrator that sends tasks to my local models. What I notice is that GPT-3.5 always breaks tool calls and just stops mid-process.
AvocadoArray@reddit
27b is better in almost every way. The biggest difference is how thorough it is when writing plans/specs and thinking through edge-cases. It remembers details over long contexts where Q3CN and even 3.5 122b fall short, and it can actually get itself out of failure loops in most cases.
That makes it perfect for planning and executing long ralph loops. I let one run the other night to build a TUI interface to replace one of my bash CLI tools. It ran for over an hour before it finally finished, and it implemented the feature perfectly. The only downside is that it took the instructions on writing extensive unit tests too seriously and ended up writing 300+ tests for silly failure modes, like verifying that calling `docker ps` fails if Docker is not installed.
The larger MoE models are sometimes better when working with a less popular language or framework, but I prefer 27b with tooling that allows it to search the web, check reference docs, or look at the library's source to get the info it needs.
dyslexic_jedi@reddit
I still prefer the Qwen3 Coder 80b at Q8. I think it beats other local models at coding.
RedParaglider@reddit
Better than 122b and faster.
Voxandr@reddit
Still a better coder than 122b.
PromptInjection_@reddit
Yeah, the 122B might struggle here and there. But what I find really practical is precisely its visual capability. With Qwen 3 Coder, you can't just say, "Look at this design and make something similar for me."