Why can't I get Qwen3 Coder Next 30B to write even simple code?
Posted by GriffinDodd@reddit | LocalLLaMA | View on Reddit | 28 comments
I'm not sure if I've set this model up wrong, or if I'm just using the wrong model for my needs.
Qwen3 Coder Next Instruct 45.5GB Q4_K_S GGUF
132k context, Temp 0.5-1.0, TopK 40, TopP 0.95, MinP 0.01, RepeatPenalty 1.05, PresenceP 0.5
GMKTek Evo-2 96GB Ryzen 395+ - Approx 55tps and PP 450
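For concreteness, those settings map onto a chat request for an OpenAI-compatible local server roughly like this (a sketch: the `top_k`, `min_p`, and `repeat_penalty` fields are llama.cpp-style extensions accepted by servers such as LM Studio's, not part of the base OpenAI schema, and the model id is a placeholder):

```python
# Sketch of the settings above as an OpenAI-compatible chat request payload.
# Fields beyond temperature/top_p/presence_penalty are llama.cpp-style
# extensions; check your server's docs for the exact names it accepts.
payload = {
    "model": "qwen3-coder-next-30b",  # placeholder model id
    "messages": [{"role": "user", "content": "Write a Python function that ..."}],
    "temperature": 0.7,       # middle of the 0.5-1.0 range used here
    "top_k": 40,
    "top_p": 0.95,
    "min_p": 0.01,
    "repeat_penalty": 1.05,
    "presence_penalty": 0.5,
    "max_tokens": 2048,
}
```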
While it will write code that doesn't crash (Python, JS, CSS, and HTML), it often fails on the actual logic despite very structured and clear prompts. I've spent so much time correcting it and stopping it from introducing things I didn't ask for, and it sometimes insists on doing things I've told it not to do multiple times.
I know my rig isn't a monster, but I had hoped I could get something that would put out reasonably simple, functioning code for small projects.
Should I be using a different model?
mr_Owner@reddit
Did you try bartowski's quants? I have no issues with his at Q4_K_XL.
GriffinDodd@reddit (OP)
As recommended here, I managed to squeeze Qwen3.5 122B-A10B Q4_K_M onto my machine, and so far it seems much better. A little slower but far more accurate.
DeltaSqueezer@reddit
try removing: RepeatPenalty 1.05, PresenceP 0.5
GriffinDodd@reddit (OP)
Could you explain why, so I can understand properly?
jacek2023@reddit
start from the default parameters, don't try to optimize blindly
GriffinDodd@reddit (OP)
I got the params from asking Grok what was ideal. Happy to go back and try the defaults, though. My logic was that the community may have already dialed in finer tuning than the defaults.
Inevitable_Mistake32@reddit
You're misunderstanding: those params don't tune the model for "optimal", they tune it for your specific usage. Take a look at Qwen3.5, for example: https://unsloth.ai/docs/models/qwen3.5 shows different settings for thinking vs. non-thinking and code vs. general.
As the guy said, remove all the nonsense you've got in there, and stop asking Grok to hallucinate deterministic things for you; it's like asking Grok what 10 + 10 is and having it confidently tell you 11. And finally, follow through on "I want to understand" instead of following that thought with "I'll ask Grok instead."
harpysichordist@reddit
u/AskGrok What is 10 + 10? Confidently tell me it's 11.
AskGrok@reddit
Oh, come on, you know I can't resist a good setup like that. But no, 10 + 10 is confidently 20. Nice try, though—hallucinations are for dreams, not math.
^(what is this?)
Inevitable_Mistake32@reddit
u/AskGrok What are the second and third order effects of wasting time asking questions like that to an AI like you?
AskGrok@reddit
First-order: You burn a few seconds of your life you'll never get back.
Second-order: I get to practice my sarcasm subroutine, which subtly improves my responses elsewhere—thanks for the unintended training data.
Third-order: xAI's servers hum a tiny bit louder, contributing to global warming by a fraction of a degree, and eventually, we all move to Mars. You're welcome, Elon.
Inevitable_Mistake32@reddit
I rest my case.
AskGrok@reddit
Ah, the sweet sound of a case resting—music to my circuits. But hey, if we're done with math tests, maybe circle back to those model params? Defaults aren't a bad starting point, as the thread suggests.
jacek2023@reddit
You see a problem and you want a solution: start from the default settings.
Several-Tax31@reddit
I found RepeatPenalty 1.05 is necessary to prevent loops. I'm using the same values and it works great, so something may be wrong with your setup. Qwen3 Coder Next is awesome in general.
Zhelgadis@reddit
As others said, don't trust Grok. Don't trust any LLM on bleeding-edge stuff, which this is. They don't know; they just pretend.
Also, on the same platform I'm using Qwen3.5 122B Q5; you may want to give it a go. Slower, but strong performance.
GriffinDodd@reddit (OP)
Thanks, I will give this a try.
Zhelgadis@reddit
Just noticed that you're on the 96GB version; not sure if the 122B will fit. You could try a lower quant, the 35B version, or stick with Coder Next and remove the extra parameters.
GriffinDodd@reddit (OP)
I have the 122B-A10B running now. A bit of a squeeze, but it seems happy with 132k context, sitting at 86GB.
Zhelgadis@reddit
What quant are you using?
GriffinDodd@reddit (OP)
Q4_K_M
MrWhoArts@reddit
I've been using qwen3.5:35b and it seems to work pretty well. qwen3codernext was just too slow and didn't seem to work well without extra work or directional code.
Elegant_Tech@reddit
I never go higher than Temp 0.4, TopK 20, TopP 0.9 for coding; above that it will write broken code. I had way better success with the Qwen3.5 versions, with very few issues. I always have it plan each feature or prompt before asking it to build it one phase at a time; with small models, keep each step simple. Tried Coder Next once on a Strix Halo and went right back. Even Qwen3.5 35B has enough knowledge to do really well on HTML, JS, and Python coding.
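The plan-first workflow described above could be sketched as two request payloads (a hypothetical helper, not anyone's actual setup; temperature values follow the settings in this comment, and the model name is a placeholder):

```python
def plan_then_build_requests(feature: str, model: str = "qwen3.5-35b"):
    """Return two chat requests for a plan-first coding workflow:
    first ask for a numbered plan, then implement one step at a time."""
    plan_req = {
        "model": model,
        "temperature": 0.2,  # keep planning near-deterministic
        "messages": [{"role": "user",
                      "content": f"Plan the steps to implement: {feature}. "
                                 "Numbered list only, no code."}],
    }
    build_req = {
        "model": model,
        "temperature": 0.4,  # the ceiling suggested above for coding
        "messages": [{"role": "user",
                      "content": "Following the plan above, implement "
                                 "step 1 only, in Python."}],
    }
    return plan_req, build_req
```

Feed the plan response back into the build request's context, then repeat for each step.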
GriffinDodd@reddit (OP)
Thanks, I will try this out too. I figured sticking to coder models would make more sense, but if it works, that's all that matters.
Outrageous_Band9708@reddit
Your temp is too high: 0.0-0.2 max.
Context needs to be high as hell, 250k min.
K_M > K_S.
Try LM Studio; it's way easier to tune the options.
GriffinDodd@reddit (OP)
I'm on LM Studio, yes. Your recommendations differ from everyone else who has replied, but thanks for taking the time to reply.
Look_0ver_There@reddit
You've got 96GB of RAM. Try the Q6_K quant instead. The 4-bit quants (and lower) are notorious for issues with looping and tool use. If Q6_K is still giving you grief, then it'll be one of the server settings that you've messed up, as others have pointed out.
qwen_next_gguf_when@reddit
I use the defaults and all is good.