It's been a while since we had new Qwen & Qwen Coder models...
Posted by sammcj@reddit | LocalLLaMA | 53 comments
Just saying... 😉
In all seriousness, if they need to cook further, let them cook.
TheTideRider@reddit
Rumor has it that Qwen 3 is just around the corner. Maybe next week
YieldMeAlone@reddit
What's the source of that rumor?
__JockY__@reddit
The internet never lies.
TheTideRider@reddit
It has been released today. You are very welcome.
__JockY__@reddit
This comment aged out pretty quickly!
TheTideRider@reddit
25 t/s
__JockY__@reddit
85.3% of all inference statistics are made up on the spot.
sammcj@reddit (OP)
Well folks, it's out today!...
LackBig7563@reddit
Will a Qwen3 Coder be released, like what happened with 2.5?
phazei@reddit
What happened to the Qwen 3 they said was going to release this month?
Ok_Warning2146@reddit
They're probably fine-tuning Qwen3 to beat Gemma 3 27B on LMArena before it gets released.
sammcj@reddit (OP)
I don't think the Qwen team will have a hard time doing that, especially for coding.
Ok_Warning2146@reddit
Well, the reality is that Gemma 3 scores higher on LMArena than QwQ.
sammcj@reddit (OP)
LMArena is not a reliable indicator of which models are best, especially coding models: it has a very limited context size and the UI is geared toward short prompts.
Ok_Warning2146@reddit
Well, I was just suggesting a reason why they haven't published Qwen3 yet. It would be nice if they could also beat Gemma 3 on LMArena.
SkyFeistyLlama8@reddit
For what it's worth, in practical usage Gemma 3 27B is better for Python and C# coding than QwQ 32B or Qwen 2.5 Coder 14B.
deldongoo@reddit
Do you have any metrics to support this?
SkyFeistyLlama8@reddit
Metrics? Don't need no stinkin metrics. Imperial's the way to go. Compare Gemma 27B against Qwen 2.5 Coder 32B and see what you get with your favorite programming languages.
Regular_Working6492@reddit
Qwen scores better on the aider leaderboard, FWIW
Blues520@reddit
This is the one I'm waiting patiently for. I hope they spend their time creating a quality model that we can use for a long time. Qwen 2.5 has been stellar, so they can't drop the ball on this.
sammcj@reddit (OP)
It really has! I only recently replaced it, with GLM-4 32B.
umataro@reddit
And is it an improvement? Which languages do you use it for? What sort of problems?
sammcj@reddit (OP)
Very much so, it's like a slightly better version of QwQ but without the reasoning/thinking overhead.
ForsookComparison@reddit
It's good at one shots but very poor at editing and instruction following. The Qwen family crushes it in editors despite losing in one-shot scenarios.
SidneyFong@reddit
I'll second this sentiment. I have a bunch of non-coding tests I run on LLMs and GLM-4-32B doesn't really do particularly well. The one thing I haven't tried with it (and that people seem to be excited about) is their one-shot code generation... but honestly I personally don't have much use for one-shotting code (so I can't really comment on whether it's actually good on that front).
In short, GLM-4-32B seems rather meh to me, and I don't understand why people swear by it.
ForsookComparison@reddit
Agreed. Its one-shot abilities are amazing for a 32B: it can trade blows with QwQ without reasoning tokens, but after that first shot it all falls apart into mediocrity.
CheatCodesOfLife@reddit
Isn't Qwen kind of crap at this though (despite being the third-best local model / the best easy-to-run local model)?
https://aider.chat/docs/leaderboards/
ForsookComparison@reddit
You can build and edit/iterate with Qwen. Yes, it will eventually reach a point where the complexity is too much for it to edit competently, but Qwen Coder works very well with tools like aider and Roo for quite a while.
CheatCodesOfLife@reddit
I'll have to give them a try with roo.
When I was using it, I pretty much got used to starting a new context after about 16k tokens.
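Roughly the pattern I mean, as a sketch (the Ollama endpoint, the model tag, and the ~4-chars-per-token estimate are all assumptions, not exact figures):

    # Sketch of the "start a new context after ~16k tokens" habit.
    # Assumes an OpenAI-compatible local server (e.g. Ollama's /v1 endpoint)
    # serving a Qwen 2.5 Coder quant; the model tag and the 4-chars-per-token
    # heuristic are assumptions, not exact measurements.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
    MODEL = "qwen2.5-coder:32b"  # whatever tag you pulled locally
    TOKEN_BUDGET = 16_000

    history: list[dict] = []

    def rough_tokens(messages: list[dict]) -> int:
        # Crude heuristic: roughly 4 characters per token.
        return sum(len(m["content"]) for m in messages) // 4

    def ask(prompt: str) -> str:
        global history
        if rough_tokens(history) > TOKEN_BUDGET:
            history = []  # fresh context instead of letting quality degrade
        history.append({"role": "user", "content": prompt})
        reply = client.chat.completions.create(model=MODEL, messages=history)
        answer = reply.choices[0].message.content
        history.append({"role": "assistant", "content": answer})
        return answer

Aider does something smarter with its repo map, but the basic idea is the same.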
umataro@reddit
Thank you, I'll give it a try. My use case is mostly DevOps, so bash, terraform, ansible and python.
I kind of gave up on looking for new models after qwen2.5-coder:32b and deepseek-r1:32b because these were finally good enough and there are just too many coming out every week.
CheatCodesOfLife@reddit
GLM-4 32b is good with this.
AdventurousSwim1312@reddit
My guess is that they intended to release earlier this month with SOTA results, but after Gemini 2.5 dropped they delayed a bit to tune on Gemini output.
Gonna be great.
Ylsid@reddit
The words are spoken! Ready the quant engines!
SkyFeistyLlama8@reddit
QAT. Please do QAT versions of these so Qwen can compete against Gemma-3.
dampflokfreund@reddit
That would be nice. But Qwen 3 won't have vision capabilities, so in my opinion Gemma 3 will still be ahead.
Better_Story727@reddit
I think Qwen3 may run into problems because its goal is to maximize community influence. Parameter configurations like 15B-A2B have the potential to maximize community reach, but they can't deliver leading performance. I suspect they may hit difficulties similar to Llama 4's in balancing performance against parameter count.
nullmove@reddit
Qwen has never been about absolute leading performance, they play a different game as you noted. They have always had trouble scaling up (their 100B tier enterprise max models have barely ever been any better than 72B open weight ones).
For that matter, Llama 3 was never leading in the overall sense either. Zuck saying Llama 4 would be SOTA was interesting, but quite simply a lot has happened since then.
tengo_harambe@reddit
The only confirmed Qwen3 model sizes so far are 7B and a 15B MoE. I think the worry is not in scaling up but down, especially with an MoE of that size, which has been unheard of before. The Qwen team admittedly loves its performance metrics, so I wonder if they and others are experiencing performance anxiety after seeing Llama 4's reception.
stoppableDissolution@reddit
There's a 3B MoE from IBM, so it's not that unheard of.
Evening_Ad6637@reddit
Yes, it's absolutely not unheard of. There's also OLMoE at 7B, and DeepSeek Coder Lite with a 16B MoE.
faldore@reddit
Qwen3 is frankly too small to be exciting. As much as I loved Qwen2.5.
Mushoz@reddit
Qwen2.5's initial commits also mentioned only one size, but it ended up shipping in many different sizes. I reckon Qwen3 will be the same; only one size of each architecture (dense + MoE) has been posted about so far.
sammcj@reddit (OP)
It hasn't been released yet. Other than the preliminary commits to transformers and llama.cpp last month, I don't think we have official sizes, do we?
reabiter@reddit
I've heard rumors that they're planning to release them ahead of May Day.
sammcj@reddit (OP)
Where did you hear that from? People say a lot of things.
xignaceh@reddit
Yesterday I saw a commit for AutoAWQ that adds Qwen 3 support, even though AutoAWQ was deprecated earlier this week.
JLeonsarmiento@reddit
…and that's totally fine. They still work perfectly today.
AsDaylight_Dies@reddit
And hopefully good 12B and 14B models for us broke souls.
Specter_Origin@reddit
Qwen3 is on the way; their team has already opened a PR:
https://github.com/ggml-org/llama.cpp/pull/12828#issuecomment-2789244344
Patience my friend!
MDT-49@reddit
I hope they're using a pressure cooker by now, because I'm starving!
sunomonodekani@reddit
So true. Good times.