Stop QwenLlama! Every fourth post in this sub over the past month has been about Qwen models

Posted by prselzh@reddit | LocalLLaMA | 37 comments

Disclaimer: I use Qwen models on a day-to-day basis.

You could take this as a rant, or as my concern about innovation in other models. If everyone here just keeps talking about Qwen models, what about other models? I’m just getting tired of seeing Qwen 3.5, 3.6, 3.7 in the sub. It looks like the Qwen team is just enjoying the free PR visibility here; they are trying to keep the hype train going with a new version every other week.

I request everyone to start talking about, and trying, other models as well, and not just keep praising how good Qwen is! We can all agree that everybody is actually using it because the model size is small and the benchmarks are good, and so it has come to the point where Qwen is simply considered good.

If the moderators see this, kindly take a look. It’s starting to feel like Qwen llama rather than local llama.

[-]

DeltaSqueezer@reddit

Oh. I was just wishing for another Qwen post. Thanks for starting one :P

Gemma is also interesting, but the KV cache cost was way too much. I might look again when TurboQuant is more mature.

[-]

prselzh@reddit (OP)

There will always be ONE PLAYER who dominates; the only question is who it is at the moment. Remember, we had the GPT-3.5 moment, the Llama 3 moment, the DeepSeek moment. Right now it’s simply the Qwen moment.

[-]

Awwtifishal@reddit

this subreddit has been praising qwen for over a year

[-]

prselzh@reddit (OP)

Recently this praise has hit a threshold. I am fed up with even opening those posts… :D

[-]

FinalCap2680@reddit

I would be happy to talk about other models, but... it is hard to talk about something else. When you put a 27B model against a 120B model ( https://www.youtube.com/watch?v=H-GtrbcDqYQ ) or even something bigger ( https://www.youtube.com/watch?v=iAIlTC4m8Fw ) and it performs close, that is something...

These are not my videos, and in my own test the gap between Nemotron and Qwen was bigger (and not in Nemotron's favor).

[-]

prselzh@reddit (OP)

Salute to the effort, and nice comparison btw. Tbh, at least you tried comparing another model and gave it a chance to win. I am just disappointed to see a lot of posts about Qwen made just to boast. And those kinds of posts keep repeating in a loop.

[-]

Rimera_5@reddit

Lowkey agree tbh. Feels like every other post is “new Qwen drop” or “Llama benchmark” now. I miss when this sub had more weird experiments, tooling hacks, local setups, and genuinely cursed builds instead of nonstop model leaderboard discourse lol.

[-]

ttkciar@reddit

Yeah, I miss that too. It's gotten to the point where I wonder if anyone would even want to see me post about my data augmentation projects. It seems like topics which used to be this sub's bread and butter are now reviled.

[-]

kevin_1994@reddit

Because this sub got sloppified and it's hard to tell if projects are genuine anymore.

Also, unfortunately, this sub is now mostly contrarians who download opencode with kimi k2.6 on openrouter and tell all their colleagues about how smart they are for "going local". Unfortunately the percentage of people actually running models on their own hardware is super low now. Remember a couple years ago when like half the posts were people's jank-ass builds?

Sadly this is what LLMs have done to the internet, and, unless we are willing to give up whatever privacy we think we have and use some sort of platform where we prove we are human through biometrics or something, this isn't going to change.

[-]

ExoticYesterday8282@reddit

Qwen is better suited for local deployment and experimentation.

[-]

can999999999@reddit

Well, I mean, it's in a league of its own currently. Whether that's a good thing or a bad thing is up for debate, but it doesn't change the fact that it's the only relevant model for coding on 16-32GB VRAM systems.

[-]

ttkciar@reddit

The only one? What's wrong with Gemma4?

[-]

can999999999@reddit

As the other guy already said, Gemma 4 is fine, Qwen 3.6 is better.

[-]

ea_man@reddit

KV cache management, as in sliding windows vs delta net. Still, we get your point: that one is good too.

...but not as good. :P

[-]

Lesser-than@reddit

It is what it is: it's the best-performing model at this moment in time for local usage, and it checks all the agent usage boxes at the same time. It's not up to localllama to turn the Qwen noise down; it's for other labs to step up their game.

[-]

kzoltan@reddit

Ok, but let me show you the benchmarks I made on my mama’s toaster with qwen 3.6 35b q2 first…

Just kidding, keep them coming. I also find them annoying sometimes, but it’s great that so many people have access to a decent quality model with accessible hw. Who am I to tell them not to write about it?

If you want to see other types of stuff, look somewhere else; there are great communities on Discord and Github, for example. The open Internet is dying (at least the experience from just a couple of years ago isn’t there anymore IMO), so start looking somewhere else.

There will be a new qwen 3.6 soon anyway, and people will do the same with that.

[-]

Fedor_Doc@reddit

People discuss the local models that work well in their setups. It helps the local LLM community, as newcomers can quickly see which model is good.

General mod rules are enough, I think. Low effort posts are deleted. Personal experience posts with details are kept. There are optimization and quant suggestions in the comments. It is how it should be.

[-]

prselzh@reddit (OP)

I have to agree with you on the rules, but somebody has to call this out loud, as this is what’s happening in the sub with the Qwen series… Everybody is showing off the same thing: what they developed using Qwen. At least this is my real human effort to stop it, and not some AI slop or vibe-coded app of the kind almost everybody keeps publishing. Reddit is supposed to be a discussion platform anyway.

[-]

alexkey@reddit

Post your experience with another model? Honestly, I would love to find anything to compare Qwen-coder with. So far there’s not much I can find that would run on my system.

[-]

alexkey@reddit

I would love to see more model information, especially for models that I can run locally on my hardware. I think everyone talks about Qwen because that’s what is very easily available in a size that can be used in self-hosted setups.

[-]

Ok-Measurement-1575@reddit

I actually tried gpt120 again the other day and it's hilariously bad against 36b for some things. Like so bad, it made me question if it had somehow been dumbed down.

[-]

Pristine-Woodpecker@reddit

If some other provider ships an open-weights model that runs on 16GB to 24GB class GPUs with better performance than Qwen3.6, we would all be talking about that.

So the way to fix this is pretty clear for Qwen's competitors, no?

[-]

prselzh@reddit (OP)

Of course, I agree competitors need to come up with an answer to this. Not me! But I come to this sub to see what new models are available and what ppl have tried out, you know. I’m getting fed up seeing the same models for the past several days now. I have personally tried Qwen and agree it’s good. No need for so many posts on the same thing. Ppl would be better off using Reddit AI search to find which model suits their needs.

[-]

ttkciar@reddit

> If the moderator see this, kindly help to take a look at this..It’s starting to feel like Qwen llama, rather than local llama

I know what you mean. If it were just Qwen astroturfing, that would be one thing, but it's more than that. A lot of folks in this group also genuinely adore Qwen -- the models, the team, the whole franchise.

We'll do what we can about the astroturfing, when we can, but I don't think there's much to be done about the fanboyism.

Like someone else has suggested, your best recourse is to post the kind of content you'd like to see in the sub.

[-]

kivaougu@reddit

Tried mimo v2.5 which is 316GB in fp8. Sometimes it would cut its own answer in the middle and other times it would give the answer inside the thinking block. It would also spawn a million research agents for the simplest possible task (e.g. look at this one file: "Okay let me spawn 5 subagents").

Deepseek v4 flash is fine, but if you want the best accuracy it can think for over 100k tokens. Minimax m2.7 is fine but has a 200k max context.

So for me there really isn't anything worth talking about smaller than minimax. Gemma exists but the attention mechanism really kills it for my purposes even if I would agree it's better than qwen at human interaction.

[-]

llama-impersonator@reddit

the qwenning will continue until morale improves

[-]

1nicerBoye@reddit

Be the change you want to see in the world. Post something worthwhile about some local stuff not involving qwen then.

[-]

prselzh@reddit (OP)

I agree about the Qwen model size, and those are the popular posts, but literally everyone doesn’t have to keep praising the same thing…

[-]

1nicerBoye@reddit

That has nothing to do with my comment?

[-]

prselzh@reddit (OP)

Oops, sorry, I meant to reply to another comment below.

[-]

Plus it’s a never-ending market of fine-tunes and distills.

It’s a very good, versatile model.

What’s the problem with it?

[-]