Larger Gemma-4/Qwen3.6
Posted by Non-Technical@reddit | LocalLLaMA | View on Reddit | 38 comments
Qwen3.5-122B-A10B at Q6_K is really good even though it takes an hour to load on my machine.
Do you think we will see a larger MoE Gemma-4 or Qwen3.6 at some point?
NNN_Throwaway2@reddit
Going by the 3.6 27b blog post, a 122b seems unlikely at this point. The 397b is already confirmed as not coming.
DeltaSqueezer@reddit
Where is this confirmed? Why is a 122b not likely?
NNN_Throwaway2@reddit
Does no one actually read anything? Is it really all just hype and vibes around here?
The official blog post that was published with the release of Qwen3.6 27b made it pretty clear that there would be no more models released in the 3.6 family.
No_Algae1753@reddit
Please give us the part of the blog where this is stated.
NNN_Throwaway2@reddit
The summary.
No_Algae1753@reddit
??? Can you provide at least a link ???
NNN_Throwaway2@reddit
??? Google it ???
No_Algae1753@reddit
You are the one making claims here.
NNN_Throwaway2@reddit
Yeah? That doesn't mean I'm obligated to spoon feed you information you can read yourself on the public internet.
No_Algae1753@reddit
Maybe we are not able to find the part of the blog which supports your thesis? I've read through it, and nothing in this blog post tells me that Qwen 3.6 122b is UNLIKELY or that there are no newer 3.6 models coming out. So please, in the future, just give us the part of the blog which supports your statement. You are wasting our time.
NNN_Throwaway2@reddit
I told you. The summary at the end of the post.
No_Algae1753@reddit
MAYBE WE ARE NOT ABLE TO FIND THE PART OF THE BLOG WHICH SUPPORTS YOUR THESIS? Good God, just give us at least the part of the blog. I read through it and was not able to find the part which supports your thesis: https://qwen.ai/blog?id=qwen3.6-27b
DeltaSqueezer@reddit
That's why I'm asking: their blog post said nothing about there being no more model releases in the 3.6 family.
NNN_Throwaway2@reddit
In other words, they view the 3.6 family as "comprehensive," which essentially means complete. "Range" also implies an even distribution without gaps that need to be filled. "Now offers" implies that, prior to the introduction of the 27b, these qualifiers weren't yet satisfied, but now are.
Compare this with what they said in the 35b blog:
A very unambiguous statement of intent to release more 3.6 models.
Again, re-stating that there will be more 3.6 models.
I suppose you could argue that the 27b blog post doesn't explicitly rule out more 3.6 model releases, but the shift in language is absolutely there.
If they were planning to release more 3.6 models, you'd think they would say so. Instead, their phrasing very much implies the opposite. They went from stating twice that they will expand the range of 3.6 models to not saying it once.
No_Algae1753@reddit
I just read that comment; at this point you are counter-arguing yourself. You are completely wrong on your take.
NNN_Throwaway2@reddit
Where am I counter-arguing myself?
Pristine-Woodpecker@reddit
Where was that confirmed?
NNN_Throwaway2@reddit
3.6 Plus is already out with no open weights.
Pristine-Woodpecker@reddit
I'm confused how this confirms the weights won't be opened.
stddealer@reddit
The larger Gemma is Gemini, and you probably won't get it outside of Google's API.
Blues520@reddit
Would a 122b moe be better than a 27b dense for coding?
Voxandr@reddit
3.5 122b apex quant is better for me
Blues520@reddit
Better than 3.6 27b?
Voxandr@reddit
Yes always
ttkciar@reddit
I think a Qwen3.6-122B-A10B release is likely, and am a bit surprised they haven't released it already.
Google teased us with a 120B during their beta-testing, but I don't know that we will ever see it released.
In my spare moments I've been doodling "on paper" about making a hybrid dense/sparse Gemma4 out of Gemma-4-31B-it, via the same techniques AllenAI used for FlexOlmo, but only for Gemma4-31B's middle blocks (per RYS theory), and with full router training post-merge (since FlexOlmo's sharded router training was very poor). I lack the compute resources to actually make a big one, but might be able to manage a proof of concept with a trivial number of experts (like, four).
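To make the idea concrete, here's a toy sketch of the upcycle-plus-router-training part only; it is not FlexOlmo's or AllenAI's actual code, and the module names, dimensions, and expert count are all made up for illustration. The shape of it: clone one dense FFN block into a few frozen experts, bolt a top-k router on top, and leave only the router trainable post-merge.

```python
# Toy sketch: upcycle a dense FFN into a small MoE layer with a trainable router.
# All names and sizes are illustrative, not taken from Gemma or FlexOlmo.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseFFN(nn.Module):
    def __init__(self, d_model=256, d_ff=1024):
        super().__init__()
        self.up = nn.Linear(d_model, d_ff)
        self.down = nn.Linear(d_ff, d_model)

    def forward(self, x):
        return self.down(F.gelu(self.up(x)))

class UpcycledMoE(nn.Module):
    """Replace one dense FFN with num_experts frozen copies plus a trainable router."""
    def __init__(self, dense_ffn, num_experts=4, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList([copy.deepcopy(dense_ffn) for _ in range(num_experts)])
        d_model = dense_ffn.up.in_features
        self.router = nn.Linear(d_model, num_experts, bias=False)
        self.top_k = top_k
        # Post-merge, only the router gets trained; expert weights stay frozen.
        for p in self.experts.parameters():
            p.requires_grad_(False)

    def forward(self, x):                         # x: (tokens, d_model)
        logits = self.router(x)                   # (tokens, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)         # normalize over the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e          # tokens routed to expert e in this slot
                if mask.any():
                    w = weights[mask, slot].unsqueeze(-1)
                    out[mask] += w * expert(x[mask])
        return out

# Toy usage; in the real thing only the middle blocks' FFNs would be swapped.
ffn = DenseFFN()
moe = UpcycledMoE(ffn, num_experts=4, top_k=2)
tokens = torch.randn(8, 256)
print(moe(tokens).shape)  # torch.Size([8, 256])
```

The point of freezing the experts is that the expensive part (expert weights) is inherited from the dense checkpoint, so the post-merge training budget only has to cover the router.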
_derpiii_@reddit
> I think a Qwen3.6-122B-A10B release is likely, and am a bit surprised they haven't released it already.
Why surprised? These things take work, you know.
No-Refrigerator-1672@reddit
Previously, the Qwen team released everything they had within a week, or even on the same day. Now, with the new leadership, it seems like they've taken a PR-centric approach and spread the hype out as thinly as possible. That is, if they release it at all, which is not a given, e.g. 3.5 Omni remains closed source.
_derpiii_@reddit
Fair enough :)
My interpretation though: They released models as they became available, vs waiting for all of them to be available.
NNN_Throwaway2@reddit
Qwen3.6 122B seems unlikely if we go by the blog post accompanying the 27b. They're obviously moving in that direction at least, since they transitioned to not releasing the weights for the 397b.
ForsookComparison@reddit
I don't want to jinx it, but I have this weird feeling that we're not going to see larger Qwens in open-weight again. I think 3.5-397B was a one-time thing.
FuckSides@reddit
I'm still optimistic, since they included the 122B in the poll they ran earlier about which models people are anticipating them to release. They've released 2 of the 4 options so far, so I expect we'll see the 9B and 122B in the future, but I'm less sure about other sizes.
FinalCap2680@reddit
Same here, and I hope the 122B will be the next one.
With other labs releasing huge models, I hope they may be forced to release something even bigger...
Evanisnotmyname@reddit
They’re just getting too good? Think people will quit using public models en masse?
billy_booboo@reddit
Idk, the game theory is ruthless and these companies are acting simultaneously as profit-seeking corporations, speculative investment vehicles, and representations of nationalistic power. It's absurd, but I think this sick and twisted environment is to our benefit, for now. The conflicted incentive both to make ends meet by building products and to destroy the competition's moats (there are no moats) by releasing open models seems to keep all of these guys coming back around again and again. I thought Google was out, but lo and behold, there they went again.
The paid inference business is consistently on thin ice, so they keep needing to come back for another drink of government/VC money, and the way they do that is by showing they're still hot: releasing open models.
It's a race to the bottom.
GCoderDCoder@reddit
To be clear, Qwen 3.6 35b has been better than Qwen 3.5 122b in my experience, which is consistent with benchmarks. Test it on what you actually use it for, because you can run a higher quant of the 35b for more accurate coding, if coding is your use case.
I got trained to aim for 120b models for my hardware, but the last couple of months have given us some intense smaller models that match much larger sparse models.
sloth_cowboy@reddit
Just following and bumping the topic
billy_booboo@reddit
Yeah, I think Qwen3.6 122B would be an extreme sweet spot for me in terms of not relying on Claude as much.