Anyone else find it weird how all the Chinese labs started delaying open-source model releases at the same time?
Posted by True_Requirement_891@reddit | LocalLLaMA | 127 comments
Minimax-M2.7, GLM-5.1/5-Turbo/5V-Turbo, Qwen3.6, MiMo-V2-Pro: none of them are open-sourcing their latest models, and they are all making the same promise that they're improving the models and will release them soon...
It's fine, but the pattern of all of them deciding the same thing at the same time and making the exact same promises is very weird. It's almost like they all came together and decided to do this as a group. This does not feel organic...
I can't help but feel something is off... could it be that they are slowly trying to transition to keeping their future models closed? It's 2-3 weeks or a month now, but with the next model it's gonna be 3 months, then 6, and then nothing.
tengo_harambe@reddit
/r/localllama when it's been more than 87 seconds since the last Chinese open weights release
DistanceSolar1449@reddit
Some people can pattern match, others can’t.
Jayfree138@reddit
Alibaba said they'd open up Qwen 3.6 soon. Right now it's free on OpenRouter anyway. It seems like every week there's a new model I need to dig into.
Technical-Earth-3254@reddit
They're catching up to SOTA; be happy that we even get OSS models. That stuff ain't cheap. But there will be new studios coming that want their market share, and they will go for early OSS releases, and so on.
johnfkngzoidberg@reddit
They’re not just giving it away because they’re nice, they want engagement and hype. Cheap, expensive, it doesn’t matter. 20 years ago people would create OSS projects out of passion and most apps you use today are based on those projects. People have been brainwashed to think that money is the only way things happen.
ormandj@reddit
The barrier to entry on a SOTA model is millions upon millions of dollars. This isn't like OSS where a hobbyist could write code for the price of their time + a computer. Until training data and training itself is orders of magnitude less expensive, this is 100% a game tied to money (or influence/destabilization of economies funded by governments/organizations).
Tai9ch@reddit
No, it's like OSS where a billion dollar company uses millions of dollars of resources to produce complex computer artifacts.
niccolus@reddit
But a business has to be profitable to exist. What part of giving away something for free is profitable? If a business isn't profitable, it isn't sustainable.
Tai9ch@reddit
Turns out that the majority of large open source projects are funded by for profit businesses. They've got working business models, in spite of your incredulity.
StupidScaredSquirrel@reddit
Yes, because that way they get a neat tool with free pull requests coming their way. I can see that continuing to happen for something like inference engines, but just releasing model weights doesn't generate extra free dev time coming back to you, so the only reason to do it is to advertise your other products, like cloud compute or closed models.
jld1532@reddit
I actually think they're using the same playbook used to capture steel production. Subsidize via the state to collapse competition, and I think they have a very real chance to be successful.
popporn@reddit
Except China never raised the price of steel after dominating the industry. Others simply can't match China's efficiency. And before someone chimes in with "slave labor": India is a major steel producer and its wages are much lower than China's.
arcanemachined@reddit
This is the only strategy that really makes any sense IMO.
stoppableDissolution@reddit
It is literally impossible to train a decent LLM as a passion project unless you are a billionaire.
Orolol@reddit
Those OSS projects don't cost tens of millions in compute. That compute can't be paid for in passion.
Due-Memory-6957@reddit
Engagement and hype are worthless; what they want, and get, out of opening their models is people developing the ecosystem around them.
sly0bvio@reddit
My crazy take? ALL COSTS were already PAID. Love pays all, but where is the love in what they do?
I personally will always disagree that we should be happy with what is, simply because “It costs”, because there is always a way to free up costs, to unburden the burden.
It is not the gift from some magnanimous corporate entity. It is a calculated act that distracts from true costs under the guise of “free”. It is a trap 🪤 even unto themselves.
Noitswrong@reddit
Seriously, if I'm reading this, I want whatever you're on.
sly0bvio@reddit
We all want love, that’s obvious.
I mean… all of us except the bodiless entities with no individual free will of their own known as “Organizations”… 🙄 I feel like religions often have another name for something like that…
hyperspacewoo@reddit
Brother, lay off the acid for a bit. Especially when you're talking about the world's oligarchs' companies and invoking the love-doctrine comparison.
sly0bvio@reddit
Nah, I’ll dive deeper, they always do 🙌
The company operates entirely off of costs and benefits. It is black and white to the company, this or that.
The individual operates almost entirely off of the qualities of it. The view of the individual is directly contrary to the view of the company.
That is the dynamic between U/S - User & System
This same dynamic shows up in US when we use an economic system, or a political system, or any worldly system. 🌎
Systems are not designed with U in mind. U are always left out (see the name? ☝️ sly0bvio leaves u out). Even take something as basic as our BIOLOGICAL system. Your body leaves others outside of knowing what’s really going on internally. In the same sense, the system of government leaves out its individuals it attempts to serve, and a corporation leaves out the individual it attempts to trade with. This is written into the nature of reality itself, not from me or “acid”.
When U see it, U do.
PunnyPandora@reddit
https://i.redd.it/zm2qbip8jetg1.gif
leonbollerup@reddit
mushrooms or something stronger ?
power97992@reddit
They won't catch up to Mythos for at least 6-9 months. Even GLM-5 is not better than Opus 4.5, which came out 5 months ago.
evia89@reddit
It's worse than Opus for coding, but kinda good enough? You need to spend more time brainstorming. Also, the cheap GLM subs serve a quantized model, and it breaks after 100k context.
If you can work around all that, it works fine.
zenmagnets@reddit
If you're in first place, you stay proprietary. If you're in second place, you go open source to gain adoption and infrastructure control. Problem is, a bunch of these open-weights labs are starting to catch up to SOTA.
ea_man@reddit
I guess the problem is that when those big SOTA labs start reining things in to monetize, the cheap / good-enough open models will be there to rain on their parade.
Finanzamt_Endgegner@reddit
Yep, this. Any small team could produce a breakthrough that brings us better OSS models at any time.
BuildDeus@reddit
Centralized economic planning
-p-e-w-@reddit
China doesn't have that, or their GDP would be the same as that of the Congo.
Even the Soviet Union didn't have central planning for some of their most critical industries. Look up the history of the military design bureaus for more details about this.
GnistAI@reddit
They don't have central planning, but they most certainly can issue central edicts.
sexy_silver_grandpa@reddit
They literally have central planning.
They are called "5 year plans": https://en.wikipedia.org/wiki/Five-year_plans_of_China
GnistAI@reddit
When I say "centrally planned" I'm talking Great Leap Forward level planning, quotas for individual products and factories. After Deng Xiaoping's reforms starting in 1978, those five year plans gradually became more like strategic direction-setting. China itself acknowledged this in 2006 by officially renaming them from "plans" to "guidelines" in Chinese. Today they read more like "invest in renewables and chip self-sufficiency" than "Factory #47 produces 10,000 right work boots."
That said, "central edicts" still absolutely apply. The CCP can and does tell entire industries to change course overnight, which might be what's happening with the model releases.
sexy_silver_grandpa@reddit
"when I say central planning, I don't mean the planning done by the Central Committee".
Lol ok?
GnistAI@reddit
As with everything this is a continuum, from free markets to mixed economies to planned economies. When you say "centrally planned", to me that means you are literally trying to compute supply and demand, then direct industry to fulfill quotas to that exact specification. If you have another definition, that is fine, and utterly uninteresting.
sexy_silver_grandpa@reddit
The 5-year plans literally have a huge impact on both supply and demand, mostly through forced adjustments to lending rates and production targets and quotas...
PunnyPandora@reddit
https://i.redd.it/tav8cswdjetg1.gif
sexy_silver_grandpa@reddit
China absolutely, without a doubt at all, does central economic planning. They literally have 5 year economic plans that shape the economy from the highest level.
https://en.wikipedia.org/wiki/Five-year_plans_of_China
SquareKaleidoscope49@reddit
Comparing the mathematical field of optimization during the Soviet era to now is certainly a take.
Central planning already runs the biggest companies in the world, like Walmart. And every single country in the world, including the United States, has some sort of central planning that leverages techniques the USSR could only dream about. China just does it much more extensively.
This whole idea of a chaotic economy being the vehicle of success is one of the best examples of how incredibly poisoned the American educational system is.
combrade@reddit
I mean, that's like asking why we had so many electric cars under Joe Biden with the Inflation Reduction Act. The US EV share went from 3% to 10% between 2021 and 2024.
rorowhat@reddit
Lol wth
Iory1998@reddit
Well, it's unfortunate but understandable. The world's economy is changing. Uncertainty about energy is increasing, so AI labs should be wary, since their main cost is energy.
Additionally, the fact that AI models were used in the current USA/Israel war to kill human targets may push many labs to reevaluate their approach to open source. I can't blame them if they close-source their models.
Competitive_Ad_2192@reddit
Well, wait a little; they're waiting for permission from the Communist Party.
ZealousidealShoe7998@reddit
Either that, or they had breakthroughs. It takes time to mature a new architecture or breakthrough and test it against older, tried-and-true methods. Honestly, they don't need to be releasing models every few months, so if the next model they release lands at the end of the year but leaves the current models behind, I'll be totally fine with it.
MerePotato@reddit
Once Chinese labs catch up to the frontier the CCP is gonna pass legislation blocking open weight releases, mark my words
root_klaus@reddit
I work in corporate, and IMO this is a concerning pattern. The goal is always profit, no way around it, and in that regard many companies use open source as a 'get popular fast' approach. Then they slowly move from 'we have free stuff you can do whatever you want with' to 'we have the best of this stuff, only here, behind a paywall'. This keeps going until open source no longer makes sense for them, because they're getting viable profits from the closed-source offerings. Most managers tend to focus on profits and simply make marginal calls from those numbers. This is what I see daily in my job.
Hope this is not the case; these models provide extraordinary value to the community, and because of them we have seen big advances in the last two years.
tobias_681@reddit
The payment isn't so much the issue as having control over the data. The most compelling selling point of the OSS models right now is that you can run them where and how you want. E.g., European inference providers can run Chinese OSS models in a GDPR-compliant way in Europe. There isn't even a closed-source alternative to that.
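To make that concrete, here's a minimal sketch of self-hosting with the Hugging Face transformers pipeline (the model name is just a small placeholder; swap in whatever open-weights release you actually run). The point is that after the one-time weight download, prompts and outputs never leave your own hardware:

```python
from transformers import pipeline

# Local-inference sketch: weights are fetched once, then everything
# runs on your own machine, which is the data-control/GDPR angle.
# "Qwen/Qwen2.5-0.5B-Instruct" is only a small placeholder model.
pipe = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

out = pipe("Explain in one sentence why data residency matters:", max_new_tokens=64)
print(out[0]["generated_text"])
```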
root_klaus@reddit
Yeah, that too; eventually control of the data will bring profit or an alternative income stream. But this is a pattern we see a lot in OSS projects: even a lot of Python packages promising agentic and AI capabilities ship a paid wrapper and a pricing section. I believe it's mostly to hop on the AI train as much as they can.
Such_Web9894@reddit
Implementing Engram
UnionCounty22@reddit
As if their government might have a say in what they do
TopTippityTop@reddit
They may be starting to get good enough to compete with closed-source models. Not on the high end, but certainly on pricing for the majority of conventional use cases.
Lissanro@reddit
I think it was mentioned that GLM-5.1 is still being worked on; in other words, there were no weights ready for release yet. I think they promised a release on April 6th or 7th. Maybe the others are doing the same: running closed-weight beta testing before the final open-weight release. In any case, a few weeks' delay is not an issue.
I think open-weight releases from the top labs will continue for a while. At the very least we should get Minimax M2.7, GLM 5.1, and Qwen3.6, but it is unclear whether Qwen3.6 397B will be released or only the smaller ones this time.
Obviously, unless we are doing our own decentralized training, nothing is guaranteed. However, decentralized training projects are still very experimental, more at the proof-of-concept stage for now, which makes sense: while we have so many good open-weight releases, most people just use them and take them for granted. But hopefully decentralized training will gain traction in the future so we have more alternatives.
EanSchuessler@reddit
Crowdfunding base models seems possible.
Lissanro@reddit
Crowdfunding is likely to be less sustainable because it requires donating actual money instead of contributing compute. It also won't be decentralized, since it motivates the owner to chase money. I think the approach Covenant-72B used has more potential because it relies on permissionless collaborative training, meaning anyone can join and help if they have eligible hardware and a network connection (their technical report: https://arxiv.org/pdf/2603.08163 ). If it is developed further, or better methods are invented, the connection-speed and hardware requirements for each node may come down.
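For intuition, here's a toy sketch of the gradient-averaging idea behind collaborative training. To be clear, this is not Covenant's actual protocol (which also has to handle verification of untrusted contributors); it's just the simplest possible illustration, with synthetic data standing in for each node's local shard:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared toy task: collaboratively fit w so that x @ w ~= y.
true_w = np.array([2.0, -1.0, 0.5])
w = np.zeros(3)  # the collaboratively trained parameters

def local_gradient(w, n_samples=64):
    """One node's gradient on its private data shard (synthetic here)."""
    x = rng.normal(size=(n_samples, 3))
    y = x @ true_w + rng.normal(scale=0.1, size=n_samples)
    residual = x @ w - y
    return x.T @ residual / n_samples  # gradient of mean squared error

n_nodes, lr = 5, 0.1
for step in range(200):
    # Any node with eligible hardware can contribute a gradient;
    # the averaging step stands in for the networked exchange a
    # real permissionless system would do.
    grads = [local_gradient(w) for _ in range(n_nodes)]
    w -= lr * np.mean(grads, axis=0)

print("learned:", np.round(w, 2), "target:", true_w)
```

The hard parts a real system has to add are exactly the ones elided here: verifying gradients from nodes you don't trust, and keeping per-exchange bandwidth low enough for home connections.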
fallingdowndizzyvr@reddit
And it'll take forever. Which means improvements will also take forever since iteration will be slow.
high_funtioning_mess@reddit
This! If decentralized training works, I'm starting to think we have to take things into our own hands as a community instead of relying on these "labs". Gonna read about this.
Atomic-Avocado@reddit
Links to these decentralized training projects?
Lissanro@reddit
The most recent one I know of is Covenant-72B and before it, there was also INTELLECT-1-Instruct 10B which was smaller and less decentralized, so Covenant 72B was a good step forward. Obviously both are more like proof-of-concept than production-ready LLMs, but they show it is possible to train in a decentralized way; a lot of further improvements could be made.
Monkey_1505@reddit
Thanks for the tip. It beat out Llama-2, which is a decent start.
inertially003@reddit
It is weird that OpenAI has yet to release any gpt-oss update.
TheReaperJay_@reddit
They got caught distilling the big boys and need to hand-scrub out "I am Claude Opus 4.6" from the training responses.
Torodaddy@reddit
It's well known the government tells companies how to behave. They're probably watching the economics of the AI industry in the US, where things are a lot less rosy than they were before. AI in the US is clamping down on loss leaders and subsidized user adoption, which means more people are going to look at Chinese models as things get a lot more expensive.
combrade@reddit
America should have an open-source LLM industrial policy similar to what Joe Biden did with EVs. We should have subsidies for chips and GPUs for companies that release their LLMs open source.
Industrial planning is not a dirty word; it's how we beat the Nazis in WW2.
ea_man@reddit
The plan is more coal, oil, wars, tariffs.
And it's not even a plan, more like spinning wheel of fortune.
Murgatroyd314@reddit
The plan is whatever one old man feels like doing today.
tobias_681@reddit
At the current rate I wouldn't root for the USA to beat anyone at anything...
dsartori@reddit
lol try having a functioning constitutional government first bud.
Enthu-Cutlet-1337@reddit
Shipping lag usually means eval debt, not conspiracy; aligning a 72B release can mean weeks of red-teaming and quant rebuilds.
Bobylein@reddit
Aligning, aka restrictions that mostly get removed anyway once it's released? (At least for the ones people actually host themselves.)
sumguysr@reddit
It's probably not a coincidence that China stopped exporting AI to the US when one of our proxy wars turned hot.
tobias_681@reddit
OSS is not an export. You can use it all over the world, and in many ways it was more enticing in the EU than the USA because of stronger data-protection requirements that US firms cannot legally guarantee.
sumguysr@reddit
If it's made in the country and leaving the country it's an export.
ashleigh_dashie@reddit
The oil shock from the Israel/Trump attack on Iran is just now hitting Asia. The world economy is in for a bad time.
custodiam99@reddit
I think Gemma and Nemotron are better than the Chinese labs thought they would be.
inkberk@reddit
Because US companies violate the models' terms of use, presenting them as their own without any credit. An outcome like this was to be expected.
justserg@reddit
timing could be coincidence, but it does feel like everyone's waiting for the other shoe to drop before showing their hand.
RedZero76@reddit
Yeah, this is what I think too. I'd imagine by now everyone has realized how much it sucks to release a model only to have another model steal your thunder a day or two later; intentional or not, that still probably sucks. So I bet they're all either waiting each other out or, with the recent Gemma/Qwen drops, rethinking releasing what they have without some more improvement, because those models have raised the bar.
xrvz@reddit
Only someone who never created software professionally would think it's weird that stuff gets delayed.
Substantial-Ebb-584@reddit
It's simple: they're earning money this way. They'll release in some time, once the hype wave rolls over. And they're preventing companies like Anthropic from using parts of their models from day 1 to train their own always-closed models.
mindwip@reddit
Chinese new year...
x11iyu@reddit
any evidence other than feels?
Pink_da_Web@reddit
He said "It's almost like," which is a question he posed, not a statement of truth.
x11iyu@reddit
I wouldn't call "it's almost like" a question, more a strong suggestion by the OP that they did all come together and decide to do it at once
honestly I would love some more insights into it though
Hoppss@reddit
It was a hunch, meaning OP doesn't have all the insights. Sometimes it's as simple as that.
-dysangel-@reddit
The whole point with these things is that there will likely be no evidence, or at least not for months or years. Feels are often just based on very obvious things which eventually get proven out, like a lot of the obvious misinformation, gaslighting and control governments pushed during covid.
853350@reddit
the economic system of china?
Confusion_Senior@reddit
China is a country where the economy is coordinated by the political system, so if some government office decides this is a national strategy, there isn't much they can do about it.
Sicarius_The_First@reddit
Tbh? I don't care how or why, but as long as we get SOTA open-source models, I am happy.
Good on them 👍🏼
Embarrassed_Adagio28@reddit
Dude, we just got Qwen 3.5 and GLM 4.7 Flash. How fucking impatient are we for FREE models?
wingsinvoid@reddit
There were many impactful papers published in the last few months, and they need to integrate those solutions. SOTA already has them; see Google's turbo-quant and DeepSeek's mHC and Engram.
blueredscreen@reddit
You're really complaining about a couple of weeks of delay for something this massive, and on top of that, completely free? No cost, no barrier, access to one of the most complex systems ever built, and somehow that still isn't enough. You are simply detached from reality.
Sliouges@reddit
The Chinese business model is to release the low-end versions of the high-end internal models, let the public test them, incorporate the changes, and repeat until they get to SOTA, then pull the ladder up.
Everyone here excited that the Chinese are #1 in OSS is... well, giving them free alpha and beta testing.
Also, if you follow the money, they are all funded by the same entities. Follow the money and you will discover for yourself why they speak with the same mouth.
ProfessionalSpend589@reddit
You're missing that dopamine hit from a new model release, eh?
Don't worry, it'll pass. I'm enjoying the new Gemma 4 models and playing with the 26B and 31B variants.
True_Requirement_891@reddit (OP)
Tbh, I'm mostly thinking about what's next, and it seems off that all these companies are doing the exact same thing at the exact same time... it almost feels like they're all actually just one company pretending to be multiple...
ProfessionalSpend589@reddit
Well, it would be stranger for investors to keep burning money to train models and then just give them away for free… with no agenda.
So, yes. I too think something will happen, but I can’t do much about it, so I’m not worrying too much.
Specialist_Golf8133@reddit
not that weird tbh, if you're a chinese lab and deepseek just got export controlled, you're gonna wait and see how hard the hammer falls before shipping your weights. coordinated? maybe. but also just everyone reading the same room at the same time. kinda curious if this just means we get a flood of releases in 3 months or if open source from china just got way more complicated
fatYogurt@reddit
They just got back from Chinese New Year break and are picking up the backlog.
dark-light92@reddit
If I remember correctly, Minimax has been doing that since 2.1; they always take a couple of weeks to release the weights. People just didn't notice, because nobody knew they release pretty good models.
suesing@reddit
They're optimizing for Chinese hardware. Huawei got DeepSeek V4.
asfbrz96@reddit
We have to thank Cursor, which basically copied Kimi K2.5.
Professional_Bat8938@reddit
Qwen 3.5 was just released. This seems like a “talking point”
Due-Memory-6957@reddit
Doomerism is the most common philosophy of our time.
Nervous_Variety5669@reddit
They have to bring in revenue at some point now that they're trying to compete at the frontier. It was pretty obvious where this was going when it became common to release >300B models to compete with OpenAI/Anthropic (by their own admission, since they compare against them in their own benchmarks).
They have no incentive to release open-weight models that satisfy the demands of most users; otherwise no one is going to pay for their APIs.
Google can release Gemma 4 because everyone can say "it's good, but it's no gemini-3.1-pro". We still have an incentive to pay for Gemini.
The Chinese labs don't have this privilege, because the difference between a dense 32B model and their >300B-param API offerings isn't enough to justify the cost.
But the difference between gpt-oss and gpt-5.4 is vast. The difference between Gemma 4 and Gemini 3.1 Pro is vast. The difference between Anthropic's ... wait, Anthropic doesn't give a sh*t about open source, as much of a darling as it is in the AI world.
You get the gist.
redballooon@reddit
There's only one reasonable explanation: they've hit AGI and think it's better for the world to keep it closed.
/s
ea_man@reddit
You're being /s, but if tomorrow someone released a new model that is good at tool use and well trained on cybersecurity, you'd get attacks on the IT infrastructure of most countries, which is usually full of holes.
EffectiveCeilingFan@reddit
Honestly, I think releasing a model as closed weights, then making it open weights a month or so later, is fine. At some point the labs need to recoup some of the training cost. Most models are only really hugely popular for about a month anyway, since something better usually comes out by then and everyone moves on to the next big thing.
I just hope we actually do get the weights 😭
murkomarko@reddit
"All Chinese labs" are actually the exact same thing, funded by the government and just operating under different names.
nuclearbananana@reddit
Add stepfun to your list, unless they drop it on Monday.
34574rd@reddit
wait stepfun said they were gonna release another model?
TKGaming_11@reddit
They released an update to StepFun 3.5 Flash with thinking control and reduced token usage, but it's API-only. StepFun did commit to open-sourcing all of their models, so it is odd it hasn't been made open-weight yet.
Hairy_Reputation7434@reddit
Perhaps they are optimizing for the transition from Nvidia chips to Huawei chips.
Organic_Challenge151@reddit
Anthropomorphic achieved AGI, thus recognized and banned all their accounts, making it impossible for them to distill, obviously.
Sweaty-Scene5621@reddit
So now it's a crime if they want to keep some of their models closed-source?
Eyelbee@reddit
Except they literally did not
a_beautiful_rhind@reddit
Kinda like how much of the free inference has dried up. The bill is due and they need a return on investment.
perfopt@reddit
I think they are transitioning to closed. I think open models are no longer beneficial for them.
PathIntelligent7082@reddit
it's far from weird if you know how china works...
sandykt@reddit
I was wondering the other day: if open-weight models reach the capability of gpt-5.4, would they still be open-sourced? Because that pretty much solves any kind of complex agentic use case.
b3081a@reddit
Minimax and z-ai had an IPO recently, and they're probably gonna try to make more profit in the near future.
oblivion098@reddit
I wish they'd flood the Western market with open-source models so more people adopt them and take that AI-bubble scam down, and the whole US financial system with it, like dominoes.
Providing free open-source models is like a tool of freedom, where the USA failed.
13baaphumain@reddit
Well, probably they have just begun announcing releases earlier than they should.
sly0bvio@reddit
Not just closing models, but also using their Trojan horses against us. That's why I use their open-source stuff to build a basic platform from which I'll be developing. Nothing important is given to them; I only feed information to AIs that I want to have shared and leaked to others. I shared project information about things not on the market, because almost nobody wants or discusses them... and now, all of a sudden, there are multiple developers who made "product demos" all centered around that unique idea, but terribly non-detailed and useless in functionality. It's like they built it without even knowing WHY they were building it. For that reason, I use AI to share loose ideas I want to see OTHERS think about and work on, then take those ideas and create my own, more formed projects that are hopefully closer to what I intend to make.
Basically, I use these greedy data-collecting companies' tactics against them, letting them take my "free" ideas just as they give you a "free" AI model. They'll quickly learn that TWO ✌️ can play that game…
nullmove@reddit
That's not "all" labs. Besides, the gap between OS model releases has always been >=2 months for all labs. What they're doing now is releasing more frequent intermediate checkpoints, presumably for marketing/hype. For example, barely a month between Minimax 2.5 and 2.7; same between GLM-5 and whatever the Turbo thing was, and between Qwen-3.5 and 3.6. It only stands to reason that such frequent checkpoints are alpha quality: that makes sense for generating hype and staying relevant, but doesn't pass muster for an actual weight release on HF, hence the need for more time.
Septerium@reddit
Perhaps now they're able to release closed-weight models and still get the public's attention.
misha1350@reddit
They want money; the bubble is popping.