Why did Alibaba set a high price for its coding plan while releasing powerful open-source models?
Posted by Historical-Crazy1831@reddit | LocalLLaMA | View on Reddit | 15 comments
It seems to me that qwen3.5 27b and 122ba10b are not too far behind the 397ba17b, at least according to the benchmarks. The Alibaba coding plan sells the 397ba17b for 50 dollars per month, which is too expensive! If, say, 70% of the work can be done by 27b and 122ba10b, which are much easier to deploy on a local PC, then releasing them simply gives people a reason not to use the coding plan. They could just use a cheaper ChatGPT/Claude subscription to solve the remaining harder problems.
My guess is that Alibaba will gradually stop releasing powerful small models, or will make sure the small models are not good enough to compete with their flagship. Since Alibaba is one of the very few companies releasing small models, if they stop raising the bar, other companies might follow suit and slow down their progress as well. Take Z.ai: they used to release small models, but now they only release huge models and have significantly increased their coding plan price (Pro plan from 30 dollars per month to 72 dollars per month).
Maybe I am too pessimistic, but I am afraid that small open-source models (say, below 60 GB in size) will stop evolving at some point, optimistically topping out around GPT-4o level. Then if you want better performance, you will either need hundreds of GB of VRAM to run huge local LLMs or a subscription to very expensive cloud models.
Miserable-Dare5090@reddit
They won't release the 397b of 3.6; they're saying "if 35b seems good, come check out 397b". But 397 is miles better in multi-function cases. So when you say "122b is not far behind", I think that depends on use. On coding? World knowledge? Context retrieval? RP? Agentic workflows? I'd say it's a shame they didn't release the big boy, but at least we have Minimax and GLM...
ProfessionalSpend589@reddit
We still have Qwen 3.5 397B too.
jackmusick@reddit
What is it about all of these subs that attracts seemingly nothing but cheap and entitled people? This is basically magic and you’re complaining about 50 bucks? You all are in for a world of shock when everyone starts charging what it actually takes to make money.
Foreign-Beginning-49@reddit
As long as LocalLLaMA lives, we shall carry on, brother. No world of hurt necessary.
qwen_next_gguf_when@reddit
$50 is a small project with opus if you use API.
rosstafarien@reddit
10m tokens? A very small project.
Healthy-Nebula-3603@reddit
I get that many tokens in a week on the 20 USD plan using codex-cli.
shadow1609@reddit
$50 USD is actually a few prompts if you use Opus via API...
Healthy-Nebula-3603@reddit
Haha ...true
Fit-Produce420@reddit
It's called a "stupid tax." Too stupid to figure out how to run a model locally? Gotta pay the tax.
DUFRelic@reddit
They don't have enough GPUs why should they sell it cheaper?
SingleProgress8224@reddit
It's useful for those who don't have the hardware at home. Not everyone has a spare >24GB GPU to dedicate exclusively to an LLM, plus the CPU and RAM it will use and that can't be used for anything else. Given the choice between paying 50 per month and a couple of thousand up front, it's not such an easy decision, especially since by the time the GPU pays for itself, the hardware might be obsolete.
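The payback argument above is easy to sketch. A minimal break-even calculation, assuming an illustrative $2,000 local build against the $50/month plan (both figures, and the power cost, are assumptions for the example, not quotes):

```python
# Rough break-even sketch: buying hardware vs. paying for a cloud coding plan.
# All numbers below are illustrative assumptions, not real quotes.

def breakeven_months(hardware_cost: float, monthly_plan: float,
                     monthly_power_cost: float = 0.0) -> float:
    """Months until owning the hardware costs less than the subscription."""
    monthly_saving = monthly_plan - monthly_power_cost
    if monthly_saving <= 0:
        return float("inf")  # the subscription is never beaten
    return hardware_cost / monthly_saving

# Example: $2,000 rig vs. a $50/month plan, with ~$10/month extra electricity.
months = breakeven_months(2000, 50, 10)
print(f"Break-even after {months:.0f} months")  # Break-even after 50 months
```

At roughly four years to break even, the comment's point stands: the hardware may well be obsolete before it pays for itself.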
666666thats6sixes@reddit
The $50 plan gives you access to more than just Qwen models, they have Kimi, GLM, Minimax and a few others. Decent speed, too.
Jeidoz@reddit
I recently saw a news (re)post (and a second one) mentioning some issues and a potential stop to releasing open-source Qwen models. It may also be partially related to the mess with the Qwen research team (which was responsible for developing the new models).
mayo551@reddit
What would you consider a reasonable price?