We have a new weight class...
Posted by LegacyRemaster@reddit | LocalLLaMA | View on Reddit | 123 comments
Maybe this is the beginning of a trend! We'll see...
Thomas-Lore@reddit
It is misleading. M2.7 allows you to use the model commercially - just not serve it to users (as provider). Here is a discussion: https://x.com/RyanLeeMiniMax/status/2043573044065820673
So it affects no one here. Just providers who were taking money from users and giving nothing back to MiniMax while serving the model with the wrong settings.
belkh@reddit
this sounds fine in theory, but when Minimax bumps pricing, what can you do? the majority are not going to be able to host that locally.
I prefer the qwen approach of having the slightly better/bigger context model on their on platform with the base open, though minimax's approach probably makes more sense financially
relmny@reddit
With a single 5090 you can run q4 and get over 10t/s.
A 5090 is, I guess, a top gaming GPU, but it's not a top local LLM GPU...
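The "Q4 on a single GPU" claim comes down to simple memory arithmetic. A rough sketch, using assumed parameter counts for illustration (the total/active figures below are not confirmed M2.7 specs):

```python
# Back-of-the-envelope memory math for running a large MoE model at Q4.
# The parameter counts are assumptions for the sake of the arithmetic.

def quant_size_gb(n_params: float, bits_per_weight: float = 4.5) -> float:
    """Approximate in-memory size of the weights at a given quantization.

    Q4 quants average slightly over 4 bits per weight once block scales
    and metadata are included, hence the 4.5 default.
    """
    return n_params * bits_per_weight / 8 / 1e9

total_params = 230e9   # assumed total parameters (MoE)
active_params = 10e9   # assumed active parameters per token

print(f"full model at Q4:   {quant_size_gb(total_params):.1f} GB")
print(f"active weights/tok: {quant_size_gb(active_params):.2f} GB")
```

Because only a few GB of weights are touched per generated token, the bulk of a MoE model can sit in system RAM with layers offloaded, which is how a 24-32 GB card can reach ~10 t/s on a model whose Q4 file is far larger than VRAM.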
belkh@reddit
10 t/s at 128k context is not really that useful outside chat apps, and that's at Q4 as well. Full-precision M2.7 is just about usable but still not quite there; downgrading to Q4 and slowing to a crawl is definitely not usable for agentic flows.
-dysangel-@reddit
Within a few years, anyone who wants to is going to be able to run this version of Minimax locally. There is going to be such an insane glut of cheap AI hardware in the future as data centres (and folks on here!) upgrade. Just look at how cheap you can get older gen tensor core GPUs and servers now.
In short, it's only going to suck for us if they aren't releasing the weights. So I hope that continues. If it stops then we have to switch to distributed training.
Borkato@reddit
What older gen tensor cores are cheap?
-dysangel-@reddit
Old nVidia Teslas for example.
Aphid_red@reddit
V100, MI50, MI60. ~10-year-old hardware is cheap. Does it take tinkering to run modern models? Yep. Is it loud, inefficient, and power hungry? Also yes. But it still has more VRAM per $ spent than consumer cards, so it's still better at getting a model to run in the first place.
zdy132@reddit
Plus, if there is truly a need to run it, you can always rent a gpu server now. Serving yourself is still allowed by the license.
rpkarma@reddit
M2.7 is smaller than I expected. There’s a world where people can run it locally in the future.
_metamythical@reddit
Actually, I prefer Minimax's model to Qwen's here, since we have the full model to work with.
hotcornballer@reddit
I don't mind it, cuts out the middleman. If you need the model just rent gpus if you don't want to bother just pay minimax and that money goes towards making better models.
ResidentPositive4122@reddit
That is simply not true, and your take is nowhere in the linked post.
Quote from the post:
And from the license itself:
You cannot use this to ship anything commercial.
Also using MIT anywhere in the license and the post is not ok. MIT is the antithesis of this license. Just call it NC. It's their work, they decide how they license it, but don't use confusing wording. This is strictly NC, so name it like so.
typical-predditor@reddit
That's not how it works. It's a tool. Any code it outputs is separate from the tool and not bound by the tool's terms.
Commercial use includes serving it as a provider. Derivative work includes fine tunes.
wil_is_cool@reddit
That absolutely is how it works if that is what the license states. If a license says it can't be used commercially, then nowhere in your commercial product development chain can it be used.
That's like saying that VSCode community can be used by enterprise for free just because it's a tool.
typical-predditor@reddit
i, ii, and iii all refer to running the model (or derivatives) as a service.
Outputs from the model are not bound by the license, and this even goes against the spirit of the license as stated by RyanLee:
If you want to use this to build a commercial product (not LLM-powered, but utilizing the outputs in the course of business) then there are no checks to stop this and there isn't even any way to audit the output to prove it was produced by Minimax.
What the license does is prevent anyone from saying, "Powered by Minimax" and then serving up a vastly inferior product and ruining Minimax's name.
wil_is_cool@reddit
I get Minimaxs intent, but the license directly states that it can't. You need to follow the wording of the license exactly. The wording states
E.g., a commercial advantage is that you can code faster. "Includes without limitation" means anything else counts, even if it's not in the list.
That doesn't matter at all.
Adobe can't audit your pictures to see if the photoshop used to make them was pirated, it's still against license terms and they can still sue you.
If I gave you a piece of paper that you signed that says "I can shoot you in the head if you chew gum", then other people said "nah, it's fine, they don't pursue it", and I gave you gum, would you take the gum? No. Because the paper directly says I'm allowed to shoot you.
I think you are getting confused with LLM providers not owning LLM generated outputs.
Please stop saying that the license as it currently stands allows commercial usage because in its current wording it absolutely does not. It's not about the intent of the creators here, it's law. Someone at the company saying on another platform "nah its fine" doesn't matter.
It's not really a big deal. I believe MiniMax do want usage to be allowed like this, and if they do, they need to update their terms. Until then it can't be, it's that simple.
petuman@reddit
It's in the replies
https://x.com/RyanLeeMiniMax/status/2043580021588328927
UmutIsRemix@reddit
I am not understanding? Is there ANY MODEL out there that forbids you from generating code and selling the product that it was generated by a coding agent? What?
petuman@reddit
The model doesn't forbid it, the license does. Command-R is released under a CC-BY-NC license; you can't use it in a commercial context regardless of your use case. Now if you pay Cohere to access the model under a different license, there's no problem.
UmutIsRemix@reddit
Imma be blunt with u, u might not really understand what these licenses are about or how they apply. Maybe dont put too much thought on it, nobody is gonna know which LLM generated ur bloated code lmao
petuman@reddit
But you surely understand licenses!
Whether it's enforceable in practice wasn't a question.
UmutIsRemix@reddit
Not comparable. Tool use is audited often and is actually enforceable, especially in companies. VSCode has some options for the smaller guy to use properly.
For LLMs it won't hold anyway. Everyone can pick any kind of license for their software, the question the open source community should ask: will it hold?
LegacyRemaster@reddit (OP)
Most of all: there is absolutely no way for MiniMax, GPT, Sonnet, or any provider to tell if the generated code was made in the paid or free version.
Pyros-SD-Models@reddit
Of course there are ways, especially in a commercial context. Most of those ways are called "employees who want to shit on their employer."
I got like 5k bucks from Embarcadero just for ratting out my employer for using cracked Delphi versions. easiest money in my life.
petuman@reddit
There's no way for Adobe to tell if content is produced with pirated Premiere/Photoshop/AE/etc. Any serious business wouldn't risk it.
Even more true in the case of an LLM, where the licensor can look up API usage.
ResidentPositive4122@reddit
For some reason that doesn't load for me, thanks. They should absolutely do that, if that's their intention. The license as it is now is pretty clear about NC use.
Darkoplax@reddit
Problem is Minimax models are quite large, no individual or even small companies can realistically run this locally
This is not like the under-120B models that you can potentially see being used by individuals.
The only way to use this is through another provider and in their case it would be only them
relmny@reddit
That's not true, I run q4 with a 32gb GPU (slower than a 5090) and get over 10t/s.
You might be thinking of GLM or Kimi or Deepseek.
suicidaleggroll@reddit
Lots of individuals here run MiniMax.
Darkoplax@reddit
Lots of individuals can run over 200b locally ? damn
suicidaleggroll@reddit
Sure, you can even run it on a $7500 Mac Studio M3 Ultra 256 GB. The speed won't be great for agentic tasks, but it'll work.
someone383726@reddit
What if I want to use it as a chat help agent on my website?
lacerating_aura@reddit
I wouldn't be so sure about that local helper for commercial projects, because I haven't read the license. But yeah, this seems like a good direction in my book, as in, their points are reasonable.
Momo--Sama@reddit
I would be beyond shocked if their intent was “you cannot run Minimax locally to work on paid software.” I highly doubt that’s the case
ResidentPositive4122@reddit
(emphasis mine)
It's pretty clear, mate. This is a NC license.
Ok-Scarcity-7875@reddit
What's missing here is that Kimi K2.5 requires you to mention its use on all your products if you have more than 100 million users or $20 million monthly:
Quote:
So never get too successful with your business if you use Kimi. Just stop right at $19.9 million a month and stop users from registering when you have more than 99 million. haha
ProfessionalSpend589@reddit
I don’t think this applies if you don’t ship Kimi K2.5 to 100 million users or give access to it.
Ok-Scarcity-7875@reddit
Sure it does, or what do you think "or any derivative works thereof" means?
ProfessionalSpend589@reddit
Generating text which I may embed is in no way derivative work.
It’s like saying a compiled program is a derivative work of gcc and should be licensed the same way. Not a lawyer, but I’m 100% sure you’re too strict in your interpretation.
Ok-Scarcity-7875@reddit
OK: I asked Chat who was right, and it turns out that text written with Kimi does not legally fall under derivative work. Still, Chat said I had a point on licensing philosophy.
Aphid_red@reddit
Uh huh... so you asked an AI model... do you have the actual probability weight of this outcome? (Can be a bit complex for a thinking model.) If you ask the same question 1,000 times, do you get 1,000 yeses, or maybe 715 yes and 285 no? These are probabilistic guessing machines; please do research and never assign any authority to them for non-obvious questions.
I happen to agree (that only the terms of use of the product itself can be covered, not the artifacts resulting from your use of it), but I'm not a lawyer. Certainly not a federal judge, who'd end up making the call if big companies sue each other. And I think there's an argument it can go both ways.
For example, an online service that provides a model can and does often put 'content restrictions', which technically do apply to the output, that say you can't generate X or Y with the model; which means you might end up paying, asking for it, then receiving nothing (or even your account and its balance taken away). I'm not aware of any offline programs that do this, but I wouldn't be surprised if say office365 took your license away if you wrote a censored document.
What is this license? Is it a copyright license, or a contract? Did the end-user agree to the contract, or is the end-user a client of a party that agreed to said contract? Things can get complicated!
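The point above about repeated sampling can be shown with a toy simulation. `ask_model` is a stand-in for querying an LLM at temperature > 0, and the 0.715 yes-rate is a made-up figure matching the 715/285 example:

```python
import random
from collections import Counter

random.seed(42)  # fixed seed so the run is reproducible

def ask_model(p_yes: float = 0.715) -> str:
    """Stand-in for an LLM queried with sampling enabled: the same
    question yields different answers across runs. p_yes is invented."""
    return "yes" if random.random() < p_yes else "no"

# Ask the "model" the same yes/no question 1,000 times.
counts = Counter(ask_model() for _ in range(1000))
print(counts)  # a mix of yes and no, not one authoritative verdict
```

The spread of answers, not any single reply, is what the model actually "thinks" — which is why treating one completion as a legal ruling is a mistake.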
Mochila-Mochila@reddit
And how would this hurt your business exactly ?
Ok-Scarcity-7875@reddit
It is not about hurting a business. It is about the definition of open source. So you think it is okay that you have to label your software or texts with "Made by Kimi K2.5"?
Imagine that would apply to coding languages. "This software was written with C++"
Long_War8748@reddit
This is already the case? Even the most permissive licenses, like MIT, Apache 2.0, or BSD, have mandatory attribution requirements. E.g. Rust even has a special cargo-about helper that takes care of that for you.
Middle_Bullfrog_6173@reddit
None of those licenses apply to use of the software (on your server to serve an app or api). Even GPL puts no requirements on that. Their requirements only apply when distributing copies of the software.
ukrolelo@reddit
Hmmm where is qwen3.5 27b? Oo
LegacyRemaster@reddit (OP)
alerikaisattera@reddit
The correct division is:
WeGoToMars7@reddit
It's unfortunate that the term "open-source" (category 1) has been essentially hijacked by labs releasing open-weights models (category 2).
And there are true open-source models! Models like Olmo from Allen AI (https://allenai.org/olmo) should receive more support and attention. Maybe someone here knows of more recent examples?
keepthepace@reddit
The problem with true open source is that it is basically illegal everywhere with copyright laws to release a decent open dataset. Companies and labs are forced to use a "don't ask don't tell" stance about the presence of copyrighted material in the training dataset, as training on it is gray zone, but distributing it is clearly illegal.
WeGoToMars7@reddit
Wouldn't a 100% synthetic dataset completely sidestep the copyright law, at least in the current de facto interpretation? The research shows current models can output copyrighted material word-for-word, but that requires very specific prompting to work.
xoexohexox@reddit
I mean I think so but the natural next question to ask is the training dataset that the model that generated the next training dataset was trained on.
I was reading about a music gen model that is trained from first principles, with data on what individual instruments and effects sound like instead of songs. I thought that was an interesting approach.
keepthepace@reddit
Yeah, that's the hope: sidestep the issue for 5 more years until they plug that loophole too. But at some point we do need copyright reform. It is not sustainable to constantly engineer circumvention solutions to what is clearly a mismatch between the law and the situation.
Boxy310@reddit
Some of the organic "flavor" seems to be necessary to prevent the need for over-specified prompting. One would expect the need for more "babysitting" of a model if it didn't grasp certain organic abstractions that even a socially dead programmer would throw out as a metaphor.
re-thc@reddit
How is this a problem with true open source? The fix to having done something illegal isn't to hide it.
keepthepace@reddit
The problem is with the law and, quite frankly, with corruption within the US legislative process.
Yes, the real fix is to fix the law and the legislative process.
I, for one, am happy with an inelegant solution that allows us to still have open models in the meantime.
basxto@reddit
The solution is to create training material that can be shared legally
keepthepace@reddit
Sure, let's rewrite all the scientific articles by hand, which is probably illegal anyway, and they should have been public to begin with.
NinjaOk2970@reddit
Funny how the copyright law is, again, creating more actual copyright violation.
charles25565@reddit
Those "true open source" models meet the OSD for AI definition. There's little recourse for a company using the software version instead. There's little recourse in general since only the logo and
OSI Approved Open Source Licensetext are trademarked for software.Pyros-SD-Models@reddit
Just looked at the big open weight releases of the last 12 months and literally no lab is calling what they do “open source”.
This sub was already calling open weights "open source" during Llama 2, so if anything hijacked the term, it was this very sub.
relmny@reddit
Yes, it is unfortunate, but as I've said many times: if the general public doesn't really understand what "open source" means, you can imagine that "open weights" will be even more complicated.
And although you mention one, I don't think the number of "open source" models is relevant. Sadly. They might be good for research (maybe fine-tuning?) and such, but I doubt they are relevant for day-to-day use.
So as long as the (wrong) name "open source" gets out to the general public, that's good enough, I guess.
At least people will start to understand that there's something else other than the commercial ones.
JustOneAvailableName@reddit
Open-source without open-data is essentially useless. Open-source without open-weights is also essentially useless, as the "compile" (training) is very far from free. Allowing users to change the compilation-artifact to suit their needs is in my opinion the most important part of Open-source. In that sense, I do think that open-weights for ML is most comparable to open-source for software. Of course, getting all 3 plus the training logs would be even better.
Huggingface! They don't just share other's work but also do some interesting work themselves.
silenceimpaired@reddit
Hyperbole to make a point is acceptable… but I’m finding use in open weight models via inference.
JustOneAvailableName@reddit
I think you misread my comment, I also think the weights are by far the most important part to be open.
WeGoToMars7@reddit
I completely agree with your reasoning. The argument is more about principle. "Open-source" or "FOSS" means a very specific thing, and it's really important not to let companies shift the Overton window to the side that benefits them and hurts the consumer in any way.
JustOneAvailableName@reddit
FOSS includes the “free” part; open-source isn’t always FOSS. In other words: you’re shifting the Overton window ;-)
Potential-Gold5298@reddit
K2-V2 by MBZUAI.
ChocomelP@reddit
what in the
alerikaisattera@reddit
The most unfortunate thing is the term "open-source" itself. Not only is it a misnomer, it exists solely because the leader of the FSF is a nut.
It also should be noted that even if the developer does not misrepresent proprietary software as free, brainless simps that don't know that the terms "free" and "open source" refer to the license and not public availability will do it for them
On the other hand, it's totally possible to have a free software license that is a total nightmare for practical use. Such an example is the requirement that all modifications be distributed as patch files
relmny@reddit
Why? what is "free software"?
If it's only "open source", then that category covers only 1-2 models that I bet only a very small percentage of people use.
I prefer the division as it is in the image. It's clear (at least to me), and although it misses "open source", there are only a few of those, and sadly they don't seem to be relevant.
MarcCDB@reddit
Wake me up when technology advances so much that I can run a 397B in my 16GB GPU.
LowNo5605@reddit
0.5 bit quantized 🥀
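The joke has real arithmetic behind it. A sketch, assuming (unrealistically) that all 16 GB of VRAM is free for weights, with nothing reserved for KV cache, activations, or the runtime:

```python
# How many bits per weight would a 397B model get in 16 GB of VRAM?
vram_bits = 16e9 * 8              # 16 GB expressed in bits
n_params = 397e9                  # parameter count from the comment above
bits_per_weight = vram_bits / n_params
print(f"{bits_per_weight:.2f} bits per weight")  # ~0.32
```

So even the joked-about 0.5-bit quant wouldn't fit; the realistic options today are offloading most of the model to system RAM or picking a smaller model.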
jreoka1@reddit
Gemini 3.1 Pro is a good model, but I feel like it's not as good in real-life usage as all the benchmarks show.
PatagonianCowboy@reddit
Mark Zuckerberg yapped so much about how important it was for AI to be open, and now he's a quiet grifter.
MoffKalast@reddit
Yapping about how important open AI is was the grift. The whole point was that Meta was a decade behind the rest and did it as a compromise to commoditize everyone else's work at the expense of not much of their own.
DeepOrangeSky@reddit
Why is everyone on here so sure that they aren't going to release any more open weights models? They still said they fully intend on releasing the Abocado, Paricado, etc, open weights models, afaik, just delayed a couple months. I mean, yea it's possible they cancel it or something, but also possible it's just actually delayed a couple months and they release more open weights models pretty soon.
I mean, Google, Alibaba (Qwen), Mistral, and a bunch of the other main labs have proprietary models and still release open weights models too, even though they also have proprietary models. So, I don't see why Meta creating proprietary flagship models necessarily means they are done with open weights.
I am a noob, and fairly new here, so, maybe I am missing something, but, are people assuming it just randomly just purely for the sake of being pessimistic, or is there some actual reason or major leak or something as far as why everyone always talks about it like it's some sure thing that they are just actually done with open weights models from here on out?
Zulfiqaar@reddit
People lost faith in them after they failed to follow through on their most recent planned releases, e.g. LLaMa Behemoth... and that itself was a very long time ago in AI terms, and nothing has really been released since. They also said they won't release weights for everything as they invest more into training.
I'm cautiously optimistic about some release, as they've come up with a decent model Muse Spark after quite a while, and I'm hoping it was because of underperformance instead of proprietary focus that nothing has been opened yet. Not counting on it though
lizerome@reddit
They don't even need to train a Claude competitor, we already have dozens of those. Just train a decent 8B/12B model that people can run on home hardware. That's all they really need to redeem themselves, and it would cost them a fraction of training the "superintelligent frontier AGI". If Behemoth-2380B-A100B remains closed source, who cares. You're only ever using it through an API anyways.
LizardLikesMelons@reddit
Open as in open your wallets
Or You're dereferencing a null pointer. Open your eyes! SLAP! (Attention on the slap part)
StupidScaredSquirrel@reddit
Selling a product doesn't mean you're a grifter lol some of you are so entitled
PatagonianCowboy@reddit
What about lying? saying things you don't believe in just for clout?
StupidScaredSquirrel@reddit
I'm pretty sure he believed it otherwise he would not have made llama open source in the first place. Almost all AI providers provide some open source models as well as close sourced models. If they didn't believe open source to be important they would just be closed on every front.
Lucyan_xgt@reddit
This is locallama dumbass
BrightRestaurant5401@reddit
Childish comment to make, look at graphs like these.
Only releasing open models attracts "grifters", Meta found out the hard way.
No version of Llama yielded them anything other than more begging.
Aggressive-Permit317@reddit
New weight class drops are my favorite part of this scene. The jump in capability vs size lately has been stupid. You seeing the same performance per parameter gains or is it mostly marketing until we test it ourselves?
Laoweek@reddit
You really wonder how is Alexandr Wang still not fired lmao.
Zeeplankton@reddit
I mean this is their first model since restructuring, and it's hitting opus / 5.4 mark out the gate. That seems pretty good no?
xoStardustt@reddit
it better be… since it’s trained on distilled opus 4.6 lmao
aka_blindhunter@reddit
Not sure who's testing these models; Gemini 3.1 Pro is the worst waste of money.
lewd_peaches@reddit
Ooh, what size are we talking? I'm curious how it performs compared to the older 70Bs.
pigeon57434@reddit
That's gotta be pretty embarrassing; they got their own whole category on Artificial Analysis.
KvAk_AKPlaysYT@reddit
This is sad and scary :(
spaceman3000@reddit
And why is that? Companies are there to make profit.
silenceimpaired@reddit
And why is that? Companies exploited the work of Copyright holders without allowing them to profit or protect their Copyright works from consumption in a tool that will someday replace them.
spaceman3000@reddit
You mean reddit comments? 😂
KvAk_AKPlaysYT@reddit
"Non-commercial MIT" shouldn't be a license type :/
spaceman3000@reddit
I'd love everything to follow BSD licenses but well, world doesn't work that way
Kryohi@reddit
Open source licenses like this have always existed and I don't get what's the problem with them. If you're a business and you're making money thanks to a piece of software or model, is it really that bad to ask for something in exchange?
HeadWorth9814@reddit
honestly the license situation is a mess but ryan saying self hosted code generation is fine makes it pretty clear theyre not going after individuals. the real issue is the wording being confusing as hell. just slap NC on it and move on
HeadWorth9814@reddit
Having the CEO clarify licensing edge cases on Twitter/X in reply threads is exactly why people are frustrated. Write a clear license or update the existing one — don't make users dig through social media replies to figure out if they're legally in the clear.
Maximus-CZ@reddit
Any chart that puts Sonnet and Opus 1% from each other is showing me it tests completely bogus "performance".
Doug_Bitterbot@reddit
That Open weight leap is why local-first meshes are becoming a new baseline for sovereign AI.
XccesSv2@reddit
Where is the problem? Training a model costs a lot of expertise and money. So if you want to use it commercially, you have to pay for it / license it. As long as it's free for private use, I don't see any problem.
silenceimpaired@reddit
Where is the problem? Writing code and publishing books costs a lot of expertise and money when considered at the scale it’s being absorbed for these models. So if you want to use it without paying a license and without the permission of the creators… the least you can do is let others have unrestricted access to your model to do the same thing you just did… I.e. Apache or MIT
StupidScaredSquirrel@reddit
Who said there's a problem? It's just a new type
LegacyRemaster@reddit (OP)
Let's also summarize their responses: there is absolutely no way for MiniMax, GPT, Sonnet, or any other provider to determine whether the generated code was created using the paid or free version. Commercial use means using the model to serve other people. They're not talking about the model's output.
My post isn't meant to blame anyone. I'm grateful to them for the code they released. The point of the discussion is to understand whether it represents the right compromise for other companies to follow: rather than not releasing open weights, release them more restrictively to preserve profitability.
silenceimpaired@reddit
The problem is it’s extremely difficult if not impossible to prevent serving a model for profit with legal language that is not at least extremely confusing— pretty sure that’s what Black Forest Labs was trying to do with their Flux models.
seamonn@reddit
Can call it 'Trash Weights'.
One_Internal_6567@reddit
Gemini 3.1 in any benchmark placed high is just ridiculous man
swingbear@reddit
Yeah it’s absolutely terrible for any code related work, I didn’t think it was that smart in other areas either. Really did highlight to me that these benchmarks are to be taken with a huge pinch of salt.
One_Internal_6567@reddit
Exactly. A model that repeats itself and produces nothing but garbage code in their own IDE and CLI; it's a joke.
reggionh@reddit
I like it. very smart for what I use it for. very cost effective too compared to the others.
TechnoByte_@reddit
Nah, Gemini 3.1 Pro is seriously good
Shocking how it's free even with high reasoning, while others give you small models/low reasoning on their free tier
Duxon@reddit
It's a very good model
michaelmalak@reddit
Llama 3 was already restricted to companies with fewer than 700 million active users in the preceding month. https://www.llama.com/llama3/license/
WithoutReason1729@reddit
Your post is getting popular and we just featured it on our Discord! Come check it out!
You've also been given a special flair for your contribution. We appreciate your post!
I am a bot and this action was performed automatically.
t4a8945@reddit
Now put this in the perspective of how many parameters the open-weight models have, and MiniMax M2.7 becomes the king.
Afraid_Donkey_481@reddit
Please explain the 10 evaluations that went into this.
q-admin007@reddit
Good!
zzDemire@reddit
I would like to see an optimal comparison for 8GB VRAM.
pier4r@reddit
Just in: https://x.com/RyanLeeMiniMax/status/2043573044065820673 (article about "M2.7 license — what changed and why")
Maybe it should be its own post.