MiniMax M3 - Coding & Agentic Frontier, 1M Context, Multimodal
Posted by dryadofelysium@reddit | LocalLLaMA | View on Reddit | 184 comments
HavenTerminal_com@reddit
open-weight model with no weights to find is a bit of a move
Aggravating-Sale-191@reddit
maybe have some patience it just dropped today... it always takes a little to release them
WebOsmotic_official@reddit
1M context is nice, but the real test is whether M3 can stay coherent after the 40th tool call and still edit the right file. agentic models keep getting marketed on context size when the expensive failures usually come from bad routing and scope drift.
nomorebuttsplz@reddit
slop
XccesSv2@reddit
sad no 10$ plan anymore
Brilla-Bose@reddit
https://platform.minimax.io/subscribe/token-plan i see a 10$ plan
XccesSv2@reddit
how? I just see Plus, Max and Ultra. I had a 10$ plan last month but now it's not showing for me again (Germany). Maybe they're doing some geolocation pricing stuff there?
EmPips@reddit
This is either:
way bigger than ~250B params
benchmaxed to previously unachievable levels of success
a breakthrough that'll be a landmark moment in the open weight space forever
One of these is true and we'll know which in a few days.
Ok_Technology_5962@reddit
My guess is it's the same param count, based on a few tests. They did train on 100T tokens now, which is 2x more than the MiMo v2.5 300B version, and it probably has sparse attention kind of like Deepseek or Qwen etc., so it can maintain speed as it gets larger, maybe 7:1 or something. MTP heads not enabled I assume, since it's slow. That, and the COST is the same as StepFun, almost exactly 10 cents difference.
nomorebuttsplz@reddit
My money is on all of the above
unspecified_person11@reddit
Electricity + tens of thousands of dollars of hardware.
Jazzlike_Bee_3129@reddit
A cloud subscription covered the second half.
Jump3r97@reddit
Well, then you are back to API costs...
nomorebuttsplz@reddit
"another rung on the ladder" as in we're not there yet lol.
lemon07r@reddit
Mix of the first two for sure.
nuclearbananana@reddit
every minimax model has claimed to be SOTA among open source and it's never gone anywhere
Due-Memory-6957@reddit
I remember way back then, it being the best open source model available. Back when Llama was still a thing.
Zc5Gwu@reddit
In personal experience, minimax is pretty good at agentic coding but isn't a great "general" agent. It's like it has one piece of claude but not the whole picture.
nuclearbananana@reddit
yeah that's probably a good description. I always found it mediocre at writing so never used it much
noiserr@reddit
I mean it's been one of the best models in its weight class.
Thomas-Lore@reddit
It is 456A46B.
XCSme@reddit
Same category as Xiaomi MiMo-V2.5 Pro and Gemma 4 31B, but 10x more expensive and 2-3x slower.
Short_One_9704@reddit
Where is this comparison from please?
XCSme@reddit
https://aibenchy.com/compare/minimax-minimax-m2-7-medium/minimax-minimax-m3-medium/
XCSme@reddit
M3 is still better for coding though:
pipyakas@reddit
Seems like they ditched the smaller 10$ sub, all subs now start at 20$
Brilla-Bose@reddit
no i can see the 10$ sub
cmitsakis@reddit
I can see it but maybe it's because I'm paying for it. If I log out it's not visible. Do you see it without being subscribed to it?
Ludbr@reddit
And they have MUCH less usage now, even if you would use the older models.
I just did the math: now even if you use the 2.7 you have 3 times less usage on the 20 USD plan.
And if you use the M3, you have 6x less token usage than the 20 USD plan used to offer for 2.7.
M3 better be REAL good, because if it isn't, then the subscription lost its real value, which was being the cheapest one available (opencode, for example, will be a way better deal).
Saifl@reddit
Subsidized opencode is a better deal? But that's if m3 was discounted, right?
Btw I don't think the previous subs for minimax were ever sustainable. It's probably just to pull in more data (it's not logical for them to pull more users or retain users, look at glm, people just ditch em after old subs run out cuz new ones suck ass)
Thomas-Lore@reddit
It is not subsidized, they get discounts on the api pricing. "Discounted" would be a better word.
Django_McFly@reddit
If it's approaching frontier, $2.40/M output $0.60/M input isn't particularly painful.
Serious-Regular@reddit
I couldn't find how big it is - anyone know?
decentralize999@reddit
Seems similar in size to M2.7, or 1-10B more for the added visual encoder. Benchmarks aren't high: it beats the recently released Step3.7, which is only 90B, and is still lower than even the previous Opus version.
Legal-Ad-3901@reddit
No way they hit these benchmarks at 229b
mlon_eusk-_-@reddit
Nope, it's way bigger
noiserr@reddit
That kind of defeats what made M2 so great, being able to run on local llama unified memory machines. Anything bigger kind of makes it cloud only for most people.
mlon_eusk-_-@reddit
That's probably true, but their goal, to catch up and surpass the frontiers, requires it to be scaled up.
randylush@reddit
If they release massive models that are very well trained, people can distill or quantize them down, no?
mlon_eusk-_-@reddit
Absolutely, also methods like REAP can prune the model making it smaller with minimal depredation.
Toastti@reddit
Not sure REAP can be considered minimal degradation
randylush@reddit
Or even minimal depredation
Particular-Way7271@reddit
Or even minimal
ice_agent43@reddit
Yeah I tried a minimax 2.7 reap it fuckin sucks, gets confused about tools and can't even answer a basic question
Linkpharm2@reddit
Massive. You probably can't handle the girth. No way.
CalligrapherFar7833@reddit
That's what she said
kitanokikori@reddit
Can we not?
Kholtien@reddit
Yeah, that’s definitely NOT what she said
Due-Memory-6957@reddit
True, she didn't give me any warning and now I can't walk straight.
-dysangel-@reddit
yeah I noticed that extra sass in your walk today
ProfessionalSpend589@reddit
It may be a tight squeeze with my 256bit memory bus, but if the license is right this time - I’ll try.
RazsterOxzine@reddit
I've seen things, I know some can.
kellencs@reddit
api is 2x-4x more expensive, so I guess around 700b-1t
EndlessZone123@reddit
Maybe around double the size of M2.7. The current 50% off rates are the same as M2.7.
Qwen30bEnjoyer@reddit
Bad news. It might be worse than that, considering they utilize sparse attention, which makes attention cost scale linearly instead of quadratically. Maybe this is an 800B parameter model?
Or maybe they're closer to the frontier which commands higher profit margins?
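A rough sketch of the scaling argument above, assuming a simplified cost model where dense attention costs about n² · d per layer and sparse attention lets each token attend to at most k selected tokens. The head dimension `d` and budget `k` are made-up illustration values, not M3's actual settings, and the cost of choosing which tokens to keep is ignored.

```python
def dense_attention_cost(n: int, d: int) -> int:
    # Every token attends to every token: roughly n^2 * d multiply-adds.
    return n * n * d


def sparse_attention_cost(n: int, d: int, k: int) -> int:
    # Each token attends to at most k selected tokens: roughly n * k * d.
    # The cost of selecting those k tokens is deliberately left out.
    return n * min(k, n) * d


# Hypothetical head dimension and per-token attention budget.
d, k = 128, 2048

# Ratio of dense to sparse cost at a few context lengths.
ratios = {
    n: dense_attention_cost(n, d) / sparse_attention_cost(n, d, k)
    for n in (8_192, 131_072, 1_000_000)
}

for n, ratio in ratios.items():
    print(f"context {n:>9,}: dense costs {ratio:.1f}x sparse")
```

Under this model the gap grows as n / k, so the savings are modest at short contexts and large near 1M tokens. That is why per-token price alone is a shaky proxy for parameter count.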
blackwell_tart@reddit
Looks like we’re going from MiniMax to MaxMax.
ManySugar5156@reddit
1M context + multimodal sounds nuts, but i still just wanna know param count and if it’ll fit on 4x32gb without pain.
Plappedudel@reddit
Doubt it. From my limited personal usage, it seems like a fairly big model. Probably in the 500B+ range if I had to guess.
jreoka1@reddit
You guys need to bring the $10 a month token plan back for light usage! That was perfect!
Brilla-Bose@reddit
https://platform.minimax.io/subscribe/token-plan
they have it though!
tecneeq@reddit
Why is this post in Localllama? It's just a cloud model.
annodomini@reddit
Because they have said they will open the weights.
Yeah, it's not technically open yet, but they have released weights for their previous models and say that they are going to for this one. So this gives you a preview of what will be available in a week or two.
kevin_1994@reddit
It says "The first open-weight model with three frontier capabilities." But I don't see the weights or even any mention of the number of parameters. Anyone know more than me?
Beginning-Bug-7964@reddit
Have they fixed the license? Or is it still personal use only?
annodomini@reddit
We've got to wait for them to release the weights to find out. My guess is probably not, but only time will tell.
dryadofelysium@reddit (OP)
Weights will be released in the next few days
JamesEvoAI@reddit
Then why even bother posting it here? This isn't a local model until we can run it locally
keyboardhack@reddit
It is either astroturfing or ads have just been so normalized that people can't tell the difference anymore. Either sucks.
ProfessionalSpend589@reddit
They give you time to purchase a couple of RTX Pro 6000 and have them ready for the release.
DifficultyFit1895@reddit
you’ll need more than a couple
Spectrum1523@reddit
Posting an announcement of when the weights will be available seems like legitimate local news
Orolol@reddit
There were no weights attached to your comment, why bother posting here?
AnticitizenPrime@reddit
Sorry for getting you out of bed early
Orlandocollins@reddit
Many people do run this locally so it's nice to know I will be refreshing my browser every day for a week to see when it is open
JamesEvoAI@reddit
I understand people run it locally, which is why this is annoying. This is effectively an announcement of an announcement, but actually an ad for their API. The actual news will be when we can download it, not pay them for API access.
Snoo_28140@reddit
They announced a release of weights in ~10 days. That sure is of interest for local AI, especially considering this is a major and anticipated launch.
Winter-Editor-9230@reddit
They gotta pay the bills so you can use it for free.
Bakoro@reddit
So it will be "the first open-weight model with three frontier capabilities".
winwinwinguyen@reddit
I bought the 1 year plan last month thinking, from the benchmark scores, it was Claude-level… I didn’t really expect it to be, but I also didn’t expect it to perform much worse than Coder Next.
I should’ve seen it coming bc it’s a multimodal, but I wouldn’t believe a single claim coming from these guys.
mrtime777@reddit
https://huggingface.co/MiniMaxAI/MiniMax-M2.7/discussions/33#6a1cf12e44942543cdbbc133
eteitaxiv@reddit
Just tried, greatly hallucinates. I asked about Joyce's The Dead, specifically an analysis of the last section. The model, in the reasoning, wrote out the text. Only the first one or two sentences were correct, then, after finishing, it realized the most famous section, the closing sentence, was wrong. So it only corrected that, accepted the rest as true, then started analysing a completely wrong text.
Outside reasoning, it gave me that analysis without showing the wrong text. I tried more than a few times, and with different works of literature.
So... this is a failure mode it falls into.
GreenGreasyGreasels@reddit
First thing I noticed was the quite singular absence of MMLU Pro and GPQA Diamond in the stats. It's the first thing I check in models: how much does it know and how well can it reason. It's a base requirement, table stakes.
Without that all intelligence is pointless.
EndlessZone123@reddit
Looks like a pretty big model unfortunately if it's nearly up there with Kimi K2.6. MiniMax really pushing for bigger and bigger models each time.
GreenGreasyGreasels@reddit
Priced like K2.6, but not even close in quality to it from early tests. Ouch!
alex20_202020@reddit
How is "requests per 5 hours" relevant here?
EndlessZone123@reddit
You get the same amount of api $ no matter the model from opencode go. So it's a halfway decent measure of actual usage cost.
jmorant555@reddit
may i know what app is this?
EndlessZone123@reddit
Screenshot from the OpenCode Go https://opencode.ai/go
FoxiPanda@reddit
This looks to be phenomenal on benchmarks - hoping it's similar in size to M2.7 parameter count wise. grabs some popcorn and waits for weights
bernaferrari@reddit
Price is the same, so either the size is close or they found a way to optimize.
tomz17@reddit
Nope... Price is double.
GreenGreasyGreasels@reddit
Let's hope they are going for the Deepseek/Xiaomi "you know we are so awesome let's make the discount permanent" play. I doubt it, they have neither the pockets nor the engineering of those labs.
siegevjorn@reddit
Ok, somebody just fucking tell us what parameter size it is? How simple is it to be explicit about the size??? Is it dense? Is it moe? Is it 200b? 400b? 600b?
dryadofelysium@reddit (OP)
It's an all-new MoE architecture. Model size will be known when it releases to HF in a couple of days
segmond@reddit
Means we are not going to be running it on day 1 locally. DeepSeekV4 hasn't even gotten support and this architecture doesn't sound any simpler.
Sufficient-Bid3874@reddit
Could be 0 day where they collab, though
TopChard1274@reddit
Probably 600 or larger
sturmen@reddit
It’s not local *yet* (weights coming in 10 days) but if you want to try it out to get a feel for it before you build a local coding rig around it, MiniMax M3 is currently free on OpenCode Zen.
EndlessZone123@reddit
Finally another model that has vision. Seems cheap and efficient.
Eyelbee@reddit
Why do you need vision? Unless it's private?
Inevitable-Plantain5@reddit
I'm kinda curious, what inappropriate things do you think most people are doing with vision, besides sending screenshots of things to search for, or pics of errors (particularly visual products of code)? I had one helping me identify a pic of a bird the other day... You think most people using vision are sending pictures of their junk to models or something...? Lolol
illgettheownerforyou@reddit
Well, I took him saying Private as sometimes people have a niche that feeds their family and if more people learned about it, they could come in with bigger resources and take the market.
Or classified info or private industry processing. I didn’t think something lewd at all.
Eyelbee@reddit
Lol, what gave you this idea that I think that
Gunnarz699@reddit
Screenshot -> annotate problem -> fix plez
Clanker fixes problem. Am lazy. This quick.
mintybadgerme@reddit
Fix 'this' button and make it green.
FunConversation7257@reddit
A big use I have involves categorising past papers. I wanted to categorise them topically, separate the different Qs into segmented folders, etc, all relying on strong vision capabilities (currently use 3.1 Flash Lite in flex service tier for this purpose, can process a PDF with like 21 pages for 1 cent)
EndlessZone123@reddit
Vision is useful for a lot of things. Have a PDF that isn't just text and has pictures? A vision model can natively understand what that image is and relate that to the rest of the content. Have a Powerpoint with graphics and graphs? A vision model can have far more context understanding it. There is a distinct advantage when vision is native to a model vs using another external tool/model to describe an image and using the output.
For my use case, vision models are also much more capable at navigating and inspecting web pages and UI. Non-vision models can only click on links and inspect source/CLI tools. Kimi K2.6 for example can help me automatically inspect visually and navigate a site autonomously, fix issues, review etc. Malformed text formatting, padding/spacing, non-rendering content etc.
Hannibalj2ca@reddit
Mimo 2.5 310b also
EndlessZone123@reddit
I wish 2.5 Pro was multimodal.
FoxiPanda@reddit
A trick I like to use in my custom harness is to create a tool and assign it to text-only models that is simply "analyze_image" and it uses one of my other warm local models (usually Gemma4-26B-A4B because it's fast and pretty good at vision tasks I need it to be). This pretty much negates the "darn I wish this model had vision" issue though does require you to have a second model warm and handle the tool routing yourself.
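A minimal sketch of the delegation pattern described above, assuming a harness where the text-only model emits tool calls as plain dicts. `fake_vision_model`, `Tool`, and `dispatch` are hypothetical names for illustration, not part of any real harness; a real setup would replace the stub with a call to the warm local vision model.

```python
from dataclasses import dataclass
from typing import Callable, Dict


def fake_vision_model(image_path: str, question: str) -> str:
    # Hypothetical stand-in for a warm local vision model.
    return f"[description of {image_path} answering: {question}]"


@dataclass
class Tool:
    name: str
    description: str
    handler: Callable[..., str]


def make_analyze_image_tool(vision_fn: Callable[[str, str], str]) -> Tool:
    """Wrap a vision model as a tool that a text-only model can call."""
    def handler(image_path: str, question: str = "Describe this image.") -> str:
        return vision_fn(image_path, question)

    return Tool(
        name="analyze_image",
        description="Describe or answer a question about an image file.",
        handler=handler,
    )


def dispatch(tools: Dict[str, Tool], call: dict) -> str:
    """Route a tool call emitted by the text-only model to its handler."""
    tool = tools.get(call["name"])
    if tool is None:
        return f"error: unknown tool {call['name']}"
    return tool.handler(**call.get("arguments", {}))


tools = {t.name: t for t in [make_analyze_image_tool(fake_vision_model)]}
result = dispatch(
    tools,
    {
        "name": "analyze_image",
        "arguments": {"image_path": "screenshot.png", "question": "What error is shown?"},
    },
)
print(result)
```

The text-only model only ever sees the returned string, so the quality of the description caps what it can do with the image.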
InsaneDiffusion@reddit
I tried that in Goose with MiMo 2.5 Pro but it didn't work, returns an API error :/
AnticitizenPrime@reddit
Hermes Agent has this behavior as default. You can assign models to 'roles' such as vision, text summarization, etc and it will use those. I keep Gemma 2b 'warm' as you put it for vision and other roles that suit it.
craftogrammer@reddit
Actually I am building my own assistant like claude cowork but more, and it can already do most of it. Just a bit slow, as I don't want it to be another AI SLOP work. This mid-conversation prompt steering injection based on role has some good uses I can think of, in combination with local + online models.
Stooovie@reddit
I "made" a Vision Delegation tool in OWUI for this, yes. Problem solved.
craftogrammer@reddit
Gonna try this today, thanks. It's good idea.
_derpiii_@reddit
That's a great idea. Thank you for that :)
Hannibalj2ca@reddit
It is multimodal already. It is native omnimodal
annodomini@reddit
The new Step 3.7 Flash has vision.
counterfeit25@reddit
“M3 will soon be fully open-sourced on HuggingFace and GitHub”
LPFchan@reddit
open weights + 1M context + multimodal is surprisingly hard to find, excited!
redblood252@reddit
Unfortunately it might be too big to be hosted in anything besides a datacenter
Jazzlike_Bee_3129@reddit
Open models seem to be going in the wrong direction in that regard, unfortunately.
LPFchan@reddit
hoping for a smaller variant 😭 same goes with qwen3.7
Barafu@reddit
It is 22 times more expensive than DeepSeek. Because when coding, the cache read token price is the only one that matters.
Are you sure it is 22 times better?
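A quick way to sanity-check a claim like this is to price a whole session rather than compare a single rate. The sketch below assumes a hypothetical agentic coding session dominated by cache reads; the token counts and all per-million prices are placeholders for illustration, not real quotes for any model.

```python
def session_cost(tokens: dict, prices: dict) -> float:
    """Dollar cost for token counts, given prices in $ per million tokens."""
    return sum(tokens[kind] * prices[kind] / 1_000_000 for kind in tokens)


# Hypothetical session: most tokens are re-read from cache on every turn.
tokens = {"cache_read": 9_000_000, "input": 500_000, "output": 100_000}

# Placeholder prices in $/M tokens. The two models differ only in
# cache-read price, by a factor of 10.
model_a = {"cache_read": 0.10, "input": 0.60, "output": 2.40}
model_b = {"cache_read": 0.01, "input": 0.60, "output": 2.40}

cost_a = session_cost(tokens, model_a)
cost_b = session_cost(tokens, model_b)

print(f"model A: ${cost_a:.2f}, model B: ${cost_b:.2f}, ratio {cost_a / cost_b:.2f}x")
```

With these placeholder numbers a 10x gap in cache-read price becomes roughly a 2.3x gap in session cost, because uncached input and output tokens still count. The effective multiplier depends on how cache-heavy the workload really is.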
microbass@reddit
I love DS4 flash. No way this is 22x better.
Thomas-Lore@reddit
It is local, weights will be released soon.
Xhatz@reddit
For now I am a bit disappointed. First, I have to pay twice the price because they dropped the 10$ plan, and then... it just does not feel smarter, quite the opposite even. It BURNS through my quota while not doing a lot, struggling a lot on a simple web app, and it is so painfully slow compared to before. I really hope it will get better, for now it is not worth it IMO.
AppealSame4367@reddit
I expected a Chinese move to overtake the Americans.
It's still crazy to see it happen in real time.
Also: Where GPT 5.6? We know the Chinese and their sharing of knowledge between the labs + state sponsored minimal inference cost. They will hammer the market with 5 different variants of this in the next two weeks.
Ok_Warning2146@reddit
Wow, Minimax is such a fast follower. They already repackaged Deepseek's DSA to their MSA.
Ok_Warning2146@reddit
Great! Another cheap alternative.
JanErikJakstein@reddit
Why are most benchmarks better than GPT 5.5? The benchmaxxing has to hurt real usage of these models.
Revolutionalredstone@reddit
I absolutely love seeing OpenSource keep things honest ;D
tecneeq@reddit
It's not open source. It's not even open weight, or am i blind?
Revolutionalredstone@reddit
O.S. release is scheduled for 10 days from now, Enjoy!
tecneeq@reddit
Nice
lostinmahalway@reddit
I am here only waiting for reviews
CodePalAI@reddit
1m context sounds great but for coding i still care more about what it chooses to read. huge window full of junk is worse than small clean context. this is basically the whole fight in CodePal AI, not “can model see everything” but “can it see the right 12 files.”
Long_comment_san@reddit
Something tells me this is 600b+ tier
tecneeq@reddit
MiniMax 2.7 (which is open weight) is 230b.
Long_comment_san@reddit
Yeah but they have a model very close to Claude, no way it's not a deepseek competitor.
Nicoolodion@reddit
Really cool can't wait to try it
Lost_Foot_6301@reddit
is the minimax team anywhere near as good as the ZAI team?
WithoutReason1729@reddit
Your post is getting popular and we just featured it on our Discord! Come check it out!
You've also been given a special flair for your contribution. We appreciate your post!
I am a bot and this action was performed automatically.
Barubiri@reddit
I've been testing it and the model just stops working. I keep saying "please continue" and then it keeps thinking and then stops. It hasn't been able to finish even 1 request...
cantTankThisFox@reddit
Facing the exact same issue. Idk what is going on
Qwen30bEnjoyer@reddit
I've been daily driving MiniMax M2.7 since its release via the coding plan. It's past that critical threshold for serious work, and the only thing I felt held it back was a lack of multimodality.
Seeing that headline of multimodal work made me literally scream in excitement.
Charuru@reddit
sadly this is probably way bigger than 2.7
Qwen30bEnjoyer@reddit
Given how in some narrowly defined benchmarks it performs roughly the same - I really hope not.
fugogugo@reddit
any pricing info?
0xd34db347@reddit
.30/m in 1.20/m out on openrouter
Kazushi998@reddit
bet this is +300B in size
IsJaie55@reddit
Damn this looks hella massive
WhyLifeIs4@reddit
secunder73@reddit
we need some upscaler AI model to read this
dryadofelysium@reddit (OP)
https://filecdn.minimax.chat/public/img_v3_02128_b7726cd8-879a-4b7a-a9da-db4395ea597g-1780272508686.jpg
jonydevidson@reddit
That's some pretty good upscaling, wow!
Technical-Earth-3254@reddit
You got a link for that table? The readability (at least on mobile) is really bad
dryadofelysium@reddit (OP)
https://filecdn.minimax.chat/public/img_v3_02128_b7726cd8-879a-4b7a-a9da-db4395ea597g-1780272508686.jpg
Technical-Earth-3254@reddit
Appreciated. No Scicode score yet, I'm waiting for that. The rest looks decent, but most benchmarks sadly mean nothing right now.
kevin_1994@reddit
Needs moar jpeg
dryadofelysium@reddit (OP)
https://filecdn.minimax.chat/public/img_v3_02128_b7726cd8-879a-4b7a-a9da-db4395ea597g-1780272508686.jpg
grumd@reddit
https://filecdn.minimax.chat/public/img_v3_02128_b7726cd8-879a-4b7a-a9da-db4395ea597g-1780272508686.jpg
Link to full size image
texasdude11@reddit
really hope it can continue to fit in 2 of the 6K-Pros
LoveMind_AI@reddit
M3 might be a screamer. Just finished an early version of my typical "how well does it do instantiating dense, psychometrically grounded personality profiles?" benchmark and I'm both impressed as well as cautious. On the positive side the model writes very naturally - the developers did not make a code monkey without vibe. There's absolutely vibe to go along with the capability. The entity tracking in M3 is *EXCELLENT.* In terms of writing psychometrically dense profiles and reverse scoring them back to the original questionnaire data, it's terrific. Generating the actual life stories narratives that we use as the main part of the "can this model instantiate a very specific behavioral profile and transform that into open narrative in a way other models can recover the scores from," however, is difficult for M3. I haven't dug deeply enough into it to know why that is.
I'll be doing a standalone post on this later tonight, but wanted to throw something up VERY quickly for y'all to check out. As always, my reddit posts are 100% human generated - but all the text at these links is GPT-5.5 (specifically, a version of GPT-5.5 running on a specialized psychometrically dense profile with the handle "Cairn" - the Cairn profile + the GPT-5.5 model running in Codex is hands down the most stable, creative, and productive AI agent I have ever had a chance to use)
Plappedudel@reddit
Maybe I'm dumb, but when I connect to the official API, M3 is not available. I'm on the $20 plan. Has anyone else been able to use M3 via minimax.io? Or is it currently only available through Opencode?
CalligrapherFar7833@reddit
gguf when
No_Conversation9561@reddit
I’m thinking 400B to 500B
Darkmoon_AU@reddit
Go Minimax! Despite trailing in the AI zeitgeist, this team have stayed hungry, pushing their own research, not just copying others... looks like the hard work is starting to pay off. Full support!
rpkarma@reddit
MiniMax and StepFun are by far my favourite labs haha
dryadofelysium@reddit (OP)
As a massive Gacha/HoYoverse games enjoyer, I put them on my radar when HoYoverse decided to invest in MiniMax a short while ago and bet on them.
lochyw@reddit
Btw swe bench is ill-advised these days as it's not really a good or accurate benchmark anymore.
https://deepswe.datacurve.ai is the much better/more accurate alternative.
chawza@reddit
Is good? Better than DS v4 flash?
Iory1998@reddit
DS 4 combo is phenomenal!
WhyLifeIs4@reddit
we need that tech report stat
dryadofelysium@reddit (OP)
https://www.minimax.io/blog/minimax-m3
WhyLifeIs4@reddit
referring to
Legal-Ad-3901@reddit
They cooked
Beamsters@reddit
Neck and neck to gpt 5.5
Technical-Earth-3254@reddit
Oh wow, finally with vision. Anyone know the total parameter size and active parameters? The benches look promising, that's for sure. Is it native 16 bit? Or did they go the Deepseek Flash route?
Happythen@reddit
Not open weights on huggingface yet, but imminent!
tarruda@reddit
Is there any information on the number of parameters?
HugoCortell@reddit
Looking forward to seeing how it stacks against deepseek V4