AMA Announcement: MiniMax, The Opensource Lab Behind MiniMax-M2.5 SoTA Model (Friday, 8AM-11AM PST)


Less_Sandwich6926@reddit

What are the recommended server specifications to run MiniMax-M2.5 locally with good inference speeds (~30 tps)?


cyysky@reddit

MiniMax 2.5 full-precision FP8 running LOCALLY on vLLM with 8x Pro 6000. Hosting it is easier than I thought; it just reuses the same script as M2.1. Time to do the vibe coding test!

- Generation: 70 tokens per second, and 122 tokens per second for two connections
- Peak memory: 728 GB
- KV cache: 1,700,000 tokens
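A rough sketch of the arithmetic behind these figures. Two things here are assumptions, not statements from the post: that each "Pro 6000" card has 96 GB of VRAM, and that the 122 tokens per second is the combined rate across both connections. The names `ASSUMED_GB_PER_GPU` and `summarize` are illustrative only.

```python
# Figures quoted in the post.
NUM_GPUS = 8
PEAK_MEMORY_GB = 728
SINGLE_STREAM_TPS = 70
TWO_STREAM_TOTAL_TPS = 122  # assumed to be the aggregate across both connections

# Assumption (not stated in the post): 96 GB of VRAM per card.
ASSUMED_GB_PER_GPU = 96


def summarize() -> dict:
    """Derive headroom and per-connection throughput from the quoted numbers."""
    total_vram = NUM_GPUS * ASSUMED_GB_PER_GPU
    return {
        "total_vram_gb": total_vram,
        "headroom_gb": total_vram - PEAK_MEMORY_GB,
        "peak_gb_per_gpu": PEAK_MEMORY_GB / NUM_GPUS,
        "tps_per_connection_at_two": TWO_STREAM_TOTAL_TPS / 2,
        "aggregate_speedup_at_two": round(TWO_STREAM_TOTAL_TPS / SINGLE_STREAM_TPS, 2),
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

Under those assumptions, the setup would sit close to the total VRAM of the eight cards, and a second connection would cost each stream only a modest share of its single-stream speed.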


Neither-Idea-9365@reddit

where is the separate thread?


tarruda@reddit

Impressive to see major improvements while keeping the same architecture! What can you say about the size of a possible upcoming major release such as MiniMax M3 (assuming this is on the roadmap)? In other words, are you going to continue improving training and extracting more performance from similar LLM sizes, or are there plans to increase the size like z.ai did with GLM?


Top_Cattle_2098@reddit

We believe that even though the M2 size isn’t the largest, the M2 series is still the best open-source coding and agent model—mainly thanks to RL scaling. M3 is more powerful than M2 and will be available in the not-too-distant future. We hope it can reach the level of the best closed-source models, while also delivering breakthroughs of its own.


ptxtra@reddit

What is your roadmap? Will we see MiniMax 3 in the near future? How about multimodal models?


Top_Cattle_2098@reddit

We have two iteration roadmaps. Along the M2 series, we’ve been continuously strengthening capabilities in coding, tool calling, search, office/workspace, knowledge, and related areas—and after 2.5 there will be new versions as well. This progress mainly relies on reinforcement learning scaling. In fact, we may be the company that has updated its models most agilely over the past three months. We’ve spent a lot of time developing the M3 model, which is natively multimodal, and we hope it can push through some boundaries.


SAPPHIR3ROS3@reddit

Please please please can you release some 20/30b?


nebulaidigital@reddit

Nice, looking forward to this. The “Open-source lab behind MiniMax-M2.5 SoTA model” angle is especially interesting because it’s usually hard to separate model quality from the surrounding stack (data, evals, tooling, post-training). For the AMA, I’d love to hear specifics on: (1) what your evaluation harness looks like (public vs internal, contamination checks), (2) what you consider the key ablations that got you to “M2.5,” and (3) how you’re thinking about serving constraints for local users (quantization targets, context length tradeoffs, recommended runtimes). Also curious whether you’ll release recipes or just weights.


siegevjorn@reddit

Which consumer hardware is it most optimized for? What quant do you recommend to preserve its capability?


Best_Sail5@reddit

Do you plan on releasing Forge, the framework you used to train the model?


QuackerEnte@reddit

Hey, you guys offer a really great model for its size (as compared to the recent behemoth of a model, GLM-5). It gives us a chance at running it locally. My question is, are there ever gonna be smaller models? 30B MoEs, or smaller dense ones, or something like that?

Also, since you are publicly listed now and need to serve shareholder interests, one concern I have about companies that have IPO'd (like yourself, GLM's z.ai and such) is that they stop releasing open-weight models, or invest less in R&D, since research is much more expensive than just training a good model with existing, proven architectures. That could result in less innovative solutions. What is your stance on that? (As in: how does the research landscape look internally, for example? Any hints on interesting things you guys are working on behind the scenes? I heard that persistent memory, test-time learning, etc. are hot in research this year.)

Thank you for being here!


Silver-Champion-4846@reddit

There's gonna be (or already exists, dk) a separate thread for questions, the mod said that.


QuackerEnte@reddit

Missed that. Guess it'll serve as a note.


Significant_Fig_7581@reddit

Please, somebody ask them to make a lite model for our potato PCs, in the 20B-30B range.


Silver-Champion-4846@reddit

Fake legal message start: By virtue of the existence of computers that do not possess a graphics processing unit (GPU), I hereby forbid you from referring to any computer that is capable of inferencing a Large Language Model (LLM) above four billion parameters with the label 'potato PC'. Thank you for your understanding. Fake legal message end.


Miserable-Dare5090@reddit

Ask me anything: Why are we not seeing the weights on HF?


LegacyRemaster@reddit

9 hours from now


Swimming_Whereas8123@reddit

Very interested in the model, and the weights. Open weights are the way to go: engineers get involved, try the model on their DGXs, and then pitch it to the business for broad deployment. Just like OpenAI outsources their billing to Stripe, serious businesses will outsource inferencing since it is not their core business. This is how the open-weights business model works: getting engineers hyped and grabbing the company credit card to scale beyond local. Any fuel injected into the hype train will be lost whilst the brakes are engaged.


goodtimtim@reddit

with respect, kinda lame to come hype your new model on r/LocalLLaMA before releasing the weights


JockY@reddit

Heh, didn't think of this angle but yeah there's an irony of coming to LocalLlama when most of the questions are just gonna be "wen eta M2.5?"


Icy_Initiative_162@reddit

Ask me anything: Actually, at present, Step-3.5-Flash is still slightly too large for on-device inference. For instance, on an M3/M4 Max machine with 128 GB of RAM, the memory available for inference is quite limited. A more suitable size might be around 80–85 GB for the model weights, with approximately 10 GB reserved for the KV cache. That configuration would be more practical. Are there any plans to develop a subsequent model sized along those lines?
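The budget described here can be sketched as simple arithmetic. This is only an illustration of the numbers in the comment; `memory_budget` is a hypothetical helper, and how much memory the operating system actually lets an inference process use is not stated in the thread.

```python
def memory_budget(total_ram_gb: float, weights_gb: float, kv_cache_gb: float) -> float:
    """Return the GB left for the OS and other apps after weights and KV cache."""
    used = weights_gb + kv_cache_gb
    if used > total_ram_gb:
        raise ValueError("weights plus KV cache exceed total RAM")
    return total_ram_gb - used


if __name__ == "__main__":
    # 128 GB machine, 10 GB KV cache, and the 80-85 GB weight range from the comment.
    for weights_gb in (80, 85):
        remaining = memory_budget(128, weights_gb, 10)
        print(f"{weights_gb} GB weights -> {remaining} GB left for everything else")
```

With these inputs, 33 to 38 GB would remain for the operating system and other applications.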


Shivacious@reddit

How do you guys polish ya ballz question to minimax teams


XMasterrrr@reddit (OP)

Hi r/LocalLLaMA 👋

We're excited for Friday's guests: **The Core Team of MiniMax Lab and The Lab’s Founder!**

**Kicking things off Friday, Feb. 13th, 8 AM–11 AM PST**

⚠️ **Note:** The AMA itself will be hosted in a **separate thread**; please don’t post questions here.


mikael110@reddit

So does that mean they'll release the M2.5 weights before the AMA starts? If they don't, it will be a touch awkward, to say the least. And arguably against the spirit of r/LocalLLaMA AMAs.


Accomplished_Ad9530@reddit

Yeah that's a little odd. Even if they released the weights right now, people wouldn't really get to check it out before the AMA happens. I guess people will have questions about other stuff, but it seems like a missed opportunity.


noage@reddit

I think the better questions will be about the process and future plans more so than the experience with this specific model. It's not common to have this type of AMA, so I'm all for it.
