TheaterFire

AMA Announcement: MiniMax, The Opensource Lab Behind MiniMax-M2.5 SoTA Model (Friday, 8AM-11AM PST)

Posted by XMasterrrr@reddit | LocalLLaMA | View on Reddit | 28 comments

Hi r/LocalLLaMA 👋 We're excited for Friday's guests: **The Core Team of MiniMax Lab and The Lab’s Founder!** **Kicking things off Friday, Feb. 13th, 8 AM–11 AM PST** ⚠️ **Note:** The AMA itself will be hosted in a **separate thread,** please don’t post questions here.

Less_Sandwich6926@reddit

What are the recommended server specifications to run MiniMax-M2.5 locally with good inference speeds (`~30 tps`)?
View on Reddit #78371998

cyysky@reddit

MiniMax 2.5 full-precision FP8 running LOCALLY on vLLM with 8x Pro 6000. Hosting it is easier than I thought; it just reuses the same script as M2.1. Time to do the vibe coding test! Generation: 70 tokens per second, and 122 tokens per second with two connections. Peak memory: 728 GB. KV cache: 1,700,000 tokens.
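A rough sanity check of the figures reported above can be sketched in Python. The throughput numbers and the 728 GB peak come from the comment. Two things are assumptions: that the 122 tokens-per-second figure is the aggregate across both connections, and that each card has 96 GB of memory (the comment does not state the exact card model). All variable and function names are illustrative.

```python
# Figures reported in the comment above.
single_stream_tps = 70.0       # one connection
two_stream_total_tps = 122.0   # ASSUMPTION: aggregate across two connections
peak_memory_gb = 728.0
num_gpus = 8

# ASSUMPTION: 96 GB per card; the comment only says "Pro 6000".
assumed_gb_per_gpu = 96.0


def per_connection_tps(total_tps: float, connections: int) -> float:
    """Average generation speed seen by each connection."""
    return total_tps / connections


def batching_gain(total_tps: float, baseline_tps: float) -> float:
    """How much total throughput grows relative to a single connection."""
    return total_tps / baseline_tps


per_conn = per_connection_tps(two_stream_total_tps, 2)
gain = batching_gain(two_stream_total_tps, single_stream_tps)
total_vram_gb = num_gpus * assumed_gb_per_gpu
headroom_gb = total_vram_gb - peak_memory_gb

print(f"per-connection speed with two connections: {per_conn:.1f} tok/s")
print(f"aggregate throughput gain over one connection: {gain:.2f}x")
print(f"assumed total VRAM: {total_vram_gb:.0f} GB, headroom: {headroom_gb:.0f} GB")
```

Under these assumptions, each of the two connections sees about 61 tokens per second, which is still well above the ~30 tps asked about earlier in the thread, and the reported peak leaves about 40 GB of headroom across the eight cards.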
View on Reddit #78334215

Neither-Idea-9365@reddit

where is the separate thread?
View on Reddit #78333828

tarruda@reddit

Impressive to see major improvements while keeping the same architecture! What can you say about the size of a possible upcoming major release such as MiniMax M3 (assuming this is on the roadmap)? In other words, are you going to continue to improve training and extract more performance from similar LLM sizes, or are there plans to increase the size like z.ai did with GLM?
View on Reddit #78330020

Top_Cattle_2098@reddit

We believe that even though the M2 size isn’t the largest, the M2 series is still the best open-source coding and agent model—mainly thanks to RL scaling. M3 is more powerful than M2 and will be available in the not-too-distant future. We hope it can reach the level of the best closed-source models, while also delivering breakthroughs of its own.
View on Reddit #78332760

ptxtra@reddit

What is your roadmap? Will we see MiniMax 3 in the near future? How about multimodal models?
View on Reddit #78326996

Top_Cattle_2098@reddit

We have two iteration roadmaps. Along the M2 series, we’ve been continuously strengthening capabilities in coding, tool calling, search, office/workspace, knowledge, and related areas—and after 2.5 there will be new versions as well. This progress mainly relies on reinforcement learning scaling. In fact, we may be the company that has updated its models most agilely over the past three months. We’ve spent a lot of time developing the M3 model, which is natively multimodal, and we hope it can push through some boundaries.
View on Reddit #78332446

SAPPHIR3ROS3@reddit

Please please please can you release some 20/30b?
View on Reddit #78332119

nebulaidigital@reddit

Nice, looking forward to this. The “Open-source lab behind MiniMax-M2.5 SoTA model” angle is especially interesting because it’s usually hard to separate model quality from the surrounding stack (data, evals, tooling, post-training). For the AMA, I’d love to hear specifics on: (1) what your evaluation harness looks like (public vs internal, contamination checks), (2) what you consider the key ablations that got you to “M2.5,” and (3) how you’re thinking about serving constraints for local users (quantization targets, context length tradeoffs, recommended runtimes). Also curious whether you’ll release recipes or just weights.
View on Reddit #78326893

siegevjorn@reddit

Which consumer hardware is it most optimized for? What quant do you recommend to preserve its capability?
View on Reddit #78326748

Best_Sail5@reddit

Do you plan on releasing Forge, the framework you used to train the model?
View on Reddit #78324715

QuackerEnte@reddit

Hey, you guys offer a really great model for its size (as compared to the recent behemoth of a model, GLM-5). It gives us a chance at running it locally. My question is, are there ever gonna be smaller models? 30B MoEs, or smaller dense ones, or something like that? Also, since you are listed publicly now and need to fulfill shareholder interests, one concern I have about companies that IPO'd (like yourself, GLM's z.ai and such) is that they will stop releasing open-weight models, or invest less in R&D since it's much more expensive than just training a good model with existing, proven architectures, which could result in less innovative solutions. What is your stance on that? (As in: how does the research landscape look internally, for example? Any hints on interesting things you guys are working on behind the scenes? I heard that persistent memory, test-time learning, etc. are hot in research this year.) Thank you for being here!
View on Reddit #78306555

Silver-Champion-4846@reddit

There's gonna be (or already exists, dk) a separate thread for questions, the mod said that.
View on Reddit #78311130

QuackerEnte@reddit

Missed that; guess it'll serve as a note.
View on Reddit #78316137

Significant_Fig_7581@reddit

Please somebody ask them to make a lite model for our potato PCs, in the 20B-30B range.
View on Reddit #78302355

Silver-Champion-4846@reddit

Fake legal message start: By virtue of the existence of computers that do not possess a graphics processing unit (GPU), I hereby forbid you from referring to any computer that is capable of inferencing a Large Language Model (LLM) above four billion parameters with the label 'potato pc'. Thank you for your understanding. Fake legal message end.
View on Reddit #78311094

Miserable-Dare5090@reddit

Ask me anything: Why are we not seeing the weights on HF?
View on Reddit #78297598

LegacyRemaster@reddit

9 hours from now
View on Reddit #78310596

Swimming_Whereas8123@reddit

Very interested in the model, and the weights. Open weights is the way to go: more involved engineers try it on their DGXs and then pitch it to the business for broad deployment. Just like OpenAI outsources their billing to Stripe, serious businesses will outsource inferencing since it is not their core business. This is how the open-weights business model works: getting engineers hyped and grabbing the company credit card to scale beyond local. Any fuel injected into the hype train will be lost whilst the brakes are engaged.
View on Reddit #78305750

goodtimtim@reddit

with respect, kinda lame to come hype your new model on r/LocalLLaMA before releasing the weights
View on Reddit #78300768

__JockY__@reddit

Heh, didn't think of this angle but yeah there's an irony of coming to LocalLlama when most of the questions are just gonna be "wen eta M2.5?"
View on Reddit #78302220

Icy_Initiative_162@reddit

Ask me anything: Actually, at present, Step-3.5-Flash is still slightly too large for on-device inference. For instance, on an M3/M4 Max chip with 128GB of RAM, the memory available for inference is quite limited. A more suitable model size might be around 80–85 GB for the weights, with approximately 10 GB reserved for the KV cache. This configuration would be more practical. Are there any plans to develop a subsequent model based on such a memory budget?
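The memory budget described above can be worked through as a small calculation. This is only a sketch: the 128 GB total, the 80–85 GB weight budget, and the ~10 GB KV cache come from the comment, while the quantization widths (4 and 8 bits per weight) and the function name are assumptions chosen for illustration. Overhead such as quantization scales and activations is ignored, and 1 GB is taken as 10^9 bytes.

```python
def max_params_billions(weight_budget_gb: float, bits_per_weight: float) -> float:
    """Largest parameter count (in billions) whose weights fit in the budget.

    With 1 GB = 1e9 bytes, GB divided by bytes-per-parameter gives billions
    of parameters directly.
    """
    bytes_per_weight = bits_per_weight / 8.0
    return weight_budget_gb / bytes_per_weight


total_ram_gb = 128.0        # from the comment
kv_cache_gb = 10.0          # from the comment
weight_budgets_gb = (80.0, 85.0)  # from the comment

# ASSUMPTION: illustrative quantization widths, not stated in the comment.
assumed_bit_widths = (4.0, 8.0)

for budget in weight_budgets_gb:
    # Memory left for the OS and other applications after weights and KV cache.
    remaining = total_ram_gb - budget - kv_cache_gb
    for bits in assumed_bit_widths:
        params = max_params_billions(budget, bits)
        print(
            f"{budget:.0f} GB of weights at {bits:.0f}-bit: "
            f"~{params:.0f}B params, {remaining:.0f} GB left over"
        )
```

Under these assumptions, an 80–85 GB weight budget corresponds to roughly 160–170 billion parameters at 4 bits per weight, or 80–85 billion at 8 bits, leaving 33–38 GB of the 128 GB for everything else.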
View on Reddit #78299699

Shivacious@reddit

How do you guys polish ya ballz question to minimax teams
View on Reddit #78299633

XMasterrrr@reddit (OP)

Hi r/LocalLLaMA 👋 We're excited for Friday's guests: **The Core Team of MiniMax Lab and The Lab’s Founder!** **Kicking things off Friday, Feb. 13th, 8 AM–11 AM PST** ⚠️ **Note:** The AMA itself will be hosted in a **separate thread,** please don’t post questions here.
View on Reddit #78293437

mikael110@reddit

So does that mean they'll release the M2.5 weights before the AMA starts? If they don't, it will be a touch awkward to say the least. And arguably against the spirit of r/LocalLLaMA AMAs.
View on Reddit #78294618

Accomplished_Ad9530@reddit

Yeah that's a little odd. Even if they released the weights right now, people wouldn't really get to check it out before the AMA happens. I guess people will have questions about other stuff, but it seems like a missed opportunity.
View on Reddit #78295191

noage@reddit

I think the better questions will be about the process and future plans more so than the experience with this specific model. It's not common to have this type of AMA, so I'm all for it.
View on Reddit #78295580