Xiaomi MiMo - MiMo-7B-RL

Posted by AaronFeng47@reddit | LocalLLaMA | 18 comments

[https://huggingface.co/XiaomiMiMo/MiMo-7B-RL](https://huggingface.co/XiaomiMiMo/MiMo-7B-RL)

**Short Summary by Qwen3-30B-A3B:**

This work introduces *MiMo-7B*, a series of reasoning-focused language models trained from scratch, demonstrating that small models can achieve exceptional mathematical and code reasoning capabilities, even outperforming larger 32B models. Key innovations include:

* **Pre-training optimizations**: Enhanced data pipelines, multi-dimensional filtering, and a three-stage data mixture (25T tokens) with *Multiple-Token Prediction* for improved reasoning.
* **Post-training techniques**: Curated 130K math/code problems with rule-based rewards, a difficulty-driven code reward for sparse tasks, and data re-sampling to stabilize RL training.
* **RL infrastructure**: A *Seamless Rollout Engine* accelerates training/validation by 2.29×/1.96×, paired with robust inference support.

MiMo-7B-RL matches OpenAI’s o1-mini on reasoning tasks, with all models (base, SFT, RL) open-sourced to advance the community’s development of powerful reasoning LLMs.

https://preview.redd.it/rhbeynh1awxe1.png?width=714&format=png&auto=webp&s=78ac27cfa4b73b3fcc1cb591f7a1a7b314700ec2
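For anyone wondering what a "rule-based reward" for math problems can look like in practice, here is a minimal sketch. It is not MiMo's actual implementation: it assumes the model writes its final answer inside `\boxed{...}` and that an exact string match against a reference answer is good enough. The function names `extract_boxed` and `math_reward` are made up for illustration.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in text, or None.

    Assumption: the final answer is wrapped in \\boxed{}. Nested braces
    (e.g. \\frac{1}{2}) are handled by tracking brace depth.
    """
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars).strip()
        chars.append(ch)
    return None  # braces never closed


def math_reward(response: str, reference: str) -> float:
    """Hypothetical binary reward: 1.0 on an exact match, else 0.0."""
    answer = extract_boxed(response)
    if answer is None:
        return 0.0
    # Ignore spaces so "1 / 2" and "1/2" compare equal.
    same = answer.replace(" ", "") == reference.replace(" ", "")
    return 1.0 if same else 0.0
```

A real verifier would likely normalize equivalent forms (fractions vs. decimals, units, and so on), which this sketch deliberately skips.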

18 Comments

Holiday_Attitude_200@reddit

An in-depth discussion of MiMo-7B: [https://www.youtube.com/watch?v=y6mSdLgJYQY&ab_channel=AIonAI](https://www.youtube.com/watch?v=y6mSdLgJYQY&ab_channel=AIonAI)

Dangerous-Yak3976@reddit

GGUF: [https://huggingface.co/jedisct1/MiMo-7B-RL-GGUF](https://huggingface.co/jedisct1/MiMo-7B-RL-GGUF)

ReasonablePossum_@reddit

nice they even included hardware reqs

AnomalyNexus@reddit

It's incredibly chatty on the thinking. 2500+ token response to

> tell me a joke

...on the plus side it wasn't the one about atoms that LLMs love so much
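If you want to check how much of a response is thinking versus actual answer, something like the sketch below works. It assumes the model wraps its reasoning in `<think>...</think>` tags, which is not confirmed anywhere in this thread, and it counts whitespace-separated words as a rough stand-in for tokens. `split_thinking` and `rough_token_count` are hypothetical helper names.

```python
import re
from typing import Tuple

# Assumption: reasoning is delimited by <think>...</think> tags.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(response: str) -> Tuple[str, str]:
    """Split a raw response into (thinking, answer).

    If no <think> block is found, thinking is the empty string.
    """
    match = THINK_RE.search(response)
    if match is None:
        return "", response.strip()
    thinking = match.group(1).strip()
    # Everything outside the first think block counts as the answer.
    answer = (response[:match.start()] + response[match.end():]).strip()
    return thinking, answer


def rough_token_count(text: str) -> int:
    """Whitespace word count; a crude proxy, not the model's tokenizer."""
    return len(text.split())
```

For an exact number you would run the text through the model's own tokenizer instead of counting words.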