MiniCPM4: 7x the decoding speed of Qwen3-8B

Posted by Lynncc6@reddit | LocalLLaMA

MiniCPM4 is a highly efficient large language model designed for edge devices. It has been optimized across four dimensions: model architecture, learning algorithms, training data, and inference systems.

https://github.com/OpenBMB/MiniCPM/blob/main/README-en.md