For Non-hallucinating work, MiMo 2.5 delivers
Posted by Beamsters@reddit | LocalLLaMA | View on Reddit | 19 comments
MIT license and fully open source. MiMo-V2.5-Pro was just 3 points shy of Opus 4.7 max, and the normal V2.5 is only a step behind SOTA. More importantly, they hit 75% and 68% non-hallucination rates respectively. Best intelligence-to-hallucination trade-off of any model yet.
V2.5 in FP8 is around 316GB; you *might* be able to squeeze a tight 3-bit quant onto a 128GB M5 Max.
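The 3-bit-on-128GB question is easy to sanity-check. A back-of-the-envelope sketch (only the 316 GB FP8 figure comes from the post; the 10% overhead factor for quantization scales and embeddings is my assumption):

```python
def quant_size_gb(fp8_size_gb, bits, overhead=1.10):
    """Approximate weight size after requantizing 8-bit weights to `bits` bits.

    `overhead` pads ~10% for quantization scales, embeddings, etc.
    (an assumed fudge factor, not a measured one).
    """
    return fp8_size_gb * (bits / 8) * overhead

print(316 * 3 / 8)                   # 118.5 GB raw weights -- why "tight" is the right word
print(round(quant_size_gb(316, 3)))  # ~130 GB with overhead -- likely over a 128 GB M5 Max
```

So whether it fits comes down to the real per-quant overhead and how much unified memory is left for the KV cache.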
From Gemma to Qwen3.6 to Kimi2.6 to Deepseek v4 to MiMo2.5, this is probably the best April yet.


InteractionSmall6778@reddit
The 75% non-hallucination rate is the headline, but the real story is what that means for retrieval and tool use in agents - models that reliably don't confabulate references unlock use cases that were too risky with most frontier models.
The 3-bit quant path for 128GB M5 Max will be worth watching.
Glittering-Call8746@reddit
Turbo3, right?
Beamsters@reddit (OP)
Maybe also good for retrieving facts, grading chapters, and summarizing characters from books.
Specter_Origin@reddit
How is the token efficiency? When they released it initially, they were heavily emphasizing how token-efficient the model is.
coder543@reddit
Token efficiency seems quite good.
nuclearbananana@reddit
That's not respectable, that's worse than k2.6
coder543@reddit
Huh? You want fewer tokens, not more. K2.6 is using twice as many tokens; mimo-v2.5 is much more efficient.
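"Twice as many tokens" is just a relative-efficiency ratio; a minimal sketch with illustrative numbers (the thread gives no exact token counts):

```python
def token_efficiency_ratio(tokens_model, tokens_baseline):
    """Tokens the baseline spends per token the model spends on the same
    task set; a ratio > 1 means the model is more token-efficient."""
    return tokens_baseline / tokens_model

# Illustrative only: if K2.6 spends 2,000 tokens where mimo-v2.5 spends 1,000...
print(token_efficiency_ratio(1_000, 2_000))  # 2.0 -- same work in half the tokens
```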
nuclearbananana@reddit
Oh I thought OP was asking about ds v4. Nvm
coder543@reddit
Also:
Beamsters@reddit (OP)
They're pretty good, I'd say; only reasoning models were selected here.
zdy132@reddit
Another interesting thing in the second graph is how bad the DeepSeek V4 models are doing. Are they particularly prone to hallucination?
Technical-Earth-3254@reddit
DS models were always prone to hallucinate. V4 is still in preview, keep that in mind (but I doubt it will surpass V3.2). Mimo is for sure completely out of reach.
Kodix@reddit
Yep. Excellent for creative writing (really impressed me, tried similar Polish language prompt on several models and Deepseek was by far the best), but kinda awful for structured work. Mimo 2.5 flagged *so many* issues that Deepseek introduced to a project that I just dropped it from consideration.
Deepseek flash is amazingly cheap per token for the quality, though.
pigeon57434@reddit
In general I'm massively disappointed in DeepSeek V4, but my coping mechanism tells me "they said it was only a preview." The likely cause is, unfortunately, just that its pretraining token count was almost zero compared to other trillion-scale OSS models.
Asleep-Dot5479@reddit
Had it happen a few times already. Even when asked if they're sure, they insist and invent proof.
sammybeta@reddit
I tried DeepSeek V4 Pro on the first night and was not impressed. MiMo 2.5 is significantly better and more efficient.
Now, with that huge 90% off discount on DeepSeek, I can tolerate it using a bit more tokens to slowly get things right.
zdy132@reddit
Same, I tried it a bit and it was meh. Not great, not terrible.
But at less than a quarter of the price of MiMo 2.5, it will stay my main agent while the discount is live.
EmotionalLock6844@reddit
I've been testing 2.5 Pro as an orchestrator and I can tell you it's at least 2x better than GPT 5.5 at that. It's insanely efficient and smart at parallel subagent orchestration: constantly running 5-8 parallel lanes in a single project across parallel worktrees with no issues. Almost flawless at merging worktrees back to main and resolving conflicts. I'm totally impressed!
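The parallel-worktree pattern described above can be reproduced with plain git; a minimal sketch using a throwaway repo (the lane/branch names are made up, and the orchestrator model itself is not shown):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git -c user.email=a@example.com -c user.name=agent commit -q --allow-empty -m init

# One worktree per parallel "lane"; each lane works on its own branch.
for lane in 1 2; do
  git worktree add -q "$repo-lane-$lane" -b "agent/lane-$lane"
done

# A subagent commits in its lane without touching the main checkout...
( cd "$repo-lane-1" && echo done > task.txt && git add task.txt &&
  git -c user.email=a@example.com -c user.name=agent commit -qm "lane 1 work" )

# ...and the orchestrator merges the lane back into main (a fast-forward here;
# real conflict resolution is what the model handles).
git merge -q agent/lane-1
```

Each worktree is an independent checkout sharing one object store, which is why agents can run in parallel without clobbering each other's files.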
ghgi_@reddit
MiMo is my favorite Chinese model lately, even nicer than Qwen, Kimi, and DeepSeek. It checks nearly all the boxes; its coding performance isn't as good as Claude or GPT, but that's fine for the 99% of tasks that aren't hardcore projects. It works very well alongside other models, either as a helper or an assistant, and I've had good results using it as an agent for automated tasks.