MiniCPM5-1B

[-]

koloved@reddit

It feels quite dull for the benchmarks it shows. Any larger model can already run on a CPU, and will give significantly better results.

[-]

bidutree@reddit

Model is available at Ollama for those who want to try it there.

[-]

alloxrinfo@reddit

It's making a mess in LM Studio even though I've tried a bunch of different settings, which is weird because it behaves completely differently on the Hugging Face testing page.

[-]

alloxrinfo@reddit

The MLX ones were working a bit better

[-]

coder543@reddit

Looking at the accuracy chart, it appears that the model refused to answer every question, so... that's something!

[-]

kevin_1994@reddit

Reminds me of when I was a university student and I trained a neural network to determine if a number was prime or not. I was so excited when I saw 90%+ accuracy on millions of samples, then I realised it had learned to just guess "not prime" for every number lol
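For anyone curious how far a constant guess gets you, here is a minimal sketch. It assumes the samples are drawn uniformly from 1..N, which the story above doesn't actually specify, and the function names are made up for illustration. It counts primes with a sieve and reports the accuracy of answering "not prime" every time.

```python
def prime_flags(limit):
    """Sieve of Eratosthenes: flags[n] is True when n is prime, for 0 <= n <= limit."""
    flags = [True] * (limit + 1)
    flags[0:2] = [False, False]  # 0 and 1 are not prime
    for n in range(2, int(limit ** 0.5) + 1):
        if flags[n]:
            # Cross out every multiple of n, starting at n*n
            for multiple in range(n * n, limit + 1, n):
                flags[multiple] = False
    return flags


def always_not_prime_accuracy(limit):
    """Accuracy on 1..limit of a 'classifier' that answers 'not prime' every time."""
    primes = sum(prime_flags(limit))
    return (limit - primes) / limit


# Assumption: samples are uniform over 1..limit
for limit in (1_000, 1_000_000):
    print(limit, f"{always_not_prime_accuracy(limit):.1%}")
# prints:
# 1000 83.2%
# 1000000 92.2%
```

Primes get sparser as the range grows, so the constant guess clears 90% once the range reaches the millions. Accuracy alone says nothing here; recall on the prime class would be 0.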

[-]

MuDotGen@reddit

Maybe it was trained on contrarian works and found a loophole in every question's motives for being asked?

[-]

10minOfNamingMyAcc@reddit

AGI right here!

[-]

psylenced@reddit

Just invert the answer!

[-]

1337Captain@reddit

It's overfitted

[-]

Interpause@reddit

99% hallucination rate seems truly useful for RNG

[-]

coder543@reddit

This is 99% non-hallucination, not 99% hallucination.

[-]

sterby92@reddit

Did anyone get tool calling to work with llama.cpp and Open WebUI? For me it spits out broken, half-finished tool calls.

[-]

And1mon@reddit

Yeah, something seems off. I can't enable thinking either.

[-]

Prize_Negotiation66@reddit

What is the best quant for models this small?

[-]

Healthy-Nebula-3603@reddit

So small :)

[-]

DaleCooperHS@reddit

What's wrong with it being small? Maybe it has other qualities... maybe it's funny, and romantic, and caring. Nothing wrong with being small, OK!

[-]

Healthy-Nebula-3603@reddit

Hehe

[-]

DigiDecode_@reddit

🤯🤯🤯

[-]

jake_that_dude@reddit

the sleeper spec is 131k context on a 1.08B model, with only ~680M non-embedding params. that makes it more interesting as a local tool router than a chat model: cheap enough to sit in front of bigger models, long enough to carry repo/docs context, and enable_thinking=false gives you the fast path when you only need JSON/tool args.
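A rough sketch of that router idea, under loud assumptions: `small_model` and `big_model` are hypothetical callables that take a request string and return text, the `TOOLS` registry and the `{"name": ..., "arguments": ...}` shape are illustrative, and nothing here calls MiniCPM or any real inference API. The small model is asked for a JSON tool call; anything malformed or truncated falls through to the bigger model.

```python
import json

# Hypothetical tool registry; names and signatures are illustrative only.
TOOLS = {
    "search_docs": lambda query: f"searching docs for {query!r}",
    "read_file": lambda path: f"reading {path}",
}


def parse_tool_call(raw):
    """Return (name, args) if raw is a complete, well-formed tool call, else None."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return None  # truncated or half-finished JSON lands here
    if not isinstance(call, dict):
        return None
    name, args = call.get("name"), call.get("arguments")
    if not isinstance(name, str) or name not in TOOLS or not isinstance(args, dict):
        return None
    return name, args


def route(request, small_model, big_model):
    """Try the cheap model first; fall back to the big one on unusable output."""
    parsed = parse_tool_call(small_model(request))
    if parsed is None:
        return big_model(request)
    name, args = parsed
    try:
        return TOOLS[name](**args)
    except TypeError:
        # Argument names did not match the tool's signature
        return big_model(request)


# Stub models standing in for real inference calls (assumptions, not real APIs)
def good_small(request):
    return '{"name": "read_file", "arguments": {"path": "README.md"}}'


def broken_small(request):
    return '{"name": "read_file", "argu'


def big(request):
    return "handled by big model"


print(route("open the readme", good_small, big))    # reading README.md
print(route("open the readme", broken_small, big))  # handled by big model
```

The fallback matters for a model this size: a strict parse keeps a half-finished tool call from ever reaching a tool.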
