Local Models is the Way - I cannot believe what I just saw
Posted by Southern_Sun_2106@reddit | LocalLLaMA | 16 comments
So there's a meme going around in Claude Code right now about the 'strawperry'. I thought it was a joke!
Then I ran this in the real Claude app:

And the exact same question asked of Unsloth's Qwen 27B UD Q6_K_XL GGUF:

Mindblowing... on so many levels.
Fabulous_Fact_606@reddit
QWEN27B ALSO:
DeltaSqueezer@reddit
Qwen3-1.7B can handle it too ;)
versking@reddit
What local framework / interface is this?
Fabulous_Fact_606@reddit
Custom context injection. RAG: a CAGRA GPU vector database with Qwen3-8B embeddings (4096-dim vectors), plus a bunch of other stuff to surface 15 facts out of 200K at context-injection time.
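For anyone unfamiliar with the setup described above, here is a minimal, hypothetical sketch of the retrieval step: embed the query, find the top-k most similar fact vectors, and prepend those facts to the prompt. Plain NumPy cosine similarity stands in for the commenter's CAGRA GPU index and Qwen3-8B embedder; the dimensions, fact strings, and function names are all illustrative.

```python
import numpy as np

def top_k_facts(query_vec, fact_vecs, facts, k=15):
    """Return the k facts whose embeddings are most similar to the query.
    Stand-in for a GPU ANN index (e.g. CAGRA); vectors assumed L2-normalized."""
    sims = fact_vecs @ query_vec            # cosine similarity via dot product
    idx = np.argsort(-sims)[:k]             # indices of the k highest scores
    return [facts[i] for i in idx]

# Toy demo: 200 random "fact" embeddings of dim 8; surface the top 15.
rng = np.random.default_rng(0)
fact_vecs = rng.normal(size=(200, 8))
fact_vecs /= np.linalg.norm(fact_vecs, axis=1, keepdims=True)
facts = [f"fact-{i}" for i in range(200)]

query = fact_vecs[42]                       # pretend the query matches fact 42
selected = top_k_facts(query, fact_vecs, facts)
prompt = "Known facts:\n" + "\n".join(selected) + "\n\nQuestion: ..."
```

The selected facts then get injected ahead of the user's question, which is what lets a small local model answer from retrieved context rather than from its weights alone.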
Southern_Sun_2106@reddit (OP)
Awesome! Local assistants rule! <3
overflow74@reddit
tbh, just wasting tons of resources on trick questions is really meaningless. Maybe doing some real benchmarking instead of just trying to fool the models would be better? (That's just my point of view; I don't mean any disrespect, of course.)
Southern_Sun_2106@reddit (OP)
Of course, this is no benchmark. I don't understand how a 0.5-1.5T parameter model cannot get this question while a 27B can. The question is so random, it cannot be in any of their training data. I cannot wrap my head around it.
thewhzrd@reddit
It eats training data; it doesn't regurgitate it. It flows along weighted params and finds answers. A lot of models are still based off the original training set, built on top, so as time went on, more and more data allowed the weights to change how they point to things.
the320x200@reddit
Google "LLM tokenization". You're basically showing the LLM it's equivalent of an optical illusion and then freaking out that it happens.
overflow74@reddit
There's also the question about whether or not you should take your car (when the distance is very small), and many more lol
overflow74@reddit
Have you considered the nature of transformers? Perhaps it's because you have different tokenizers here? After all, these models don't really think.
jkh911208@reddit
I don't personally ask those questions as benchmarks, but they're also benchmark questions for better reasoning.
Southern_Sun_2106@reddit (OP)
I am glad fellow scientists can reproduce the results.