Local Models is the Way - I cannot believe what I just saw
Posted by Southern_Sun_2106@reddit | LocalLLaMA | 16 comments
So there's a meme going around in Claude Code right now about the 'strawperry'. I thought it was a joke!
Then I ran this in the real Claude app:

And the exact same question asked of Unsloth's Qwen 27B UD Q6_K_XL GGUF:

Mindblowing... on so many levels.
Fabulous_Fact_606@reddit
QWEN27B ALSO:
DeltaSqueezer@reddit
Qwen3-1.7B can handle it too ;)
versking@reddit
What local framework / interface is this?
Fabulous_Fact_606@reddit
Custom context injection. RAG: a CAGRA GPU vector database with Qwen3-8B embeddings (4096-dim vectors), plus a bunch of other stuff to surface 15 facts out of 200K at context-injection time.
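For anyone unfamiliar with the setup described above, here is a minimal, hypothetical sketch of the retrieval step: embed the query, find the top-k most similar fact vectors, and prepend those facts to the prompt. Plain NumPy cosine similarity stands in for the commenter's CAGRA GPU index and Qwen3-8B embedder; the dimensions, fact strings, and function names are all illustrative.

```python
import numpy as np

def top_k_facts(query_vec, fact_vecs, facts, k=15):
    """Return the k facts whose embeddings are most similar to the query.
    Stand-in for a GPU ANN index (e.g. CAGRA); vectors assumed L2-normalized."""
    sims = fact_vecs @ query_vec            # cosine similarity via dot product
    idx = np.argsort(-sims)[:k]             # indices of the k highest scores
    return [facts[i] for i in idx]

# Toy demo: 200 random "fact" embeddings of dim 8; surface the top 15.
rng = np.random.default_rng(0)
fact_vecs = rng.normal(size=(200, 8))
fact_vecs /= np.linalg.norm(fact_vecs, axis=1, keepdims=True)
facts = [f"fact-{i}" for i in range(200)]

query = fact_vecs[42]                       # pretend the query matches fact 42
selected = top_k_facts(query, fact_vecs, facts)
prompt = "Known facts:\n" + "\n".join(selected) + "\n\nQuestion: ..."
```

The selected facts then get injected ahead of the user's question, which is what lets a small local model answer from retrieved context rather than from its weights alone.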
Southern_Sun_2106@reddit (OP)
Awesome! Local assistants rule! <3
overflow74@reddit
tbh, just wasting tons of resources on trick questions is really meaningless. Maybe doing some real benchmarking instead of just trying to fool the models would be better? (That's just my point of view; I don't mean any disrespect, of course.)
Southern_Sun_2106@reddit (OP)
Of course, this is no benchmark. I don't understand how a 0.5-1.5T parameter model cannot get this question while a 27B can. The question is so random, it cannot be in any of their training data. I cannot wrap my head around it.
thewhzrd@reddit
It eats training data; it doesn't regurgitate it. It flows along weighted params and finds answers. A lot of models are still based off the original training set, built on top, so as time went on, more and more data allowed the weights to change how they point to things.
the320x200@reddit
Google "LLM tokenization". You're basically showing the LLM it's equivalent of an optical illusion and then freaking out that it happens.
overflow74@reddit
There's also the question about whether or not you should take your car (when the distance is very small), and many more lol
overflow74@reddit
Have you considered the nature of transformers? Perhaps it's because you have different tokenizers here? After all, these models don't really think.
jkh911208@reddit
I don't personally ask those questions as benchmarks, but they're also benchmark questions for better reasoning.
Southern_Sun_2106@reddit (OP)
I am glad fellow scientists can reproduce the results.