Why do some small/medium models fail at grammar-checking tasks?
Posted by Character-Blood3482@reddit | LocalLLaMA | 4 comments
Recently, I tried playing with gemma 4 (gemma-4-E4B-it-Q5_K_S.gguf) and found that it fails at an easy grammar check (it tries to fix the already correct word "contemporary"). I noticed the same mistake from openai/gpt-oss-20b and qwen3-next-80b-a3b-instruct.
Prompt:
grammar check: Although the Western Roman Empire collapsed in 476 CE, its sociopolitical and legal legacy continues to exert a profound influence on the institutional frameworks of the contemporary world.
Output:
The sentence has two spelling errors.
Here is the corrected version:
**Although** the Western Roman Empire collapsed in 476 CE, its sociopolitical and legal legacy continues to exert a profound influence on the institutional frameworks of the **contemporary** world.
***
### Corrections Made:
- **Although** $\rightarrow$ **Although** (Spelling correction)
- **contemporary** $\rightarrow$ **contemporary** (Spelling correction)
mpasila@reddit
It's definitely just a tokenization issue; the model probably hasn't been trained to do that sort of job specifically. It's the same reason LLMs can't tell how many Rs are in "strawberry".
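To make the "strawberry" point concrete, here's a minimal sketch. The subword split below is made up purely for illustration (it is not taken from any real model's tokenizer); the point is that the model operates on token chunks, while the letter count is only visible at the character level.

```python
# Illustrative only: a subword tokenizer might split "strawberry" into
# chunks like these (the exact split is hypothetical, not from a real model).
tokens = ["str", "aw", "berry"]

# The model reasons over token IDs, not characters, so letter-level facts
# like "how many r's?" are not directly present in its input.
text = "".join(tokens)
assert text == "strawberry"

# A plain character-level count answers the question trivially:
print(text.count("r"))  # prints 3
```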
Kodix@reddit
Yeah, this is an issue I've noticed with gpt-oss-20b as well.
Did you generate the misspelled sentence with another model? That's how it happened to me: I was testing models, used Qwen to generate some code, then used gpt-oss to check it later.
As far as I can tell, the issue is something like the whitespace around the word changing the exact tokens used to represent it. Or something like that; I'm not *at all* clear on the specifics, and it still sounds strange to me.
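The whitespace idea can be sketched with a toy greedy longest-match tokenizer over a made-up vocabulary. Everything here is hypothetical (the vocabulary entries and the resulting splits are invented for illustration, not taken from any real tokenizer), but it shows how a leading space can change which subword tokens represent the very same word:

```python
# Made-up vocabulary: BPE-style tokenizers commonly store " word" (with a
# leading space) and "word" as different entries with different splits.
VOCAB = {
    " contemporary",          # mid-sentence form: a single token
    "contemp", "orary",       # start-of-text form: two tokens
    # single characters as a fallback so tokenization never fails
    " ", "c", "o", "n", "t", "e", "m", "p", "r", "a", "y",
}

def tokenize(text):
    """Greedy longest-match tokenization over the toy vocabulary."""
    tokens, i = [], 0
    while i < len(text):
        # take the longest vocabulary entry starting at position i
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
    return tokens

print(tokenize(" contemporary"))  # [' contemporary']     -- one token
print(tokenize("contemporary"))   # ['contemp', 'orary']  -- different tokens
```

So the "same" word arrives at the model as entirely different token sequences depending on context, which would plausibly make a spelling judgment unstable.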
What I **do** know is that once I allowed gpt-oss to "fix" the misspelling, it never mentioned it again in future passes.
Character-Blood3482@reddit (OP)
The text in the prompt was generated by gemini.google.com, and I copied and pasted it into other models in llama.cpp for testing. So it's interesting to know the reason behind this. I hope future small models will fix this issue of the same word having different token encodings.
Plastic-Stress-6468@reddit
For what it's worth, I have had pre-lobotomized Gemini in January hallucinate grammar and spelling errors too.
Even the latest and greatest cloud models still hallucinate for whatever reason, and AI is not a replacement for a final pass by a human copy editor. I'm not really expecting local models to outdo cloud models in this department.