I kept running into cases where retrieval was “working” but the model still gave bad answers

Posted by Silly-Effort-6843@reddit | LocalLLaMA

I kept running into cases where retrieval was “working” but the model still gave bad answers.

So I tried something simpler:

- take ONE doc

- put it in a clean context

- ask a question that depends entirely on it

- check if the model actually uses the info
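The four steps above can be sketched as a tiny harness. `ask_model` here is a stand-in for whatever chat API you use (Groq, a local server, etc.), and the prompt wording is just one reasonable choice, not the author's exact setup:

```python
# Single-doc eval: one doc in a clean context, one question that depends
# entirely on it, then a check that the answer actually uses the doc.

def build_prompt(doc: str, question: str) -> str:
    """Fresh context each run: only the doc and the question, no retrieval."""
    return (
        "Answer using ONLY the document below.\n\n"
        f"--- DOCUMENT ---\n{doc}\n--- END DOCUMENT ---\n\n"
        f"Question: {question}"
    )

def eval_doc(doc: str, question: str, must_contain: str, ask_model) -> bool:
    """Pass iff a key fact from the doc shows up in the model's answer."""
    answer = ask_model(build_prompt(doc, question))
    return must_contain.lower() in answer.lower()

# Illustration with a fake model (a real run would call your API here):
doc = "Customer keys are formatted as CUST-{id}, e.g. id 42 -> CUST-42."
passed = eval_doc(doc, "What is the key for id 42?", "CUST-42",
                  ask_model=lambda prompt: "The key is CUST-42.")
```

Run this over each doc in the set and you get a per-doc pass/fail instead of a fuzzy "retrieval looks fine" signal.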

Ran this on ~20 docs (schemas, join logic, etc.) with LLaMA via Groq.

What helped a lot:

- shorter docs (<500 words)

- tables > paragraphs

- showing transformations explicitly (e.g. CUST-{id}) instead of describing them

- code examples > explanations
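To make the "explicit transformation, tables, code" points concrete, here are two versions of the same (made-up) doc snippet; the field names and the `CUST-{id}` rule are illustrative, matching the example from the post:

```python
# Implicit prose: the kind of doc the model tended to ignore.
doc_implicit = (
    "Customer identifiers are derived from the numeric id by "
    "prefixing a short code that the system usually applies."
)

# Explicit table + worked example: the format that held up in the evals.
doc_explicit = """\
| source field | target field | transformation    | example       |
|--------------|--------------|-------------------|---------------|
| customers.id | customer_key | "CUST-" + str(id) | 42 -> CUST-42 |
"""

def customer_key(customer_id: int) -> str:
    # The transformation written out as code, not described in prose.
    return f"CUST-{customer_id}"
```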

What didn’t work well:

- long narrative explanations

- implicit logic (“the system usually does X…”)

Example:

A Postgres → Mongo join kept failing until I literally wrote out the transformation format (CUST-{id}) in the doc. After that it worked consistently.
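A minimal sketch of the join in question, with made-up table and field names: the mismatch is Postgres integer ids vs Mongo string keys, and the fix is spelling out the `CUST-{id}` mapping instead of describing it:

```python
# Rows as they'd come back from each store (toy data, illustrative names).
pg_rows = [{"id": 42, "name": "Acme"}]                   # Postgres side
mongo_docs = [{"customer_key": "CUST-42", "mrr": 990}]   # Mongo side

def to_mongo_key(pg_id: int) -> str:
    # The transformation the doc has to state explicitly.
    return f"CUST-{pg_id}"

joined = [
    {**row, **doc}
    for row in pg_rows
    for doc in mongo_docs
    if doc["customer_key"] == to_mongo_key(row["id"])
]
# joined[0] -> {"id": 42, "name": "Acme", "customer_key": "CUST-42", "mrr": 990}
```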

Curious if others are doing something like this, or just relying on retrieval + prompt tuning?