Ollama retaining history?

Posted by DimensionEnergy@reddit | LocalLLaMA | 11 comments

So I've hosted Ollama locally on my system at http://localhost:11434/api/generate and was testing it out a bit, and it seems that between separate fetch calls, Ollama is retaining some memory.
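For reference, the fetch calls look roughly like this (a minimal sketch; the model name and prompt are placeholders, not the ones I actually used). One thing worth noting: the /api/generate response includes a `context` array, and passing that array back in the next request's `context` field is what continues a conversation, so I'm deliberately leaving it out here so each call should be independent.

```javascript
// Minimal sketch of a stateless call to Ollama's /api/generate.
// "llama3" and the prompt text are placeholders.
const body = {
  model: "llama3",
  prompt: "Explain topic 1 using keyword ALPHA.",
  stream: false,
  // no `context` field -> each call should start fresh
};

async function generate() {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  const data = await res.json();
  // data.context exists in the response but is deliberately ignored
  return data.response;
}
```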

I don't understand why this would happen, because as far as I've seen, modern LLMs don't change their weights during inference.

Scenario:

  1. Make a query to Ollama about topic 1 with a very specific keyword that I have created.
  2. Make another query to Ollama about a topic similar to topic 1, but with a new keyword.

Turns out the first keyword shows up in the second response as well. Not always, but as far as I know this shouldn't happen at all.

Is there something I am missing?
I checked the ollama/history file, and it only contained prompts I had made from the terminal using `ollama run`.