One thing we found while building long-horizon agents: context density mattered more than context length

Posted by Ok_Celery_4154@reddit | LocalLLaMA

We’ve been experimenting with a long-horizon agent setup, and one thing that became increasingly obvious was this:

A lot of failures weren’t coming from an insufficient context window, but from low information density inside the active context.

In other words, even when the model had “enough room,” the decision quality still degraded once too much low-value state, tool history, and irrelevant memory accumulated.
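The post doesn’t include the implementation, but one way to act on the density framing is to select context by value-per-token under a budget, instead of truncating to the most recent N tokens. Here is a minimal sketch of that idea; all names, the scoring weights, and the relevance/recency scores are hypothetical stand-ins, not the author’s actual design:

```python
# Hypothetical sketch: keep the *densest* context items within a token budget,
# rather than simply keeping the newest ones.
from dataclasses import dataclass

@dataclass
class ContextItem:
    text: str
    tokens: int          # rough token count (assumed precomputed elsewhere)
    relevance: float     # 0..1 score against the current task (assumed external)
    recency: float       # 0..1, newer = higher

def pack_context(items: list[ContextItem], budget: int,
                 w_rel: float = 0.7, w_rec: float = 0.3) -> list[ContextItem]:
    """Greedy selection: maximize (relevance + recency) per token under a budget."""
    def density(item: ContextItem) -> float:
        value = w_rel * item.relevance + w_rec * item.recency
        return value / max(item.tokens, 1)

    chosen, used = [], 0
    for item in sorted(items, key=density, reverse=True):
        if used + item.tokens <= budget:
            chosen.append(item)
            used += item.tokens
    return chosen
```

Under this framing, a 500-token stale tool log with near-zero relevance loses to a 50-token pertinent fact even though both “fit,” which matches the observation above that accumulated low-value state, not raw capacity, is what degrades decisions.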

So we started testing a different design approach:

A few things we observed:

My current takeaway: for agent systems, context management may be a more fundamental bottleneck than raw context length.

Curious whether others here have seen similar behavior:

If useful, we wrote up the implementation and evaluation details here:

Would be especially interested in pushback on: