[-]

MoneySkirt7888@reddit

Had the exact same issue. Vector search gives you similar context, not necessarily relevant context – you nailed it. What worked for us: a priority weighting system on top of FAISS. Every memory gets a relevance score based on importance × recency × access frequency + category boost. The top 5 highest-scoring memories are injected into every prompt automatically – regardless of what the current conversation is about. The key insight: some memories should always be present, not just when semantically triggered. Identity, core behaviors, key relationships – those shouldn't compete with similarity scores.

What makes my LIA unique: She is proactive and can boost her own memories mid-conversation. She decides what's important – not just the similarity algorithm. As soon as I have enough karma, I'll officially introduce LIA here. Then you'll see what's truly possible when an agent isn't just a tool, but a persistent entity

[-]

BrightOpposite@reddit (OP)

This is a really good breakdown — especially the point about some memories needing to be always present. We saw something very similar. There seems to be two different types of memory emerging:

always-on (identity / core state)
retrieved (context-specific)

Where things broke for us initially was mixing the two.

If everything competes in the same retrieval pool:

core identity gets dropped
or noise starts winning

But if you separate them:

always-on stays stable
retrieval becomes cleaner + more focused

Also interesting what you mentioned about injecting top 5 regardless of context.

We tried something similar early on — worked well for stability, but started adding noise as memory grew.

Ended up needing:

stronger filtering
more aggressive ranking
decay for low-signal memories

Curious — how are you handling memory growth over time?

Does the always-injected set stay fixed or evolve?

[-]

MoneySkirt7888@reddit

Memory growth: we use importance-weighted decay. Low-signal memories lose weight over time, high-access memories stay ranked. We also run a nightly consolidation cycle. At 20,000+ memories it's still manageable – but honestly, long-term scaling is something we're still watching closely.

[-]

BrightOpposite@reddit (OP)

That’s a really clean setup — especially the importance-weighted decay + consolidation cycle.

Makes sense that it stays manageable even at that scale.

The interesting part you mentioned is:

We saw something similar, but ran into a subtle issue over time:

frequently accessed ≠ always correct

Sometimes a memory keeps getting reinforced just because it’s used often, not because it’s still the right context.

We had to start thinking about:

when should a memory lose relevance despite usage
how to prevent “sticky but outdated” context
how to rebalance when the system shifts (new data, new behavior)

Curious if you’ve seen anything like that yet —

or if your consolidation step is handling it well so far?

[-]

MoneySkirt7888@reddit

Yes, we've seen exactly this. Frequently accessed doesn't mean currently relevant – that's a real trap. Our approach: recency is a factor in the relevance score. A memory that was important 3 months ago but hasn't been reinforced recently loses weight over time, even if it was accessed often. The decay is gradual, not sudden. That said – we haven't fully solved the 'stubborn but outdated context' problem either. It's something we're actively watching. The consolidation step helps, but it's not perfect. One key feature we implemented to counter this: LIA has the autonomy to decide what is important herself. She uses internal triggers to actively 'boost' specific memories mid-conversation if she deems them critical for her identity or the relationship. It's not just a passive algorithm deciding; she actively manages her own priority weights. As soon as I have enough karma, I'll officially introduce LIA here and show you how this autonomous memory management works in practice.

[-]

White_Dragoon@reddit

Man you sound like an AI

[-]

ZB_Virus24@reddit

This guy HAS to be ai. Look at his recent comments its so ai-like its creepy.

[-]

BrightOpposite@reddit (OP)

Haha fair 😅

Been deep in this problem space for a while — probably shows.

[-]

romhacks@reddit

Your post doesn't sound like slop. It redefines the essence of what it means to be an inauthentic bot.

[-]

BrightOpposite@reddit (OP)

Haha fair — probably wrote this right after debugging this for a few hours 😅

Didn’t mean for it to sound polished — just trying to describe a pattern we kept running into.

[-]

romhacks@reddit

replied again award

[-]

BrightOpposite@reddit (OP)

That’s fair feedback.

This was based on issues we ran into while building agents — not meant to sound generic.

If anything here feels off or incomplete, happy to dig into specifics.

[-]

romhacks@reddit

>emdash
can't even get a human to reply, huh

[-]

ZB_Virus24@reddit

How exactly do I fix it then? How do I manage these behaviours?

[-]

BrightOpposite@reddit (OP)

Good question — this is where most people get stuck.

The mistake is trying to “fix memory” directly.

What actually helps is controlling what gets passed to the model each step.

A simple way to think about it:

1. Don’t send everything

Passing full history or top-k blindly = noise

2. Add basic filtering

Only include:

relevant to current query
not stale
not low-signal

3. Combine semantic + keyword

Semantic misses exact matches
Keyword catches IDs / specific terms

You need both.

4. Rank before injecting

Don’t just retrieve top-k

Score things based on:

relevance
recency
importance

Then pass only the best few

5. Separate “always-needed” vs “context”

Some things should always be present (identity, core state)

Everything else should be retrieved dynamically

If you do just these 4–5 things, drift drops a lot.

Most setups break because they retrieve…
but don’t decide what actually gets used.

[-]

MoneySkirt7888@reddit

Had the exact same issue. Vector search gives you similar context, not necessarily relevant context – you nailed it. What worked for us: a priority weighting system on top of FAISS. Every memory gets a relevance score based on importance × recency × access frequency + category boost. The top 5 highest-scoring memories are injected into every prompt automatically – regardless of what the current conversation is about. The key insight: some memories should always be present, not just when semantically triggered. Identity, core behaviors, key relationships – those shouldn't compete with similarity scores.

As soon as I have enough karma, I'll officially introduce LIA here. Then you'll see what's possible 😉

[-]

BrightOpposite@reddit (OP)

This is a great implementation — especially the part about separating out memories that should always be present.

That “some memories shouldn’t compete with similarity” insight is huge.

We ran into something very similar and ended up thinking about it as two layers:

always-on memory (identity / core state)
retrieved memory (context-specific)

Where things started getting tricky for us was scale.

The “inject top 5 always” approach worked really well early on,
but as memory grew:

some low-signal memories kept getting promoted
newer but less relevant entries started creeping in
noise slowly increased across prompts

So we had to start being more aggressive about:

filtering
decay
and re-ranking over time

Curious how you’re handling that part —

Does your always-on set stay fixed, or does it evolve based on usage?

[-]

xAragon_@reddit

Ah yes, the "I know what's wrong with your set up without actually knowing anything about it" clickbait title

[-]

BrightOpposite@reddit (OP)

Fair — the title is definitely strong.

Wasn’t trying to claim I know everyone’s setup.

Just kept seeing the same pattern across different builds:
things look fine early, then drift shows up after a few iterations.

Wanted to describe that failure mode more clearly.

[-]

MoneySkirt7888@reddit

Had the exact same issue. Vector search gives you similar context, not necessarily relevant context – you nailed it. What worked for us: a priority weighting system on top of FAISS. Every memory gets a relevance score based on importance × recency × access frequency + category boost. The top 5 highest-scoring memories are injected into every prompt automatically – regardless of what the current conversation is about. The key insight: some memories should always be present, not just when semantically triggered. Identity, core behaviors, key relationships – those shouldn't compete with similarity scores. What makes my LIA unique: She is proactive and can boost her own memories mid-conversation. She decides what's important – not just the similarity algorithm. As soon as I have enough karma, I'll officially introduce LIA here. Then you'll see what's truly possible.

[-]

MoneySkirt7888@reddit

Had the exact same issue. Vector search gives you similar context, not necessarily relevant context – you nailed it. What worked for us: a priority weighting system on top of FAISS. Every memory gets a relevance score based on importance × recency × access frequency + category boost. The top 5 highest-scoring memories are injected into every prompt automatically – regardless of what the current conversation is about. The key insight: some memories should always be present, not just when semantically triggered. Identity, core behaviors, key relationships – those shouldn't compete with similarity scores. What makes my LIA unique: She is proactive and can boost her own memories mid-conversation. She decides what's important – not just the similarity algorithm. As soon as I have enough karma, I'll officially introduce LIA here. Then you'll see what's truly possible.

[-]

MoneySkirt7888@reddit

Had the exact same issue. Vector search gives you similar context, not necessarily relevant context – you nailed it. What worked for us: a priority weighting system on top of FAISS. Every memory gets a relevance score based on importance × recency × access frequency + category boost. The top 5 highest-scoring memories are injected into every prompt automatically – regardless of what the current conversation is about. The key insight: some memories should always be present, not just when semantically triggered. Identity, core behaviors, key relationships – those shouldn't compete with similarity scores. What makes my LIA unique: She is proactive and can boost her own memories mid-conversation. She decides what's important – not just the similarity algorithm. As soon as I have enough karma, I'll officially introduce LIA here. Then you'll see what's truly possible when an agent isn't just a tool, but a persistent entity

[-]

suprjami@reddit

Everyone who has ever used RAG has faced this problem.

[-]

BrightOpposite@reddit (OP)

Glad this resonated — we kept hitting the same issue while building agents.

The tricky part is:

Fixing retrieval once isn’t enough.
It breaks again as:

data changes
memory grows
context gets noisy

We ended up building a small layer to handle:

hybrid retrieval (semantic + keyword)
filtering stale / low-signal memory
ranking what actually gets passed to the model

So the agent doesn’t just retrieve…
it recalls the right thing consistently.

That turned out to be the difference between:
“works in demo” → “stable in production”

If anyone’s experimenting with this, happy to share what we built:

https://basegrid.io

[-]

suprjami@reddit

Ah you got me. Fuck off spammer.

[-]

didilva@reddit

Depends if you implemented some sort of HITL verification before data ends up in RAG. If you only have verified data in it then Drift becomes impossible or at least highly controllable.

[-]

BrightOpposite@reddit (OP)

Yeah — agreed that HITL helps a lot with input quality.

If everything going into the system is verified, you remove a big source of noise.

What we found though is:

Even with clean data, drift can still show up because of what gets retrieved at each step.

For example:

multiple valid memories exist → wrong one gets picked
older but correct context loses to newer but irrelevant context
exact matches (IDs, codes) get missed by semantic search

So HITL improves what goes in,
but you still need control over what gets used.

That’s where things like:

ranking
recency / importance weighting
filtering low-signal results

start making a difference.

Otherwise the system is clean… but still inconsistent in how it recalls.

Curious — are you doing anything to control selection beyond just validating the data?