The "bus factor" problem has no early warning system – so I built one (AWS AIdeas 2025 Finalist)
Posted by scode-in@reddit | ExperiencedDevs | 2 comments
Hey r/ExperiencedDevs,
Every engineer has seen this: a key dev, ops person, or architect leaves, and suddenly no one knows how the payment gateway works, why that cron job runs at 3am, or who the vendor contact is.
The bus factor problem is real, but there's no continuous early warning system for it. You only find out after someone walks out.
I spent the last few months building RetainIQ — an observability layer for institutional knowledge. Think Datadog, but for your org's human knowledge layer.
How it works:
- Passively ingests signals: meeting transcripts, chat logs, workflow metadata (zero manual effort from employees)
- Assigns a Knowledge Fragility Score (0–100) per critical topic/process
- Builds a Knowledge Dependency Graph — nodes go red when there's no backup person for a topic
- Surfaces AI-generated interventions: e.g. "Schedule a knowledge transfer session for payment gateway integration" with a predicted score impact
- Privacy-first: PII redacted, originals deleted post-analysis
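To make the score and the "red node" rule concrete, here is a minimal sketch. It is not RetainIQ's actual scoring, which this post doesn't spell out. It assumes each topic already has a per-person count of observed signals (a hypothetical input), uses a plain Herfindahl-style concentration index as a stand-in for the Knowledge Fragility Score, and treats a topic as red when fewer than two people hold a meaningful share. The function names and the `BACKUP_SHARE` threshold are made up for illustration.

```python
# Illustrative only: a stand-in concentration score, not RetainIQ's real model.
# Input: signal counts per person for one topic (hypothetical data shape).

BACKUP_SHARE = 0.2  # assumed: a "backup person" holds at least 20% of the signals


def fragility_score(counts: dict[str, int]) -> int:
    """Return 0-100; higher means knowledge is concentrated in fewer people."""
    total = sum(counts.values())
    if total == 0:
        return 100  # nobody observed touching the topic: treat as maximally fragile
    # Sum of squared shares: 1.0 for a single owner, 1/n for n equal owners.
    concentration = sum((n / total) ** 2 for n in counts.values())
    return round(100 * concentration)


def is_red(counts: dict[str, int]) -> bool:
    """A topic node goes red when there is no backup person for it."""
    total = sum(counts.values())
    if total == 0:
        return True
    holders = [p for p, n in counts.items() if n / total >= BACKUP_SHARE]
    return len(holders) < 2


topics = {
    "payment gateway": {"dave": 18, "frank": 2},
    "reporting system": {"frank": 5},
    "deploy pipeline": {"ana": 5, "li": 5},
}

for name, counts in topics.items():
    status = "RED" if is_red(counts) else "ok"
    print(f"{name}: score={fragility_score(counts)} {status}")
```

With these made-up counts, the payment gateway scores 82 and goes red (the second person is below the backup threshold), the reporting system scores 100 and goes red, and the deploy pipeline scores 50 and stays green. A real system would presumably weight signals by recency and type rather than counting them equally.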
Stack: AWS Bedrock Nova Pro + Lambda + Cognito + S3 (fully serverless)
Just got selected as an AWS AIdeas 2025 Finalist.
Full write-up: https://builder.aws.com/content/3CV2aFroWhni2e6MGlj8kLSDbCY/aideas-finalist-retainiq
Curious: Have you ever dealt with a catastrophic knowledge-loss event when someone left? What did your team do about it?
okayifimust@reddit
The team can only scramble and fight fires. The real problem is a management failure that happens long before the critical person disappears.
You need redundancy, and you need to constantly pay for that and for the other unsexy things that do not create an immediate ROI: documentation, testing, process.
Everybody knows that Dave is the only one who understands the database, or that we'll be fucked if Frank leaves and anyone wants a change made to the reporting system after that, ever.
But nobody is willing to stop Dave and Frank from shipping new features until everything they ever did has been committed to paper.
roodammy44@reddit
I’ve found that everyone knows when the bus factor is dangerously low, but no one in management cares. If they cared, they would do something about it. Often the reason they don’t is budget or velocity. Everything has been run super lean for the last 10 years.