Brendan Gregg's 55-minute outage story has a missing piece — the sos command
Posted by jlrueda@reddit | sysadmin | 13 comments
Brendan Gregg wrote a post in 2024 that every sysadmin should read: "Linux Crisis Tools" — https://www.brendangregg.com/blog/2024-03-24/linux-crisis-tools.html
He walks through a fictional-but-real 55-minute outage where a team couldn't even install iostat because the firewall blocked apt, the filesystem was immutable, and nobody knew the package management policy. The site came back at 4:55pm — not because they found the root cause, but because they reverted a VM snapshot.
Then at 12:50am it happened again. Because nothing was actually fixed.
Brendan's takeaway: pre-install your crisis tools. He's absolutely right.
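A quick way to find out before the outage whether the tools are actually there is a small check like the one below. This is only a sketch: the tool list is my own illustrative pick (only iostat comes from the story above), and `missing_tools` is a hypothetical helper name, not something from Brendan's post.

```python
import shutil

# Illustrative list only; adjust it to whatever your team expects to have
# on hand. Only iostat is taken from the outage story itself.
CRISIS_TOOLS = ["iostat", "mpstat", "sar", "perf", "tcpdump", "strace", "ss"]


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(CRISIS_TOOLS)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All listed crisis tools are on PATH.")
```

Run it from cron or a config management check so you learn about gaps on a quiet day rather than in the middle of an incident.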
But there's a second problem his story illustrates that nobody talks about: the snapshot revert destroyed all forensic evidence. No logs. No command outputs. No idea what actually happened.
I think this is where the sos command belongs on that crisis tools list. Run it once during the incident, even on a crawling system, and it captures logs, configs, and the output of dozens of diagnostic commands into a single archive. As long as that archive is copied off the machine before the snapshot restore, your team still has everything they need to find the real root cause.
sos is open source, is packaged for the major Linux distros, and takes one command to run. Add it to your crisis toolkit.
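For anyone who wants to wire this into a runbook, here is a minimal sketch of the capture step. Assumptions on my part: a recent sos release where the command is `sos report` (older releases ship it as `sosreport`), root privileges, and a hypothetical mount point `/mnt/incident` that lives outside the VM's disk so the archive survives the revert. `build_sos_command` is just an illustrative helper name.

```python
import subprocess


def build_sos_command(tmp_dir, label=None):
    """Build the argument list for a non-interactive sos run.

    --batch skips the interactive prompts, --tmp-dir sets where the
    archive is written, and --label adds a tag to the archive name.
    """
    cmd = ["sos", "report", "--batch", "--tmp-dir", tmp_dir]
    if label:
        cmd += ["--label", label]
    return cmd


if __name__ == "__main__":
    # /mnt/incident is a hypothetical path; it should be storage that is
    # NOT rolled back together with the VM snapshot.
    command = build_sos_command("/mnt/incident", label="outage")
    print("Running:", " ".join(command))
    subprocess.run(command, check=True)  # needs root
```

The one design point that matters is the destination: if the archive stays on the disk that gets reverted, it disappears along with everything else.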
What do you think? Is there any other tool like this (preferably open-source)?
Creative-Package6213@reddit
Nah not gonna read this blog spam...
MrNiceBalls@reddit
Well, tbf Brendan Gregg is the authority on Linux performance tuning, but this is just an attempt to create a mind link between the OP's tool and Brendan.
Creative-Package6213@reddit
Even worse.
The_Penguin22@reddit
Obvious AI is obvious
Creative-Package6213@reddit
Facts!
MrNiceBalls@reddit
Is this where you plug in your sos-vault or are you gonna wait for a pointed question instead?
Fuzzmiester@reddit
Their post history is a little interesting that way...
NaturalIdiocy@reddit
awh... you scared them to hide it.
Fuzzmiester@reddit
It had been open before that. And it appeared to, primarily, be an account to plug sos-vault.
NaturalIdiocy@reddit
Yeah, doing a basic google of it found a subreddit with posts just from them.
stackjr@reddit
You want to try that again?
NaturalIdiocy@reddit
yes-but-no
Burgergold@reddit
Could have cloned before reverting too