What takes more time in your infra: fixing issues or finding them?
Posted by Admirable-Risk-7245@reddit | sysadmin
I often feel the real pain in server management is not always the remediation itself. It is the investigation that comes first: what is installed, what is outdated, what is misconfigured, which service is running where, which server is different from the others…
Do you spend more time finding problems or fixing them?
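The "which server is different from the others" part of that investigation can be roughed out in a few lines. This is only a sketch: the hostnames, package names, and versions are made up, and it assumes you already have a per-server inventory (from a package manager export or a CMDB) loaded as a dictionary. It treats the most common version across the fleet as the expected one and reports every host that deviates.

```python
from collections import Counter

# Hypothetical inventory: host -> {package: installed version}.
# In practice this would come from your own tooling; the names are invented.
inventory = {
    "web01": {"openssl": "3.0.13", "nginx": "1.24.0"},
    "web02": {"openssl": "3.0.13", "nginx": "1.24.0"},
    "web03": {"openssl": "3.0.2", "nginx": "1.24.0"},
    "web04": {"openssl": "3.0.13"},  # nginx missing on this host
}


def find_drift(inventory):
    """Return {host: {package: (found, expected)}} for hosts that
    differ from the fleet-wide majority version of a package."""
    packages = {pkg for pkgs in inventory.values() for pkg in pkgs}
    drift = {}
    for pkg in sorted(packages):
        # Count each version across the fleet; None means "not installed".
        versions = Counter(pkgs.get(pkg) for pkgs in inventory.values())
        expected, _ = versions.most_common(1)[0]
        for host, pkgs in inventory.items():
            found = pkgs.get(pkg)
            if found != expected:
                drift.setdefault(host, {})[pkg] = (found, expected)
    return drift


for host, diffs in sorted(find_drift(inventory).items()):
    for pkg, (found, expected) in diffs.items():
        print(f"{host}: {pkg} is {found}, fleet majority is {expected}")
```

Majority voting is just one possible baseline; comparing against a declared desired state would be stricter, but it requires that someone has written that state down.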
groundhogcow@reddit
I normally spend most of my time making people tell me what the issue is.
Them: Hey, something is wrong with the network.
Me: Oh, I just fixed something on the network.
Them: Oh, that wasn't it.
Me: OK, I fixed something else. Did that get it?
Them: No. Are you even doing anything?
Me: I am definitely doing things. I am just managing hundreds of systems with many terabytes of data and maybe 100 services of various types running, and the odds of me just stumbling onto the right thing at random are astronomically bad.
Them: Here, let me tell you what problem I am having.
Me: I would like that very much.
pennyvis@reddit
Fanfic
DurandalJoyeuse@reddit
Getting folks to actually submit a ticket
bitslammer@reddit
Depends on the issue and the fix. There are really 4 categories.
Easy to find, easy to fix.
Easy to find, hard to fix.
Hard to find, easy to fix.
Hard to find, hard to fix.
What tools and skills you have available will also make a huge difference. I can remember plenty of times where a Sniffer helped pinpoint a token ring issue in seconds. Without that it would have been a day-long marathon of walking around a chemical plant unplugging stuff.
yeti-rex@reddit
The sniffer denied you all those extra steps!
Hey sysA, what are you doing? Troubleshooting the token ring. Oh, it looks like you're getting steps in.
bitslammer@reddit
The sniffer allowed me to stay in the air conditioning which was always my primary concern.
BrainWaveCC@reddit
Fixing problems almost always takes longer than finding them.
KnownUniverse@reddit
Good documentation goes a long way to shortening this cycle. Use a decent dependency modeling system. Everyone should document their work every day. If you can't do that, you have a bigger IT culture problem.
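As a rough illustration of what even a tiny dependency model buys you, here is a sketch that answers "if this thing breaks, what else is affected?". The service names and the `depends_on` map are invented for the example; a real dependency modeling system would hold far more detail.

```python
from collections import defaultdict, deque

# Hypothetical dependency map: service -> things it depends on.
depends_on = {
    "webshop": ["api", "cdn"],
    "api": ["postgres", "redis"],
    "reporting": ["postgres"],
    "postgres": ["san01"],
    "redis": [],
    "cdn": [],
    "san01": [],
}


def impacted_by(failed, depends_on):
    """Return every service that directly or transitively depends on `failed`."""
    # Invert the map: dependency -> services that rely on it.
    dependents = defaultdict(set)
    for svc, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(svc)

    # Breadth-first walk up the dependency chain.
    seen = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for svc in dependents[current]:
            if svc not in seen:
                seen.add(svc)
                queue.append(svc)
    return seen


print(sorted(impacted_by("san01", depends_on)))
```

The point is less the code than the data: the lookup is trivial once the dependencies are documented, and impossible when they only live in someone's head.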
RansomStark78@reddit
Outage reports fsk
AniBMagal@reddit
Finding is always harder than fixing.
NoradIV@reddit
What takes more time is getting management to agree that this is an actual issue and that the solution is not more technology, it's more competent management.
North-Creative@reddit
Started recently at a company where the sysadmin with a knack for not documenting became chronically ill. Things are generally well set up, but man, even with the best people, you discover tons of issues. Gotta say, AI does help a little here, especially when it is the game of "where-the-bloody-F-did-Microsoft-move-this-resource".
If your situation sounds similar, here is a suggestion that works for me: set a timer for 60 minutes; when it rings, take a step back and think about what you just did, then document for at least 5 minutes.
Helps me tremendously, because when the flood of issues is just too large, even a week feels like a year, and one forgets things.
Mammoth_War_9320@reddit
Finding. The fix is always something stupid and easy. It’s finding the source that’s the hard part.
USarpe@reddit
Finding takes much more time. Once I have found the cause, there should be a solution.
yeti-rex@reddit
We've got users, they'll find the problems. 🙃
Realistically you should be moving from a reactive culture to a proactive culture.