99.99% purity replacement part

Posted by pie_-_-_-_-_-_-_-_@reddit | talesfromtechsupport

I'm an IT intern at a small clinic and radiology imaging center in the US.

Monday night, a pretty bad storm rolled through and caused power surges and outages all over town, including at our clinic. The day after, I came into work and my boss told me that our file server (which hosts radiology images, administrative documents, and many other things) had been down since the outage the night before, and asked if I could take a look at it.

I walk into the server room expecting to just have to manually start some service or another. I hop on the jumpbox and see "no boot device found". Oh no. This is on a Dell PowerEdge, so I spend 20 minutes trying to find a laptop with an RJ45 port so I can plug into the confusing iDRAC interface and see what's up. Eventually someone finds me a USB-to-RJ45 adapter. I start looking in the web interface and discover that it doesn't see the storage controller at all.

I open up the server to start re-seating things and looking for damage, and I happen to notice that one of the pins on the RAID controller card has pretty bad electrical pitting, probably storm-related damage. I re-seat a bunch of stuff and try powering it on anyway. No dice: it still doesn't even see the RAID card. Yep, that card is very likely the problem.

I'm about to go tell my boss that we need to set up another computer to be the jumpbox for now and that that'll take a few more hours. But then I consider that the RAID card looks remarkably intact except for a single pin, and if I could get it to work, the jumpbox would come back up.

Searching for materials to MacGyver this with, I find the perfect thing: some medical gold foil in a supply closet that we use for some other machine or medical procedure or something. I take out a bit of gold foil, press it into the pits on the RAID card, and, having no such luck finding an equally perfect way to secure it, hold it in place with electrical tape at the top.

I plug the RAID card back into the jumpbox, power it on, and it miraculously boots straight into Windows, just like it did yesterday. The RAID config and everything survived.

I think I used up my luck for the quarter.

(Of course, I told my boss that this was a very temporary fix, it might not survive a second boot, and that we really should have a PDU or something behind the servers anyway, and was promptly ignored.)