How to check if HDD is failing
Posted by sdns575@reddit | linuxadmin
Hi,
On my personal backup server (at home) I have an mdadm RAID5 with 3x3TB WD Red disks (I checked, they are CMR).
One disk got detached from the array. I tried to re-add it, but after a few days it got detached again. I get an error about the link speed dropping from 6.0 Gb/s to 3.0 Gb/s.
I checked the SMART logs and nothing is reported. I ran badblocks to check for bad blocks, but the disk came back clean.
Is there a way to check the port the disk is connected to? I tried changing the SATA cable and the SATA port, but I got the same message. At this point I don't know whether the problem is the motherboard's SATA controller or the disk itself.
I can attach the disk to another machine, but I don't know which tests to run to check for this problem.
Any help is appreciated.
Thank you in advance
alpha417@reddit
I'm sure you have backups, right?
... right?
freightcar@reddit
Sure sounds that way. Take a look at smartctl --all on that device; if it doesn't show any issues, then I think you're on the right track.
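For anyone who wants to go beyond eyeballing the smartctl output, here is a minimal Python sketch that pulls a few raw values out of the attribute table printed by `smartctl -A`. The choice of watched attributes and the "media" vs "link" labels are my own assumptions, not an official classification, and the helper names (`parse_smart_attributes`, `nonzero_watched`, `read_attributes`) are made up for illustration. Not every drive reports every attribute, so a missing name means nothing by itself.

```python
import subprocess

# Attributes worth a look, with a rough hint at what they usually relate to.
# This selection and the labels are assumptions for illustration only.
WATCHED = {
    "Reallocated_Sector_Ct": "media",
    "Current_Pending_Sector": "media",
    "Offline_Uncorrectable": "media",
    "UDMA_CRC_Error_Count": "link",
}


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from `smartctl -A` table output."""
    attrs = {}
    for line in text.splitlines():
        # Attribute rows have 10 columns and start with a numeric ID;
        # the raw value is the last column and may contain extra text.
        fields = line.split(None, 9)
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        raw = fields[9].split()[0]
        if raw.isdigit():
            attrs[fields[1]] = int(raw)
    return attrs


def nonzero_watched(attrs):
    """Return {name: (kind, raw)} for watched attributes with a nonzero raw value."""
    return {
        name: (WATCHED[name], value)
        for name, value in attrs.items()
        if name in WATCHED and value > 0
    }


def read_attributes(device):
    """Run `smartctl -A` on a device (usually needs root) and parse the table."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return parse_smart_attributes(result.stdout)
```

Usage would be something like `nonzero_watched(read_attributes("/dev/sdX"))`, with `/dev/sdX` as a placeholder for the suspect disk. A nonzero or growing UDMA_CRC_Error_Count is commonly read as a hint toward the cable, connector, or port rather than the platters, but treat that as a rule of thumb.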
sdns575@reddit (OP)
Hi and thank you for your answer.
smartctl does not report anything bad.
Are there other things that I can check?
freightcar@reddit
Seems to me you've already been pretty thorough, swapping cables and host port, even the host itself, testing it under load. The kernel messages you shared do point to a bus problem.
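To see how often the link is actually resetting or renegotiating, a rough Python sketch like the one below can tally libata messages per ATA port from saved kernel log text. The regular expressions are modeled on typical libata wording ("SATA link up", "limiting SATA link speed", "hard resetting link"); the exact text can differ between kernel versions, so treat the patterns and the `summarize_link_events` helper as assumptions to adjust against your own log.

```python
import re
from collections import Counter

# Patterns modeled on typical libata kernel messages.
# Wording may vary by kernel version, so adjust as needed.
PATTERNS = {
    "link_up": re.compile(r"(ata\d+): SATA link up ([\d.]+) Gbps"),
    "speed_limited": re.compile(r"(ata\d+): limiting SATA link speed to ([\d.]+) Gbps"),
    "hard_reset": re.compile(r"(ata\d+): hard resetting link"),
}


def summarize_link_events(log_text):
    """Count link events per ATA port, returned as {(port, event): count}."""
    counts = Counter()
    for line in log_text.splitlines():
        for event, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(match.group(1), event)] += 1
    return counts
```

The input can be the saved output of `dmesg` or `journalctl -k`. If the resets follow the disk when it moves to another port or another machine, that supports blaming the disk's own interface rather than the motherboard controller.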
At this point I would just replace the disk, since it's part of a RAID5, if I understood you correctly.
If you really really want to keep the disk for whatever reason, there are companies that will replace the logic board of a disk for you, which might resolve the issue, but just buying a replacement disk will be a lot cheaper and easier, it seems to me.
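Before pulling or replacing a member of the RAID5, it is worth confirming which array is degraded. A small Python sketch, assuming the usual `/proc/mdstat` layout where the status line ends in something like `[3/2] [UU_]` (devices expected / devices active, then one `U` or `_` per slot). The `degraded_arrays` and `read_mdstat` names are hypothetical helpers, not part of mdadm.

```python
import re

# Matches the "[expected/active] [UU_]" part of an md status line.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
# Matches the line that introduces an array, e.g. "md0 : active raid5 ...".
ARRAY_RE = re.compile(r"(md\S+)\s*:")


def degraded_arrays(mdstat_text):
    """Return {array: (expected, active, slot_map)} for arrays missing members."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        head = ARRAY_RE.match(line)
        if head:
            current = head.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected:
                result[current] = (expected, active, status.group(3))
    return result


def read_mdstat(path="/proc/mdstat"):
    """Read the md status file; the path is the standard Linux location."""
    with open(path) as handle:
        return handle.read()
```

Calling `degraded_arrays(read_mdstat())` returns an empty dict when every array has all its members. `mdadm --detail` on the array gives the authoritative view, so this is only a quick check.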
sdns575@reddit (OP)
Thank you for your advice