RAID Rebuild Time
Posted by Agreeable_Permit2030@reddit | sysadmin
Hey All!
Hoping someone with more storage experience can help me. I have a server that houses my company's VMS and access control system. It's currently at 44TB of video storage, and 16TB was added today for expansion into a new site next door. I followed the instructions in "How to Reconfigure a Virtual Disk With OpenManage Server Administrator (OMSA)" from Dell to add the drives to the array, but five hours later it's still showing 0% in OMSA. Anyone have a guess how long a RAID 5 array of this size will take to reconfigure? I've heard it could take a week. Is that true? I'm pretty good on the software side of sysadmin, but now that I'm at a company where I'm the only IT guy, the hardware side is new to me. Thanks in advance, and sorry if this is a stupid question lol
stufforstuff@reddit
60TB in RAID 5? You're either really brave or really stupid. RAID 5 left with Elvis last century, for good mathematical reasons: drive capacities outgrew what the parity scheme can reliably protect. Time to update to this decade.
Extension-Rip6452@reddit
Funny. NASes still sell well, and for multiple disks they're all RAID under the hood. Yes, RAID is old, but what do you think has replaced it? Storage Spaces? ZFS?
Sure, we have larger drives, but we also have correspondingly larger data sets, and the same need for at least somewhat fault-tolerant storage, which JBOD is not.
No_Wear295@reddit
This just sounds like a cluster-fsk waiting to happen. RAID 5 with spinners that big... why bother? Make sure you've got full backups and run RAID 0, because IIRC the chances of surviving a rebuild after a drive failure are exceedingly low.
Extension-Rip6452@reddit
Disagree entirely. All my RAID 5 rebuilds have been successful, although I have lost an array when more than one disk failed over the period it took for quotes, work orders, and stock to all be approved.
Backups of CCTV are very difficult or expensive. There's a huge volume of write data coming in continuously, and almost no reads.
RAID 0 and then back it up? Well, how often are you backing it up? Because as soon as one drive dies you've lost the array, and if it's the only storage location for the VMS, you've now lost your VMS until you replace the array.
However, several smaller RAID 5/6 arrays are much better than one giant one.
Extension-Rip6452@reddit
Not really possible to provide you an estimate because it depends on so many factors:
• HDD or SSD? If HDD: 5,400, 7,200, or 10,000 RPM?
• I assume the array isn't being taken offline for the expansion operation, so what is the live activity on the array? Live activity varies massively with number of cameras, recording style (24/7 or motion), resolution, level of motion, etc.
• What is the array rebuild priority set to?
However, for an array that size, rebalancing onto that many more drives (I'm going to assume HDDs because of the capacity and the CCTV use case, and I'm going to assume you're still recording all cameras to the array, so it's quite busy), yes, the reconfiguration is going to take weeks and weeks, and now you can't stop it.
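As a purely illustrative back-of-envelope (the throughput, priority share, and live-write penalty below are assumptions for the sketch, not values read from the PERC or OMSA), here's roughly why those factors turn a best case of hours into weeks:

```python
# Illustrative estimate of a RAID online capacity expansion / restripe.
# Every constant here is an assumption for the sketch, not a controller spec.

def estimate_expansion_days(raw_tb_to_restripe: float,
                            sequential_mb_s: float = 200.0,
                            background_share: float = 0.3,
                            live_write_penalty: float = 0.5) -> float:
    """Rough duration of an expansion/restripe, in days.

    raw_tb_to_restripe -- raw capacity the controller has to re-lay out
    sequential_mb_s    -- assumed idle, sequential throughput of the array
    background_share   -- fraction of bandwidth given to the background task
                          (what a "rebuild priority" setting controls)
    live_write_penalty -- extra slowdown from continuous camera writes turning
                          sequential I/O into mixed random I/O (assumed)
    """
    effective_mb_s = sequential_mb_s * background_share * live_write_penalty
    seconds = raw_tb_to_restripe * 1_000_000 / effective_mb_s  # 1 TB ~ 1e6 MB
    return seconds / 86_400


# Restriping ~64 TB of existing raw capacity while the VMS keeps recording:
print(f"{estimate_expansion_days(64):.0f} days")  # ~25 days with these assumptions
```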
Some things I've learned about my CCTV arrays:
• I need fault redundancy, so I usually use RAID 5
• I need massive cost effective storage, so I usually use large WD Purple drives
• As you expand the size of the array, you're at significantly greater risk of more than one drive dying in close succession (rough math sketched after this comment).
• RAID5 performance doesn't scale particularly well as you add additional drives, and I started seeing very high activity % on large arrays during large CCTV events.
• None of my clients want to pay to archive/backup their CCTV, so that means CCTV footage is inherently lower value and I explain that there may be instances when an array goes down and we lose footage. By having many smaller arrays, we lose less footage in a single bad failure (which has happened, on a larger array unfortunately).
• Array recovery or expansion operations stress every drive in the array. So when you have a ~4-year-old array that's been running 24/7 at high write speeds, a drive starts to fail, and you swap in a new one, you now have ~4-year-old drives thrashing for days to rebuild the array, which can hasten the next drive failure during exactly the period when the array isn't fault tolerant.
I used to create RAID 5 arrays of around 8 drives and then span iSCSI volumes over two arrays, but because of that experience and all of the above, I've switched to a maximum of 8 drives per iSCSI volume, and we lose less video. Better to create two RAID 5 arrays of 8 drives each and specify multiple storage locations in the VMS. It also means rebuild times are much more sane. I don't expand RAIDs; I create new arrays and add them as additional storage to the VMS. If the client wants to add a bunch of new cameras or significantly increase resolution, the existing system is usually more than 3 or 4 years old anyway, so it's time to add another NAS rather than trying to rebuild the existing RAID with bigger drives.
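To put rough numbers on the multi-drive-failure point above, here is the classic unrecoverable-read-error (URE) back-of-envelope. The 1-in-1e15 URE spec and the drive sizes are assumptions chosen for illustration; real drives often behave better, and this ignores outright second-drive failures, but it shows why rebuilding over fewer, smaller members is the safer bet:

```python
# Classic URE back-of-envelope: probability a degraded RAID 5 rebuild reads
# every remaining bit without hitting an unrecoverable read error.
# The URE rate and drive sizes are illustrative assumptions, not measurements.
import math

def p_rebuild_survives(surviving_drives: int, drive_tb: float,
                       ure_rate_per_bit: float = 1e-15) -> float:
    bits_read = surviving_drives * drive_tb * 1e12 * 8
    return math.exp(-bits_read * ure_rate_per_bit)

# A single large 5 x 16TB group vs one of several smaller 8 x 4TB groups:
print(f"{p_rebuild_survives(4, 16):.0%}")  # ~60% chance of a clean rebuild
print(f"{p_rebuild_survives(7, 4):.0%}")   # ~80% chance of a clean rebuild
```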
Budget_Tradition_225@reddit
iSCSI sucks big ones. Use Fibre Channel storage instead. iSCSI is slow and unpredictable!
Budget_Tradition_225@reddit
Oh and don’t buy Dell either lol.
Agreeable_Permit2030@reddit (OP)
It's a PERC H750 RAID controller with 7.2k SAS HDDs (12Gbps). Unfortunately, according to the internet there's no way to view or change the rebuild priority in Dell OMSA, unless you have any tricks. Thank you for being willing to share your knowledge; it really helps me going forward, because as you said it looks like I'm stuck in this rebuild process now lol
Budget_Tradition_225@reddit
Haven't read all of it (sorry). The differences come down to the RAID controller manufacturers. HPE, for example, allows you to use the disks during formatting procedures; Dell does not! Or at least that's the way it used to be. I'm an old IT guy who worked for an MSP for over 20 years and built almost all of our clients' servers.
OpacusVenatori@reddit
You don't mention which RAID card you have or which drives are currently in the system, though even with that info it isn't really possible to do more than guesstimate.
If you're going from a current 4x16TB in RAID 5 (48TB usable) to 5x16TB in RAID 5 (64TB usable), the card has to restripe 64TB of raw capacity across 80TB, and also move the appropriate data chunks between disks. It's going to take a while to process that amount of data. Even just rebuilding 16TB worth of data at an extremely idealistic sequential 275MB/sec would take around 17 hours.
But now you're not just rebuilding; you're recalculating first, and then moving. All the while you're still actively using the server in question, so blocks are continuously changing.
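For a quick sanity check on those idealized numbers (assuming pure sequential throughput with no live camera writes and no parity or controller overhead, which is far rosier than reality):

```python
# Idealized, sequential-only timing check; real restripes run far slower
# because of live VMS writes, parity math, and the controller's priority cap.
HOURS = 3600
rate_mb_s = 275            # idealistic sequential throughput
single_drive_tb = 16       # one new 16TB member
raw_tb_existing = 64       # 4 x 16TB of raw capacity to re-lay out

print(f"{single_drive_tb * 1e6 / rate_mb_s / HOURS:.1f} h")  # ~16.2 h to stream 16TB once
print(f"{raw_tb_existing * 1e6 / rate_mb_s / HOURS:.1f} h")  # ~64.6 h just to stream 64TB once
```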
At the low end, going by the numbers reported by people in r/synology and r/qnap running the same operation on drives of roughly the same performance, multi-day runtimes for array expansion are common.
But as u/Zealousideal_Fly8402 said, you had better be damn sure about your backups and pray you don't have any kind of power interruption or the like.
Agreeable_Permit2030@reddit (OP)
It's a PERC H750 with 7.2k SAS HDDs (12Gbps).
OpacusVenatori@reddit
Well, that's a bit of good news at least. The H750 has a dedicated hardware XOR engine. But it'll also depend on how the RAID was set up initially, in particular the RAID 5 stripe size. 4-5 days would be an optimistic estimate.
You might be able to see status and configuration through iDRAC, depending on which version you have.
ZestycloseAd2895@reddit
Curious. Someone do the math? How long for the rebuild?
Agreeable_Permit2030@reddit (OP)
A couple of colleagues I reached out to said potentially 4-5 days.
Zealousideal_Fly8402@reddit
Yes. Because it has to recalculate parity information for the entire RAID virtual disk, and then move the appropriate blocks onto each disk.
It will also depend on which RAID card is in the system.
You better be sure of your backups and pray you don’t suffer any kind of interruption like a power outage.
RAID-5 for such a large dataset really isn’t a good idea.
Agreeable_Permit2030@reddit (OP)
It's a PERC H750 with 7.2k SAS HDDs (12Gbps). Unfortunately, it was set up that way by the security company when it was put in last year :(