RAID5 mdadm array disappearing at reboot
Posted by _InvisibleRasta_@reddit | linuxadmin | View on Reddit | 15 comments
I have 3x 2TB disks that I made into a software RAID on my home server with Webmin. After I created it, I moved around 2TB of data onto it overnight. As soon as it was done rsyncing all the files, I rebooted, and both the RAID array and all the files are gone. /dev/md0 is no longer available. Also, the fstab mount entry I configured with a UUID complains that it can't find that UUID. What is wrong?
I did add md_mod to /etc/modules and also made sure to modprobe md_mod, but it doesn't seem to do anything. I am running Ubuntu Server.
I also ran update-initramfs -u.
#lsmod | grep md
crypto_simd 16384 1 aesni_intel
cryptd 24576 2 crypto_simd,ghash_clmulni_intel
#cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices:
#lsblk
sdb 8:16 0 1.8T 0 disk
sdc 8:32 0 1.8T 0 disk
sdd 8:48 0 1.8T 0 disk
mdadm --detail --scan does not output any array at all.
It just seems like everything is just gone?
#mdadm --examine /dev/sdc /dev/sdb /dev/sdd
/dev/sdc:
MBR Magic : aa55
Partition[0] : 3907029167 sectors at 1 (type ee)
/dev/sdb:
MBR Magic : aa55
Partition[0] : 3907029167 sectors at 1 (type ee)
/dev/sdd:
MBR Magic : aa55
Partition[0] : 3907029167 sectors at 1 (type ee)
mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm: cannot open device /dev/sdb1: No such file or directory
mdadm: /dev/sdb1 has no superblock - assembly aborted
It seems that the partitions on the 3 disks are just gone?
I created an ext4 partition on md0 before moving the data
#fdisk -l
Disk /dev/sdc: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: WDC WD20EARS-00M
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 2E45EAA1-2508-4112-BD21-B4550104ECDC
Disk /dev/sdd: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: WDC WD20EZRZ-00Z
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D0F51119-91F2-4D80-9796-DE48E49B4836
Disk /dev/sdb: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: WDC WD20EZRZ-00Z
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 0D48F210-6167-477C-8AE8-D66A02F1AA87
Maybe I should recreate the array?
sudo mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd --uuid=a10098f5:18c26b31:81853c01:f83520ff --assume-clean
I recreated the array and it mounts and all the files are there. The problem is that when I reboot it is once again gone.
piorekf@reddit
mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdd1
Shouldn't it be full disks, not partitions? So not sdb1, but just sdb and so on?
_InvisibleRasta_@reddit (OP)
From what I read, there is no need to create partitions on each disk individually before creating the array, so I created the array first and then partitioned md0.
I did run the mdadm.conf creation command, and this is my mdadm.conf:
ARRAY /dev/md0 metadata=1.2 UUID=a10098f5:18c26b31:81853c01:f83520ff
The problem is that at reboot md0 is not present and I have to run the create command once again.
piorekf@reddit
Yes, I get that. But in your listing you provided the output (with errors) from the command
mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdd1
My question is whether you should run this command not with partitions, but with whole disks. So it should look like this:
mdadm --assemble /dev/md0 /dev/sdb /dev/sdc /dev/sdd
_InvisibleRasta_@reddit (OP)
Yeah, that was a typo, sorry. The output is pretty much the same. It can't assemble.
After reboot:
# mdadm --assemble /dev/md0 /dev/sdb /dev/sdc /dev/sdd
mdadm: Cannot assemble mbr metadata on /dev/sdb
mdadm: /dev/sdb has no superblock - assembly aborted
_InvisibleRasta_@reddit (OP)
It looks like at every reboot, no matter what, I have to run this command or the array won't be available:
sudo mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd --uuid=a10098f5:18c26b31:81853c01:f83520ff --assume-clean
michaelpaoli@reddit
So ... what exactly have you got, and what exactly are you trying to do? You seem to be saying you're doing md raid5 on 3 drives, direct on the drives themselves, and are then partitioning that md device (which is a bit odd, but, whatever). However, you also show data which seems to suggest you have the drives themselves partitioned - you can't really do both, as those may well be stepping on each other's data, and that likely won't work and/or may corrupt data. Also, if you take your md device, say it's /dev/md0 (or md0 for short), and you partition it, the partitions would be md0p1, md0p2, etc. - pretty non-standard and atypical names. Is that what you actually did? Or what did you do? If you did partitioning, e.g. MBR or GPT, direct on the drive after creating md raid5 direct on the drives, you likely clobbered at least some of your md device data.
So, which exactly is it and what are you trying to do?
Also, if you partition md device, you likely have to rescan the md device after it's started to be able to see/use the partitions, e.g. partx -a /dev/md0
But if you've got partitions on the drives, and are doing it that way, then you'd put your md devices on the drives' partitions - that would be the more typical way. You can do it direct on the drives, but partitioning the md device would be quite atypical. Typically one would put a filesystem or swap or an LVM PV or LUKS on the md device, or use btrfs or zfs directly on it, but generally wouldn't put a partition table on it.
So, how exactly do you have your storage stack on those drives, from drive itself on up to filesystem or whatever you're doing for data on it? What are all the layers and what's the order you have them stacked?
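A quick way to lay out that stack is to let the standard util-linux tools report it (a minimal sketch, assuming nothing beyond what ships with Ubuntu Server):

lsblk -o NAME,SIZE,TYPE,FSTYPE,UUID
sudo wipefs /dev/sdb /dev/sdc /dev/sdd

lsblk shows every layer (disk, partition, md array, filesystem) nested under its parent, and wipefs with no options only lists the signatures it finds (GPT, mdraid member, ext4) without writing anything.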
That shows each drive MBR partitioned, with a single partition of type ee (the GPT protective entry), so you have GPT-partitioned drives, not md direct on the drives.
So, if you put md on partitions, you should look for it there:
# mdadm -E /dev/sd[bcd][1-9]*
So, which is it? What devices did you create md0 on, and what device did you create the ext4 filesystem on?
So, you've got an empty GPT partition table on each.
Yeah, you can't have an md device direct on the drives and also have a partition table direct on the same device (e.g. /dev/sdb). You get one or the other, not both on the same device.
Not like that - that may well corrupt your data on the target - but you may have already messed that up anyway.
It might appear to work, but there's no guarantee you haven't corrupted your data - and that may not be readily apparent. Without knowing exactly what steps were used to create the filesystem, and the layers beneath it, and other things you may have done with those drives, there's no easy way to know whether or not you've corrupted your data.
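One non-destructive sanity check that can be run before trusting the data again (a sketch, assuming the array is currently assembled and the ext4 filesystem sits directly on /dev/md0):

sudo umount /dev/md0        # the filesystem must not be mounted read-write during the check
sudo e2fsck -f -n /dev/md0  # -f forces a full check, -n opens read-only and answers "no" to every prompt

A clean pass only proves the filesystem metadata is consistent; spot-checking files against the rsync source (for example with rsync -c, which compares checksums) is the stronger test of the data itself.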
Also, what have you got in your mdadm.conf(5) file? That may provide information on how you created the md device, and on what ... but if you've been recreating it, that may have clobbered the earlier info. What's the mtime on the file, and does it correlate to when you first made the array, or when you subsequently recreated it?
Webmin, huh? Well, also check the logs from around the time you first created the md device; they may show exactly how it was created and on what devices.
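If the syslog/journal still reaches back that far, something like this may turn up the original create command (an illustrative sketch; log paths and retention depend on the particular Ubuntu setup, and Webmin may keep its own action log as well):

sudo grep -i mdadm /var/log/syslog /var/log/syslog.1 2>/dev/null
sudo journalctl --no-pager | grep -iE 'mdadm|md0|md/raid'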
hokesujuku@reddit
Not using a partition table is just a bad idea, regardless of what other people claim.
Everybody who starts out learning Linux learns that everything is a block device, so they get the idea that a partition table isn't necessary.
Yes, it's possible, but it's still not a good idea.
ralfD-@reddit
This. The array was created (according to OP) with disks, not partitions. Your version should work (and OP should step back and read up on the basics before diving into array creation).
hokesujuku@reddit
It's standard to have a partition table on each drive; putting mdadm there instead is not standard.
So what probably happened is this: there was a GPT on the disk before. GPT uses the first ~34 sectors of the disk, and it also puts a backup copy at the end of the disk.
Your system reboots. Your firmware/BIOS sees the "corrupted" primary GPT and the intact GPT backup, and restores the primary.
At that point your mdadm headers are wiped out.
So now you have two options: 1) use wipefs to remove the GPT, both primary and backup, or 2) go with the flow and put mdadm on partitions instead, since that's standard and much safer.
Because sooner or later something will wipe it again, and then your data is gone.
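For option 1, wipefs can first show and then remove the competing signatures (a sketch; the erase step deletes the partition table, so double-check the device letters first):

sudo wipefs /dev/sdb           # list-only: shows the primary GPT, backup GPT and protective-MBR signatures
sudo wipefs --all /dev/sdb     # erase them all; repeat for /dev/sdc and /dev/sdd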
_InvisibleRasta_@reddit (OP)
So create a partition on each disk and then create the array?
So should I run wipefs -a -f /dev/sdx and then create the partition?
hokesujuku@reddit
Yes, but your existing data would be lost - unless you make the partition start at, say, a 1M offset and tell mdadm --create to use a data offset that is 1M smaller (check the current offset with mdadm --examine), so the data still ends up where it is now.
There might also be a few sectors missing at the end (previously part of the md device, now used by the GPT backup of the bare drive). If it's ext4, shrink the filesystem by a little, just in case.
If you don't mind rsyncing your data again, it might be less complicated to just start over from scratch entirely.
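A minimal sketch of that start-from-scratch route, assuming the standard partition-then-mdadm layout; the device letters, partition name, mount point and ext4 are illustrative, and every step below destroys whatever is currently on the disks:

for d in /dev/sdb /dev/sdc /dev/sdd; do
    sudo wipefs --all "$d"                                   # clear old GPT and md signatures
    sudo parted -s "$d" mklabel gpt mkpart raid 1MiB 100%    # new GPT with one partition
    sudo parted -s "$d" set 1 raid on                        # flag it as a Linux RAID partition
done
sudo mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1
sudo mkfs.ext4 /dev/md0
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
sudo update-initramfs -u
echo "UUID=$(sudo blkid -s UUID -o value /dev/md0) /mnt/raid ext4 defaults 0 2" | sudo tee -a /etc/fstab

Here /mnt/raid is just a placeholder mount point; the ARRAY line appended to mdadm.conf plus the rebuilt initramfs are what let md0 assemble on its own at boot.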
_InvisibleRasta_@reddit (OP)
Yes, I will start from scratch as I have backups.
Could you help me out with the process?
How should I prepare the 3 drives and how should I recreate the RAID array?
Thank you
_SimpleMann_@reddit
You can't scan for an array that isn't online.
You can safely rebuild the array using the same chunk size (stay on the default if you didn't specify any) and your data will still be 100% there.
After you recreate the array, do a:
sudo mdadm --detail --scan >> /etc/mdadm/mdadm.conf
And it should be persistent.
Also, here's a tip: use UUIDs instead of /dev/sdX; UUIDs never change. (Back up everything first, and when re-creating the array do it exactly how you created it the first time.)
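On Ubuntu it's also common to rebuild the initramfs after editing mdadm.conf and to mount by filesystem UUID in fstab. A hedged sketch (the mount point is a placeholder):

sudo update-initramfs -u          # pick up the updated mdadm.conf at boot
sudo blkid /dev/md0               # shows the filesystem UUID to use in fstab
# /etc/fstab entry by UUID, for example:
# UUID=<uuid-from-blkid>  /mnt/raid  ext4  defaults  0  2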
hrudyusa@reddit
Seconded: NEVER use device letters, as Linux re-enumerates the drives each time you boot. Use UUIDs or, if you must, disk labels.