Need help with mdadm

w1z4rd

So this is the story.

A client brought in a CentOS Linux server with 5 hard disks. 4 of the disks are in a Linux software RAID and 1 disk holds the operating system. I am unable to boot the server and need a rescue CD to see anything.

Using the following command I am able to recreate the raid array:

Code:
mdadm --create /dev/md0 -n 4 -c 256 -l 5 -p left-symmetric --assume-clean /dev/sda1 /dev/sdc1 /dev/sdd1 /dev/sde1

Normally at this stage I could just mount the raid array with the command:
Code:
mount /dev/md0 /mntpoint

However, when I try this I get the following error:

Code:
The device '/dev/md0' doesn't seem to have a valid NTFS. Maybe the wrong device is used? Or the whole disk instead of a partition (e.g. /dev/sda, not /dev/sda1)? Or the other way around?

None of the drives are NTFS drives; they all show as Linux raid autodetect when I run fdisk -l.

Any suggestions? :(
 
A client brought in a CentOS Linux server with 5 hard disks. 4 of the disks are in a Linux software RAID and 1 disk holds the operating system. I am unable to boot the server and need a rescue CD to see anything.

Which version of CentOS?

Can you see the root filesystem? If so, check /etc/fstab

Maybe the md is a PV? pvs and/or pvdisplay will tell you
 
A better way to re-assemble mdadm arrays (without writing data to them) is:

mdadm -Es >> /etc/mdadm/mdadm.conf

Check what is in that file and then use, for example, mdadm --assemble /dev/md0.

With mdadm --create you write data to the drives, and you may well have created the array with the wrong parameters...

However, you also have to check whether LVM was used to create volumes on top of the array.

After assembling the array, run:
vgdisplay
lvdisplay

To mount an LVM volume:
vgscan
vgchange -a y
mount /dev/$volumegroup/$logicalvolume /mnt/somefolderyoucreated

- Replace $volumegroup with the name from vgdisplay and $logicalvolume with the name from lvdisplay.
- There can be more than one logical volume you need to mount.

You need the vgscan and vgchange commands if you boot off a rescue CD or live CD where the OS is unaware of the LVM setup.
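
The assemble-then-check-LVM sequence above could be sketched roughly as follows. This is a dry-run sketch, not a tested recovery procedure: run and assemble_and_scan are made-up helper names, the device list is taken from the fdisk output in this thread, and --readonly is an extra precaution added here that the post itself does not mention.

```shell
#!/bin/sh
# Dry-run by default: each command is printed instead of executed.
# Set RUN=1 to really run them (needs root plus the mdadm and LVM tools).
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

# $1 = md device, remaining arguments = member partitions (assumed names).
assemble_and_scan() {
    md=$1; shift
    run mdadm --examine --scan                  # arrays described by the on-disk superblocks
    run mdadm --assemble --readonly "$md" "$@"  # assemble from existing superblocks, read-only
    run cat /proc/mdstat                        # check what the kernel actually started
    run vgscan                                  # look for LVM volume groups
    run vgchange -a y                           # activate any volume groups that were found
}

assemble_and_scan /dev/md0 /dev/sda1 /dev/sdc1 /dev/sdd1 /dev/sde1
```

Run as-is it only prints the commands, so the order can be reviewed before anything touches the disks.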
 
Which version of CentOS?

5.2 I think

Can you see the root filesystem? If so, check /etc/fstab

The OS file system is on another drive, the 5th drive. It is separate from the RAID array. This is a cat of my fstab:

Code:
root@sysresccd /home/etc % cat fstab
LABEL=/1                /                       ext3    defaults        1 1
/dev/md0                /var/flexshare/shares   ext3    defaults        1 2
LABEL=/boot1            /boot                   ext3    defaults        1 2
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
LABEL=SWAP-sdb2         swap                    swap    defaults        0 0
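
Since fstab is readable from the rescue CD, the filesystem type the server expected on /dev/md0 can be pulled straight out of it. A minimal sketch: fstab_fstype is a hypothetical helper, and the path /home/etc/fstab is only assumed from the prompt shown above.

```shell
#!/bin/sh
# Print the filesystem type (third fstab column) for one device spec.
# $1 = device or LABEL= spec exactly as written in fstab, $2 = path to the fstab file.
fstab_fstype() {
    awk -v spec="$1" '$1 == spec { print $3; exit }' "$2"
}

# Usage (path assumed): fstab_fstype /dev/md0 /home/etc/fstab
```

For the fstab above this prints ext3, which suggests /dev/md0 carried an ext3 filesystem directly rather than an LVM physical volume. If so, something like mount -t ext3 -o ro would at least produce an ext3-specific error instead of the NTFS one.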

Maybe the md is a PV? pvs and/or pvdisplay will tell you
I'm not sure what this means. When I type pvs or pvdisplay nothing shows.

Also note: I am booted off a rescue CD in the meantime.

Other information. This is my fdisk -l

Code:
root@sysresccd /home/etc % fdisk -l

Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00033916

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *          63  3907024064  1953512001   fd  Linux raid autodetect

Disk /dev/sdb: 250.1 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders, total 488397168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00056c82

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *          63      208844      104391   83  Linux
/dev/sdb2          208845     4401809     2096482+  82  Linux swap / Solaris
/dev/sdb3         4401810   488392064   241995127+  83  Linux

Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00076244

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1   *          63  3907024064  1953512001   fd  Linux raid autodetect

Disk /dev/sde: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000b5173

   Device Boot      Start         End      Blocks   Id  System
/dev/sde1   *          63  3907024064  1953512001   fd  Linux raid autodetect

Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000955c5

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1   *          63  3907024064  1953512001   fd  Linux raid autodetect
 
The Ubuntu server CD in rescue mode is useful (even on non Ubuntu installs).
It automatically detects software RAID setups and LVM volumes when you boot so you don't need to fiddle with mdadm or the LVM tools by hand.
Just boot to a rescue shell and mount the file systems.

To reassemble a RAID array I always use the --assemble option.
Doesn't --create try to create a new RAID array? :wtf:
 
When I try to create the RAID from the rescue CD with the following command... everything looks alright:

Code:
root@sysresccd /home/etc % mdadm --create /dev/md0 -n 4 -c 256 -l 5 -p left-symmetric --assume-clean /dev/sda1 /dev/sdc1 /dev/sdd1 /dev/sde1
mdadm: /dev/sda1 appears to contain an ext2fs file system
    size=1565568512K  mtime=Wed Jul 18 05:51:09 2012
mdadm: /dev/sda1 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Tue Aug  7 10:58:27 2012
mdadm: /dev/sdc1 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Tue Aug  7 10:59:17 2012
mdadm: /dev/sdd1 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Tue Aug  7 10:59:17 2012
mdadm: /dev/sde1 appears to contain an ext2fs file system
    size=491826688K  mtime=Wed Jul 18 05:51:09 2012
mdadm: /dev/sde1 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Tue Aug  7 10:59:17 2012
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
root@sysresccd /home/etc %

However, when I mount it... this is the part that is pwning me

Code:
root@sysresccd /etc % mount /dev/md0 /mnt/raid
NTFS signature is missing.
Failed to mount '/dev/md0': Invalid argument
The device '/dev/md0' doesn't seem to have a valid NTFS.
Maybe the wrong device is used? Or the whole disk instead of a
partition (e.g. /dev/sda, not /dev/sda1)? Or the other way around?
root@sysresccd /etc %
 
Uhm, you created the array as a raid5 array, however there are 4 drives, are you sure it was a raid5? Raid10 sounds more plausible with 4x drives.
 
According to this it is raid 5

Code:
root@sysresccd /etc % mdadm -E /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 528ac1ec:baedc77e:47eb854c:97fa539e
           Name : sysresccd:0  (local to host sysresccd)
  Creation Time : Tue Aug  7 11:15:03 2012
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3907021954 (1863.01 GiB 2000.40 GB)
     Array Size : 11721063936 (5589.04 GiB 6001.18 GB)
  Used Dev Size : 3907021312 (1863.01 GiB 2000.39 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : a67ec651:635b43aa:bdb65c46:c8dd309e

    Update Time : Tue Aug  7 11:15:03 2012
       Checksum : 2834163f - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 256K

   Device Role : Active device 1
   Array State : AAAA ('A' == active, '.' == missing)
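
The numbers in this output can be cross-checked against the fdisk listing with plain shell arithmetic. This is a sketch using figures copied from the thread; part_sectors and raid5_sectors are made-up helper names. Note that it only describes the superblock written by the latest --create, not necessarily the array the client originally had.

```shell
#!/bin/sh
# Figures copied from the fdisk -l and mdadm -E output in this thread.
part_start=63
part_end=3907024064
data_offset=2048        # sectors, as reported by the new 1.2 superblock
used_dev=3907021312     # sectors of each member used by the array

# Sectors in a partition, counting both the start and end sector.
part_sectors() { echo $(( $2 - $1 + 1 )); }

# RAID5 spends one member on parity: usable = (members - 1) * per-member size.
raid5_sectors() { echo $(( ($1 - 1) * $2 )); }

total=$(part_sectors "$part_start" "$part_end")
echo "partition sectors:      $total"
echo "minus data offset:      $(( total - data_offset ))"
echo "4-member RAID5 sectors: $(raid5_sectors 4 "$used_dev")"
echo "3-member RAID5 sectors: $(raid5_sectors 3 "$used_dev")"
```

The second figure comes out as 3907021954, matching Avail Dev Size, and the 4-member figure as 11721063936, matching Array Size. A 3-member array of the same disks would be a third smaller, so the devices=3 seen in the earlier warnings describes a differently shaped array.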
 
Code:
root@sysresccd /etc % vgscan
  Reading all physical volumes.  This may take a while...
  No volume groups found
root@sysresccd /etc %
 
Looks like either the RAID5 was a 3-disk array with 1 hot spare, or the array went degraded and forgot 1 of the disks after the failure.

Either would explain the problems mounting the array with that NTFS error (even though you may have all the disks, the order may be off or your RAID parameters may be incorrect when you try to create the array), or you may actually have been using it as LVM.
 
If anyone thinks they can fix this, I'm willing to pay :D I can create shell access.
 
Can you do mdadm -E /dev/sdX1 for all the drives? It looks like the RAID order is wrong too.
 
Code:
root@sysresccd /home/etc % mdadm -E /dev/sda1
/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 528ac1ec:baedc77e:47eb854c:97fa539e
           Name : sysresccd:0  (local to host sysresccd)
  Creation Time : Tue Aug  7 11:15:03 2012
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3907021954 (1863.01 GiB 2000.40 GB)
     Array Size : 11721063936 (5589.04 GiB 6001.18 GB)
  Used Dev Size : 3907021312 (1863.01 GiB 2000.39 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : d64adeed:f6341697:ece61d83:b199f55f

    Update Time : Tue Aug  7 11:15:03 2012
       Checksum : afa4a819 - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 256K

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing)
root@sysresccd /home/etc %


root@sysresccd /home/etc % mdadm -E /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 528ac1ec:baedc77e:47eb854c:97fa539e
           Name : sysresccd:0  (local to host sysresccd)
  Creation Time : Tue Aug  7 11:15:03 2012
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3907021954 (1863.01 GiB 2000.40 GB)
     Array Size : 11721063936 (5589.04 GiB 6001.18 GB)
  Used Dev Size : 3907021312 (1863.01 GiB 2000.39 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : a67ec651:635b43aa:bdb65c46:c8dd309e

    Update Time : Tue Aug  7 11:15:03 2012
       Checksum : 2834163f - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 256K

   Device Role : Active device 1
   Array State : AAAA ('A' == active, '.' == missing)
root@sysresccd /home/etc %

root@sysresccd /home/etc % mdadm -E /dev/sdd1
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 528ac1ec:baedc77e:47eb854c:97fa539e
           Name : sysresccd:0  (local to host sysresccd)
  Creation Time : Tue Aug  7 11:15:03 2012
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3907021954 (1863.01 GiB 2000.40 GB)
     Array Size : 11721063936 (5589.04 GiB 6001.18 GB)
  Used Dev Size : 3907021312 (1863.01 GiB 2000.39 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : f3370d67:d83fdcef:97e34af8:157f0798

    Update Time : Tue Aug  7 11:15:03 2012
       Checksum : 2ed8822a - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 256K

   Device Role : Active device 2
   Array State : AAAA ('A' == active, '.' == missing)


root@sysresccd /home/etc % mdadm -E /dev/sde1
/dev/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 528ac1ec:baedc77e:47eb854c:97fa539e
           Name : sysresccd:0  (local to host sysresccd)
  Creation Time : Tue Aug  7 11:15:03 2012
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 3907021954 (1863.01 GiB 2000.40 GB)
     Array Size : 11721063936 (5589.04 GiB 6001.18 GB)
  Used Dev Size : 3907021312 (1863.01 GiB 2000.39 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 7854e1ed:20c0c20f:7752997a:e4dd84f5

    Update Time : Tue Aug  7 11:15:03 2012
       Checksum : b55eeca6 - correct
         Events : 0

         Layout : left-symmetric
     Chunk Size : 256K

   Device Role : Active device 3
   Array State : AAAA ('A' == active, '.' == missing)
root@sysresccd /home/etc %

That's all of them.
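
Comparing four of these dumps by eye is error-prone, so the fields that matter for ordering can be boiled down to one line per member. A rough sketch: summarize_examine is a hypothetical helper, and the field names it looks for are the ones in the 1.2-metadata output above (older metadata versions are laid out differently).

```shell
#!/bin/sh
# Reads `mdadm -E` text on stdin and prints one summary line per member.
# Field names match the 1.2-metadata output shown in this thread; other
# metadata versions label things differently, so treat this as a sketch.
summarize_examine() {
    awk '
        /^\/dev\//     { dev = $1; sub(/:$/, "", dev) }   # "/dev/sda1:" header line
        /Device Role/  { role = $NF }                     # slot number in the array
        /Events/       { events = $NF }                   # update counter
        /Array State/  { print dev, "role=" role, "events=" events }
    '
}

# Usage (hypothetical): for d in /dev/sd[acde]1; do mdadm -E "$d"; done | summarize_examine
```

Each member comes out as a single line such as "/dev/sdc1 role=1 events=0", which makes a wrong slot order or a stale event count easier to spot.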
 
Using --create instead of --assemble was likely a mistake, and might've hosed your data.

Notice this bit:

Code:
level=raid5 devices=3

Makes me agree with one of the other posters that it might've been a RAID5 with a hot spare.

You wouldn't happen to have mdadm -E output from before the mdadm --create?

The ext2fs on sda1 and sde1 in addition to the raid component is puzzling though...
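
One read-only way to dig into that ext2fs hint is to look for the ext2/3 magic number directly. The sketch below assumes the standard ext layout (superblock 1024 bytes into the filesystem, 2-byte magic 0xEF53 at offset 56 within it); probe_ext is a hypothetical helper, not an mdadm or e2fsprogs tool.

```shell
#!/bin/sh
# Print "ext" if the ext2/3/4 magic (0xEF53, stored little-endian as bytes 53 ef)
# is found 1080 bytes past the given sector offset, otherwise "none".
# $1 = device or image file, $2 = offset in 512-byte sectors (default 0).
# Only reads from the device; nothing is written.
probe_ext() {
    bytes=$(dd if="$1" bs=1 skip=$(( ${2:-0} * 512 + 1080 )) count=2 2>/dev/null |
            od -An -tx1 | tr -d ' \n')
    if [ "$bytes" = "53ef" ]; then echo ext; else echo none; fi
}

# Usage (device names from this thread):
#   probe_ext /dev/sda1 0     # start of the first member partition
#   probe_ext /dev/md0 0      # start of the recreated array
```

If /dev/sda1 reports ext at offset 0 while /dev/md0 reports none, that would be consistent with the original array's data starting at the very beginning of each member, whereas the recreated 1.2-metadata array reports a Data Offset of 2048 sectors. That is only one possible explanation and is not confirmed anywhere in this thread.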
 
I've had this problem before, actually earlier this week, where I used that create command to recreate the RAID array, mount it and recover the data. The mdadm -E output is pretty much the same.

I was following the instructions as per here: http://wiki.centos.org/fr/TipsAndTricks/Repair_RAID5_Volumes
 
What's worrying is that level=raid5 devices=3, on an array that has now been created with 4 devices. This is also why I hate RAID5 passionately.

You may have hosed this array somewhat by doing a --create. You should use the --assemble flag first to see if you can connect to the existing array (if the superblocks are untouched, this is usually enough to access the array when mounting).
 
If it was a RAID5 array, recreating it as a RAID5 with the same parameters wouldn't screw up the data; however, if it was a RAID10 and he created a RAID5 over it... that is where the problem comes in. Never use the create command when you don't want to lose data.

Do you have output from "mdadm -E" from before you used --create the 1st time?
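
For next time, the mdadm -E output being asked for here can be saved to files before any --create is attempted. A minimal sketch: save_examine and the EXAMINE override are made-up names, and the output directory is whatever writable location the rescue environment offers.

```shell
#!/bin/sh
# Save the examine output of each member to <outdir>/<partition>.examine.
# $1 = output directory, remaining arguments = member partitions.
# EXAMINE can be overridden to test the script without mdadm installed.
save_examine() {
    outdir=$1; shift
    mkdir -p "$outdir" || return 1
    for dev in "$@"; do
        name=$(basename "$dev")
        # Intentionally unquoted so "mdadm --examine" splits into command and flag.
        ${EXAMINE:-mdadm --examine} "$dev" > "$outdir/$name.examine" 2>&1
    done
}

# Usage (directory name is an assumption):
#   save_examine /root/md-before /dev/sda1 /dev/sdc1 /dev/sdd1 /dev/sde1
```

Keeping these files somewhere off the array means the original level, device count, chunk size and metadata version can still be read after the superblocks have been overwritten.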
 
If you seriously need the data on the array and you can't build the array correctly using --assemble, take a Windows desktop with 4 spare SATA ports.
Install Windows XP or 7 on it, grab ReclaiMe and have it scan the disks to retrieve the original RAID configuration, then use R-Studio or similar to create a virtual array and start recovering data.

*Caveat: this may take a few days' worth of scanning.

I've had a few RAID5s fail, so I've done my fair share of rebuilds.
 