howto change MDADM UUID / superblock, "superblock on /dev/nnn doesn't match others"

Original thread: http://ubuntuforums.org/archive/index.php/t-410136.html

djamu
April 15th, 2007, 12:41 PM
After switching my server from 6.06 to 7.04 (I'm skipping 6.10 because of the broken sata_uli driver with the M5281 chipset) to support my new hardware (sata_mv + fixed sata_uli in the 2.6.20 kernel), I did the following:

To be on the safe side I removed my RAID disks
clean installation of 7.04 + complete update + mdadm install (before attaching any raid disk)
halt
attached raid array
boot

The array is raid5 with 4 disks; I had previously removed 1 disk,
so the array was temporarily running in degraded mode (which should have been fine).

The array refuses to assemble:

root@ubuntu:/# mdadm --assemble /dev/md0 /dev/hdg1 /dev/hdh1 /dev/sda1
mdadm: superblock on /dev/sda1 doesn't match others - assembly aborted


I brought the array back to the 6.06 server, same result.... so this means :evil: 7.04 wrote something to /dev/sda1 without notifying or consulting me - it wasn't even mounted and doesn't exist in fstab; I always mount manually.

It's similar to (yet different from) my previous post:
http://ubuntuforums.org/showthread.php?t=405782&highlight=mdadm


root@ubuntu:/# mdadm --examine /dev/sda1
/dev/sda1:
Magic : a92b4efc
Version : 00.90.03
UUID : 8d754c1d:5895bb70:b1e8e808:894665ea
Creation Time : Sun Sep 10 22:51:43 2006
Raid Level : raid5
Device Size : 156288256 (149.05 GiB 160.04 GB)
Array Size : 468864768 (447.14 GiB 480.12 GB)
--------snipsnip
-
/dev/hdg1:
Magic : a92b4efc
Version : 00.90.03
UUID : 8d754c1d:5895bb70:c89ffdee:815a6cef
Creation Time : Sun Sep 10 22:51:43 2006
Raid Level : raid5
Device Size : 156288256 (149.05 GiB 160.04 GB)
Array Size : 468864768 (447.14 GiB 480.12 GB)
-------snipsnip
-
/dev/hdh1:
Magic : a92b4efc
Version : 00.90.03
UUID : 8d754c1d:5895bb70:c89ffdee:815a6cef
Creation Time : Sun Sep 10 22:51:43 2006
Raid Level : raid5
Device Size : 156288256 (149.05 GiB 160.04 GB)
Array Size : 468864768 (447.14 GiB 480.12 GB)

root@ubuntu:/# mdadm --assemble /dev/md0 /dev/hdg1 /dev/hdh1 /dev/sda1
mdadm: superblock on /dev/sda1 doesn't match others - assembly aborted


While hdg1 and hdh1 are OK, the UUID / superblock for sda1 (and the removed sdb1) changed.

How do I fix this ?

Guess there goes my Sunday afternoon.

Thanks a lot!

djamu
April 24th, 2007, 05:21 PM
OK, resolved this.

I don't know (yet) how this happened, but it seems you can re-create your array
(I tested this first with dummy arrays on VMware).
I used one missing device when defining the array to make sure it didn't start a (possibly wrong) resync.
It's very important to define the EXACT sequence your array was previously in (makes sense, as it then finds its blocks where they were before).
Do:
mdadm -E /dev/sda1 (or whatever device + partition you're using), and repeat this for every device in the array.

root@feisty-server:/# mdadm -E /dev/sda1

----snip

Number Major Minor RaidDevice State
this 0 8 1 0 active sync /dev/sda1

0 0 8 1 0 active sync /dev/sda1
1 1 0 0 1 faulty removed
2 2 22 1 2 active sync /dev/hdc1
3 3 22 65 3 active sync /dev/hdd1
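
To collect the same info for every member in one go, a small loop like the following sketch works (the device list is only an example from this post - adjust it to your disks):

for dev in /dev/sda1 /dev/hdc1 /dev/hdd1; do
    echo "=== $dev ==="
    mdadm -E "$dev" | grep -E 'UUID|this'    # prints the UUID plus the "this ..." slot line
done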


and then do

mdadm --create /dev/md0 --assume-clean --level=5 --raid-devices=4 /dev/sda1 missing /dev/hdc1 /dev/hdd1


It gave me some info about the previous state of the array and asked if I really wanted to continue > Yes

Mounted it & voila, it worked again :)

(I tested this on VMware without the "--assume-clean" flag, and even with --zero-superblock, but I didn't dare to do that with my real array - guess it should be OK.)

if you give the wrong sequence >

/dev/sda1 missing /dev/hdd1 /dev/hdc1
or
/dev/sda1 /dev/hdc1 /dev/hdd1 missing


the array will still be created but will refuse to mount. DO NOT RUN FSCK on the MD device, as it will definitely kill your data. Just try again with another sequence.
(The underlying physical device blocks actually have nothing to do with the filesystem of the MD device.)
While studying the subject, I noticed there's a lot of confusion regarding MD superblocks; just keep in mind that both the physical device (as a member of the array) and the MD device (the complete array with its filesystem) have superblocks....
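
A cautious way to test a guessed order before touching anything else (just a sketch, assuming an ext3 filesystem on the array and an existing /mnt mount point) is a read-only mount:

mdadm --create /dev/md0 --assume-clean --level=5 --raid-devices=4 /dev/sda1 missing /dev/hdc1 /dev/hdd1
mount -o ro /dev/md0 /mnt    # if it mounts and the data looks sane, the order was right
umount /mnt
mdadm --stop /dev/md0        # stop the array again if you want to try a different order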

hope it helps someone.

Gruelius
April 29th, 2007, 04:56 AM
That worked for me, however after I rebooted mdadm tells me that it can't find any of the devices.

julius@tuxserver:~$ sudo mdadm --assemble /dev/md0
mdadm: no devices found for /dev/md0


julius@tuxserver:~$ sudo mdadm -E /dev/hdb/dev/hdb:
Magic : a92b4efc
Version : 00.90.00
UUID : daf47178:4eba9cde:1ed6dcb2:94163062
Creation Time : Sun Apr 29 15:55:56 2007
Raid Level : raid5
Device Size : 195360896 (186.31 GiB 200.05 GB)
Array Size : 781443584 (745.24 GiB 800.20 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 0

Update Time : Sun Apr 29 18:31:17 2007
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Checksum : 1dd6f07c - correct
Events : 0.12

Layout : left-symmetric
Chunk Size : 128K

Number Major Minor RaidDevice State
this 0 3 64 0 active sync /dev/hdb

0 0 3 64 0 active sync /dev/hdb
1 1 33 0 1 active sync /dev/hde
2 2 33 64 2 active sync /dev/hdf
3 3 34 0 3 active sync /dev/hdg
4 4 34 64 4 active sync /dev/hdh


and then

julius@tuxserver:~$ sudo mdadm --assemble /dev/md0 /dev/hdb /dev/hde /dev/hdf /dev/hdg /dev/hdh
mdadm: superblock on /dev/hde doesn't match others - assembly aborted



..sigh.. any ideas?

djamu
April 30th, 2007, 08:29 AM
That worked for me, however after I rebooted mdadm tells me that it can't find any of the devices.
[...]
..sigh.. any ideas?

sure,

Sidenote, not really relevant but worth the info:
> It seems that you're not using partitions (not that it matters much), but since there's no partition table, other OSes (M$) might write something (initialize it), destroying at least a couple of sectors - as I said, not really relevant if those disks never see Windows, but it's still good practice to use partitions.
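
If you do want to put a partition table on a blank disk, a sketch along these lines should do it (parted syntax may vary slightly by version, and /dev/hdb is just an example - this wipes whatever is on that disk, so only run it on an empty drive, never on an existing array member):

parted -s /dev/hdb mklabel msdos                # new empty MBR partition table
parted -s /dev/hdb mkpart primary 0% 100%       # one partition spanning the disk
parted -s /dev/hdb set 1 raid on                # mark it as a raid member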

The "sudo mdadm -E /dev/hdb/dev/hdb" at the beginning is probably a paste typo, right?

Just do this for every drive (and post the output).
It's very probable that some drive letters changed (e.g. /dev/hdf that became /dev/hdi);
assembling them with the old drive names won't work in that case.
(I've got an earlier post about that here:
http://ubuntuforums.org/showthread.php?t=405782
This happened after I inserted a removable drive and rebooted while it was inserted.
It hasn't happened again since then. Did you recently do a system upgrade? If yes, chances are that MDADM got upgraded to ..... )

An example:
Note: examining /dev/hdc1 gives a result for /dev/hdg1 (the "this ... /dev/hdg1" line below). Read this as follows: "the device /dev/hdc1, formerly known as /dev/hdg1".


root@ubuntu:/# mdadm -E /dev/hdc1
/dev/hdc1:
Magic : a92b4efc
Version : 00.90.00
UUID : 23d10d44:d59ed967:79f65471:854ffca8
Creation Time : Mon Apr 9 21:21:13 2007
Raid Level : raid0
Device Size : 0
Raid Devices : 2
Total Devices : 2
Preferred Minor : 1

Update Time : Mon Apr 9 21:21:13 2007
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : 2e17ac36 - correct
Events : 0.1

Chunk Size : 64K

Number Major Minor RaidDevice State
this 0 34 1 0 active sync /dev/hdg1

0 0 34 1 0 active sync /dev/hdg1
1 1 34 65 1 active sync /dev/hdh1



To make a long story short: do an mdadm -E for all devices, write down their current & old names, and make a table [ raid device nr. / new name / old name ].

- If all UUIDs (of the physical devices!, not the MD device) match, just assemble it using the new names (use the raid device nr. to define the correct sequence).

- If UUIDs differ, re-create using the new names & raid device nr. for the correct sequence (use 1 missing, so your array doesn't start resyncing & possibly wiping a wrongly assembled array; if everything is fine and you're able to mount it, you can re-add the missing disk).
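
A quick way to build that table (just a sketch - the device globs are examples; point them at whatever fdisk -l lists on your machine):

for dev in /dev/hd[a-h]1 /dev/sd[a-d]1; do
    [ -e "$dev" ] || continue
    # print: raid device nr. / old name (as recorded in the superblock) / current name
    mdadm -E "$dev" 2>/dev/null | awk -v now="$dev" '/^this/ {print $5, $NF, now}'
done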

If in doubt just ask again, I'll give you the correct command

good luck

DannyW
May 12th, 2007, 11:59 AM
Hello. Nice howto, but unfortunately the UUIDs are messed up again after rebooting.

I posted a thread earlier today before I found yours:
http://ubuntuforums.org/showthread.php?t=441040

Basically, I want to set up a raid5 array consisting of sda1 (500GB), sdc1 (500GB) and md0 (raid0: 200GB+300GB).

The raid0 array (/dev/md0) created fine. When creating the raid5 array (/dev/md1), it initially started as a degraded, rebuilding array, which is normal for raid5. After a couple of hours, when this had finished, the raid5 array looked normal.

After rebooting it initialised with just 2 drives. I could fix this by mdadm --manage /dev/md1 --add /dev/md0, then after a couple of hours of rebuilding it was fine again. Until the reboot.

After reading this thread I noticed the UUIDs given by mdadm --misc -E /dev/sda1 /dev/sdc1 /dev/md0 were not all the same. md0's was different; it was the same as the UUIDs of the devices in the raid0 array, /dev/md0.

So I used your method. The /dev/md1 was mountable with /dev/md0 missing, and so I knew it was safe to add /dev/md0 to the array. I did this and then after a couple of hours the array looked fine.

I rebooted, and now sda1 and md1 share the UUID of the devices in the raid0 array, and sdc1 has a unique UUID.

I hope I have explained this clearly enough.

Any help would be greatly appreciated!

Thank you.

danny@danny-desktop:~$ cat /etc/mdadm/mdadm.conf
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md0 level=raid0 num-devices=2 UUID=2ef71727:a367450b:4f12a4b2:e95043a1
ARRAY /dev/md1 level=raid5 num-devices=3 UUID=4c4b144b:ae4d69bc:355a5a07:0f3721ab


# This file was auto-generated on Fri, 11 May 2007 18:08:52 +0100
# by mkconf $Id: mkconf 261 2006-11-09 13:32:35Z madduck $

danny@danny-desktop:~$ cat /etc/fstab
# /etc/fstab: static file system information.
#
#
# /dev/sda3
UUID=99873af1-7d82-476c-975e-7165fedb7cee / ext3 defaults,errors=remount-ro 0 1
# /dev/sda1
UUID=56703a2b-449e-4c73-b937-41e5de341a0d /boot ext3 defaults 0 2
/dev/mapper/vghome-lvhome /home reiserfs defaults,nodev,nosuid 0 2
# /dev/sda2
UUID=801702de-257d-481b-b0de-9ad2108893da none swap sw 0 0
/dev/scd0 /media/cdrom0 udf,iso9660 user,noauto 0 0
proc /proc proc defaults 0 0


Note: I noticed during boot there was a message:
"mdadm: no devices listed in conf file were found"

Also, I'm using LVM2, though I doubt this is related. /dev/md1 is the only PhysicalVolume in my LVM VolumeGroup.

djamu
May 12th, 2007, 03:45 PM
Basically, I want to set up a raid5 array consisting of sda1(500GB), sdc1(500GB) and md0(raid0:200GB300GB).


Mmm, interesting, but in my opinion this is doomed never to work consistently.
Why?
For MD1 to work, MD0 has to be completely assembled first; otherwise MD1 will start with 1 device missing (MD0).
Also, you won't benefit (speed-wise) from that stripe (remember RAID5 is also a stripe, essentially RAID0 + parity), because your actual speed depends on the slowest device (like an internet connection).

I guess the idea was to have raid5 with 3 x 500 GB devices.
Because of the speed issue I mentioned before, it would work equally well with a linear raid,
which does the same thing as an LVM / EVMS volume.... I prefer using EVMS whenever I can, since it has a lot more options - if you're using a desktop (GNOME?) there is a nice GUI in the repository.

In your case you'll have to stick with LVM, because EVMS doesn't have kernel support (yet) - I might be wrong on this one regarding the new Feisty kernel.

So instead of using a raid0 for the 2 drives (200 + 300), use LVM to build a 500 GB device and use that one for your RAID5.
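
A minimal sketch of that (vgraid and lv500 are made-up names, and sdb1 / sdd1 stand in for the 200 GB and 300 GB partitions):

pvcreate /dev/sdb1 /dev/sdd1             # turn both partitions into LVM physical volumes
vgcreate vgraid /dev/sdb1 /dev/sdd1      # one volume group spanning both
lvcreate -l 100%FREE -n lv500 vgraid     # one ~500 GB logical volume (use the extent count from vgdisplay if %FREE isn't accepted)
mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sda1 /dev/sdc1 /dev/vgraid/lv500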

A workaround would be to remove your MD1 device from your mdadm.conf, to make sure MD0 gets assembled properly, and then assemble MD1 manually (script this :) )...... not very handy, but doable...
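
Something like this sketch for the script (the raid0 member names are just examples - substitute your own; it simply waits until MD0 is up before assembling MD1):

#!/bin/sh
# assemble the inner raid0 first
mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdd1
# wait until md0 is reported active before assembling the raid5 on top of it
until grep -q '^md0 : active' /proc/mdstat; do sleep 1; done
mdadm --assemble /dev/md1 /dev/sda1 /dev/sdc1 /dev/md0
vgchange -ay    # re-activate any LVM volume group sitting on md1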


After rebooting it initialised with just 2 drives. I could fix this by mdadm --manage /dev/md1 --add /dev/md0, then after a couple of hours of rebuilding it was fine again. Until the reboot.

So I used your method. The /dev/md1 was mountable with /dev/md0 missing, and so I knew it was safe to add /dev/md0 to the array. I did this and then after a couple of hours the array looked fine.

I rebooted, and now sda1 and md1 share the UUID of the devices in the raid0 array, and sdc1 has a unique UUID.


What method? :) Like I said before, there's no way to tell which device gets assembled first. Use LVM for your third RAID5 device.


Note: I noticed during boot there was a message:
"mdadm: no devices listed in conf file were found"


Just ignore this. (You're on Feisty, right?) It has something to do with the new MDADM version.
For your info - & contrary to what the manual says - you don't need mdadm.conf...
(actually it depends on the mode mdadm is running in); your arrays will be assembled as soon as the md kernel module detects them.
None of my servers has an mdadm.conf (although I must tell you that those only run Dapper & Edgy),
nor do the arrays appear in fstab, because I prefer to mount them manually (some cron scripts).
I've got a Feisty desktop (with raid) which has an automatically generated mdadm.conf; didn't check if you can delete this. Dapper, Edgy & Feisty all use different versions of MDADM (Dapper v1.xx - got to check this - Edgy & Feisty v2.xx).
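
For what it's worth, you can also assemble by array UUID instead of by device names (a sketch, reusing the md1 UUID from the mdadm.conf posted above; mdadm will ignore any listed device whose superblock UUID doesn't match):

mdadm --assemble /dev/md1 --uuid=4c4b144b:ae4d69bc:355a5a07:0f3721ab /dev/sda1 /dev/sdc1 /dev/md0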


Also, I'm using LVM2, though I doubt this is related. /dev/md1 is the only PhysicalVolume in my LVM VolumeGroup.


Huh? Mmmm... maybe you'd better post the following... so you made an LVM volume out of the RAID volume.
Use EVMS instead: way more options, and since it doesn't run (yet) at boot time....


fdisk -l

cat /proc/mdstat




( I'll have to check some things, so expect this reply to get altered )

cheers

Jan

DannyW
May 13th, 2007, 06:11 AM
Wow! Thank you very much for the informative and speedy reply!

The only reason I used LVM on the raid5 was so that I could extend the volume if I ever needed to. But I see this can be done easily with mdadm, which makes more sense. I just wasn't sure if it was possible to remove LVM whilst keeping my data safe, so I just left it in place.

Using LVM for the raid0 makes perfect sense, and I assume this can be done without losing data, as the raid 5 can run degraded whilst I set up the 3rd device.

Would there then be issues with which starts first, LVM or mdadm?

Some good news is, I again used information from your earlier post to stop the array and recreate it and, for now, all is working ok.

I stopped the arrays (this time both of them, previously I only did the raid5), zeroed the super blocks and recreated with the same device order.

mdadm --manage /dev/md1 --stop
mdadm --manage /dev/md0 --stop
mdadm --zero-superblock /dev/sdb1
mdadm --zero-superblock /dev/sdd1
mdadm --zero-superblock /dev/sda1
mdadm --zero-superblock /dev/sdc1
mdadm --zero-superblock /dev/md0
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb1 /dev/sdd1
mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sda1 /dev/sdc1 /dev/md0

After the couple-hour rebuild things are OK. I haven't rebooted many times since I completed this step, so I'm not very confident about it holding up.
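
A quick sanity check after each reboot (just a sketch) to see whether both arrays came up and whether the member UUIDs still agree:

cat /proc/mdstat                       # both md0 and md1 should be listed as active
mdadm --detail /dev/md0 /dev/md1       # per-array state and member list
mdadm --examine --scan                 # array UUIDs as recorded in the member superblocks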

Meanwhile, I shall do some research on EVMS, to see what it can offer me.

I can switch md0 over to LVM easily, but I don't know if I can remove LVM while keeping my data in place; I'll try to find some info on this too.

Once again, thank you for that great response! I may be back in touch very soon when it breaks ;)

Kind regards,
Danny

djamu
May 13th, 2007, 06:48 AM
np.
It won't break that easily, as long as it runs. :lol:

Moreover, one of the problems I had lately was that my Linux still recognized a reiserFS on a fully (not quick!) formatted NTFS partition (which had reiserFS on it before).
I needed to "dd" zeros over the first GBs (didn't have the patience to do a full zero dd).
http://ubuntuforums.org/showthread.php?t=422549
It's actually quite reassuring that these things are so persistent :-/

Just make sure you post your results; others might benefit from it too.


Jan

 
