No complete reboot with software RAID when one disk becomes faulty

Bug #125561 reported by gerbra
This bug report is a duplicate of: Bug #120375: cannot boot raid1 with only one disk.
Affects: Ubuntu
Status: Incomplete
Importance: Undecided
Assigned to: Brian Murray

Bug Description

System: Edubuntu 7.04, kernel 2.6.20-16-generic, x86_64

The server has 4 SCSI disks.
There is a combination of software RAID1 and RAID10:
/boot and swap are RAID1;
/ and /home are RAID10.

I simulated a total failure of one disk.
After that, the system does not boot correctly with 3 of the 4 devices.
I get stuck in the busybox/initramfs shell. There I was able to reassemble the md arrays
with mdadm.
But the server does not boot on its own with only 3 disks, and that should not happen.
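
For reference, the manual recovery at the (initramfs) prompt looks roughly like this (a sketch, not a fix; the exact result depends on the mdadm.conf inside the initrd, and --run is what allows a degraded array to be started):

    # assemble every array known to the initrd's mdadm.conf,
    # starting them even if a member disk is missing (--run)
    mdadm --assemble --scan --run
    # verify that the arrays are up with 3 of the 4 members
    cat /proc/mdstat
    # leave the busybox shell so the boot can continue from the now-available root device
    exit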

I have put a script calling mdadm -A -s into the various script sections of the
initramfs and generated corresponding new initrd images. I also put the raid1, raid10
and md/SCSI modules into these initrds.
As a result I see that during the initrd boot the md devices come up with 3 of the 4 devices.
But mounting the root filesystem and starting init is still not possible; I still only end up in busybox.
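
To illustrate the approach (a sketch only, not an official fix; the file name force-md is my own choice, the paths are the standard initramfs-tools ones), I mean a boot script like the following at /etc/initramfs-tools/scripts/local-top/force-md:

    #!/bin/sh
    # initramfs-tools boot script boilerplate
    PREREQ=""
    prereqs() { echo "$PREREQ"; }
    case "$1" in
        prereqs) prereqs; exit 0 ;;
    esac
    # assemble all arrays known to mdadm.conf, starting them even when degraded
    /sbin/mdadm --assemble --scan --run

and then, on the running system, rebuilding the initrd:

    # list the needed raid modules so they end up in the initrd
    echo raid1  >> /etc/initramfs-tools/modules
    echo raid10 >> /etc/initramfs-tools/modules
    chmod +x /etc/initramfs-tools/scripts/local-top/force-md
    # regenerate the initrd for the running kernel
    update-initramfs -u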

Before busybox I see the following in the tty logs (sorry, I only have notes on paper, I am not at the server):
Trying to resume from /dev/md0
Then I see a modprobe call, but only the usage text of modprobe.
Then an entry that /etc/fstab could not be read.
Then that /proc and /sys could not be mounted.
Then: Target filesystem doesn't have /sbin/init
Then I end up in busybox.

I have tested the same scenario on another distribution in a virtual machine, and there the system
boots after a "disk fault" with only 3 of the 4 devices.

Sorry for my bad English.
I can provide more information if needed.

Thank you.

Revision history for this message
Brian Murray (brian-murray) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. You reported this bug a while ago and there hasn't been any activity on it recently. We were wondering if this is still an issue for you? Thanks in advance.

Revision history for this message
gerbra (gerhard-brauer) wrote :

Thanks for your answer; yes, I probably should have given this report a higher priority.

In the meantime we have updated this server to Edubuntu 7.10. I cannot say for sure whether the scenario above is gone after the update, because the server is in production. I could/will do a test, maybe on the weekend of week 50.
But I assume that our RAID problem is still "alive".

At boot time (the server is not always running) we sometimes have degraded software RAID arrays.
After the update to 7.10 it is better, but out of ~10 boots we still get ~2 degraded arrays.
The hard disks are 100% OK; I assume this may be a timing problem at boot.
The degraded /dev/md device changes randomly, and so does the /dev/sdXY device that is missing from the array.
I have tested with the rootdelay parameter and also with some sleeps in the initrd scripts.

It's a little annoying to always have to do a mdadm /dev/mdX -a /dev/sdXY, watch the mdstat output and wait for the next false alarm ;-)
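
For completeness, the repair cycle I mean is roughly this (a sketch; /dev/md1 and /dev/sdb2 are placeholders for whatever /proc/mdstat shows as degraded and missing):

    # see which array is degraded and which member dropped out
    cat /proc/mdstat
    # re-add the dropped partition to the degraded array
    mdadm /dev/md1 -a /dev/sdb2
    # watch the resync progress until the array is clean again
    watch cat /proc/mdstat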

Revision history for this message
Alexander Dietrich (adietrich) wrote :

Hi, this is a duplicate of bug #120375, which contains a potential fix in one of the last comments by Ken.

Unfortunately, bug #120375 is marked as "not in Ubuntu", so it doesn't get much attention, I'm afraid.
