cannot boot raid1 with only one disk

Bug #120375 reported by peterh
This bug affects 2 people
Affects                    Status        Importance  Assigned to  Milestone
initramfs-tools            Fix Released  Undecided   Unassigned
grub (Ubuntu)              Fix Released  Undecided   Unassigned
  Hardy                    Fix Released  Medium      Unassigned
initramfs-tools (Ubuntu)   Fix Released  Medium      Unassigned
  Hardy                    Fix Released  Medium      Unassigned
mdadm (Ubuntu)             Fix Released  Medium      Unassigned
  Hardy                    Fix Released  Medium      Unassigned

Bug Description

The impact of the bug on users: systems with root on a RAID array will not be able to boot if the array is degraded. Users affected by this will encounter an unusable system after a reboot (say, after a kernel upgrade).

Justification for backporting the fix to the stable release: Hardy is an LTS edition. It is expected that people will continue to use this version and not upgrade to Intrepid. People who have upgraded from Dapper LTS to Hardy LTS are affected too.

TEST CASE
Build a clean system with root on RAID, with Ubuntu 8.04 LTS. Degrade the root array. Reboot and shiver.
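For example, a degraded root array can be produced either by pulling one member disk with the machine powered off, or by failing and removing it in software first (device names below are only an illustration; substitute your own array and partition):

 # assuming root is on /dev/md0 built from sda1 and sdb1
 sudo mdadm /dev/md0 --fail /dev/sdb1
 sudo mdadm /dev/md0 --remove /dev/sdb1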

A discussion of the regression potential of the patch and how users could get inadvertently affected:
- the only users that could be affected, as far as we know, are the ones that have already applied a workaround and altered their system files accordingly.

Binary package hint: mdadm

If I unplug one HD from my RAID1 array I cannot boot successfully, because the RAID is only started if all disks are available, due to:

: "${MD_DEGRADED_ARGS:= --no-degraded}" in /usr/share/initramfs-tools/scripts/local-top/mdadm

my workaround is:

/etc/initramfs-tools/hooks/startdegradedraid

#!/bin/sh
#
# Copyright © 2006 Martin F. Krafft <email address hidden>
# based on the scripts in the initramfs-tools package.
# released under the terms of the Artistic Licence.
#
# $Id: hook 281 2006-12-08 08:14:44Z madduck $
#

set -eu

PREREQ="udev"

prereqs()
{
        echo "$PREREQ"
}

case ${1:-} in
  prereqs)
    prereqs
    exit 0
    ;;
esac

MDADM=$(command -v mdadm 2>/dev/null) || :
[ -x "$MDADM" ] || exit 0

DESTCONFIG="$DESTDIR/conf/md.conf"

echo "MD_DEGRADED_ARGS=' '" >> "$DESTCONFIG"

exit 0

Revision history for this message
peterh (peter-holik) wrote :

Sorry folks, I was too fast.

This does not work if all discs are present; now I understand why MD_DEGRADED_ARGS is included.

my new workaround is to add a boot menu entry in grub

title Ubuntu, kernel 2.6.20-16-generic (raid defect)
root (hd0,1)
kernel /boot/vmlinuz-2.6.20-16-generic root=/dev/md1 ro raid_degraded
initrd /boot/initrd.img-2.6.20-16-generic

and /etc/initramfs-tools/scripts/init-premount/raid_degraded

#!/bin/sh

set -eu

PREREQ="udev"

prereqs()
{
        echo "$PREREQ"
}

case ${1:-} in
  prereqs)
    prereqs
    exit 0
    ;;
  *)
    . /scripts/functions
    ;;
esac

if [ -e /scripts/local-top/md ]; then
  log_warning_msg "old md initialisation script found, getting out of its way..."
  exit 1
fi

MDADM=$(command -v mdadm 2>/dev/null) || :
[ -x "$MDADM" ] || exit 0

if grep -q raid_degraded /proc/cmdline 2>/dev/null; then
  echo "MD_DEGRADED_ARGS=' '" >> /conf/md.conf
fi

exit 0

Now if a disk is defective and I want to start with only one disk, I choose "raid defect" from the boot menu.

Revision history for this message
Peter Haight (peterh-sapros) wrote :

I've got a different way to fix this. After reading Bug #75681, it became clear why they have MD_DEGRADED_ARGS in there. They have that because /scripts/local-top/mdadm gets called every time a device marked as RAID shows up, but they only want mdadm to build the array once all the devices have come up.

So I've stuck a line in the startup script that tries to mount root and if mounting root times out, then we try to run mdadm again, but this time we let it try and run with degraded disks. This way it will still startup automatically in the presence of RAID failures.
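The idea, roughly, is a fallback like the following inside the initramfs "local" script, after the root-wait loop has timed out (this is only a sketch of the approach; the actual patch is attached to the bug):

 if [ ! -e "${ROOT}" ]; then
         # root never appeared; by now udev should have found every disk
         # that is going to show up, so allow a degraded assembly
         /sbin/mdadm --assemble --scan
 fi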

Revision history for this message
Peter Haight (peterh-sapros) wrote :

So, there is something wrong with that patch. Actually it seems to be working great, but when I disconnect a drive to fail it, it boots up immediately instead of trying mdadm after the timeout. So I'm guessing that the mdadm script is getting called without the from-udev parameter somewhere else. But it is working in some sense because the machine boots nicely with one of the RAID drives disconnected, or with both of them properly setup. So there might be some race problem with this patch.

Revision history for this message
brunus (reg-paolobrunello) wrote :

Hey Peter, when I try to apply your patch I get asked where the /scripts/local file is located. I couldn't find it. Could you please specify it better? Thanks,

brunus

Revision history for this message
peterh (peter-holik) wrote : Re: [Bug 120375] Re: cannot boot raid1 with only one disk

> Hey Peter, when I try to apply your patch I get questioned where the
> /scripts/local file is located. I coulnd't find it. Could you please
> specify it better? thanks ,

this patch is not from me - it is from the other Peter in this bug report

cu Peter

Revision history for this message
brunus (reg-paolobrunello) wrote :

Hey Peter Haight,
brunus again. I can't find any mdadm file in /usr/share/initramfs-tools/scripts/local-top so the patch just hangs. I'm using Edubuntu 7.10 btw: is this patch just for Feisty?

thanks,

Paolo

Revision history for this message
Peter Haight (peterh-sapros) wrote :

I've only tried it on Feisty. I don't have a Gutsy machine handy. You have the 'mdadm' package installed, right? If so, then they must have moved stuff around. You could try this:

find / -type f -print0 | xargs -0 grep MD_DEGRADED_ARGS

That will search the whole file system for a file with MD_DEGRADED_ARGS in it, but if they've moved stuff around, the patch probably won't work anyway.

Revision history for this message
Daniel Pittman (daniel-rimspace) wrote :

Peter Haight <email address hidden> writes:

> I've only tried it on fiesty. I don't have a gutsy machine handy. You
> have the 'mdadm' package installed, right? If so, then they must have
> moved stuff around. You could try this:

The model for starting mdadm disks has changed substantially in gutsy;
it is now driven from a udev rule, building the devices as disks are
discovered.

Regards,
        Daniel
--
Daniel Pittman <email address hidden> Phone: 03 9621 2377
Level 4, 10 Queen St, Melbourne Web: http://www.cyber.com.au
Cybersource: Australia's Leading Linux and Open Source Solutions Company

Revision history for this message
brunus (reg-paolobrunello) wrote :

Thanks for the information Daniel,
but it's still unclear to me whether this problem has been solved yet.

brunus

Revision history for this message
Fumihito YOSHIDA (hito) wrote :

Dear brunus,

The problem is not solved in 7.04/7.10.

You have to use "raid_degraded" grub entry with /etc/initramfs-tools/scripts/init-premount/raid_degraded way.
(and update your initramfs, execute "sudo update-initramfs -u")

And, I test with 7.10, mdadm_2.6.2-1ubuntu2_<arch>.deb(7.10's) *does not work* in this way.
When you use 7.10, please down-grade mdadm package.

Use 7.04's mdadm package: "mdadm_2.5.6-7ubuntu5_<arch>.deb".

Revision history for this message
Davias (davias) wrote :

Dear Fumihito,
Thank you for your help with this very annoying bug in Ubuntu (so much for the "new & improved" version)...

I'm running 7.10, tried the above procedure and realized (as you stated) that mdadm_2.6.2-1ubuntu2 DOES NOT work on 7.10 with the grub trick.

3 questions:

1) How do I downgrade to mdadm_2.5.6-7ubuntu5_<arch>.deb as you suggest? In the Synaptic package manager the "force" option is not available...

2) I'm a little bit scared of running a previous version of mdadm on my RAID1 created with 2.6.2: is it safe?

3) Since this is just a workaround and not a solution (the system should automatically start without user selecting a grub option), shall I wait for a new mdadm?

Thanks in advance for yours and everybody's help

Revision history for this message
Fumihito YOSHIDA (hito) wrote :

Dear Davias,

1)
I assume that you use i386; if you use another architecture (e.g. amd64),
please use your arch's package.

- Please download from http://archive.ubuntu.com/ubuntu/pool/main/m/mdadm/
 $ wget http://archive.ubuntu.com/ubuntu/pool/main/m/mdadm/mdadm_2.5.6-7ubuntu5_i386.deb

- Install with the dpkg command (in your gnome-terminal):
 $ sudo dpkg -i mdadm_2.5.6-7ubuntu5_i386.deb
 In this case, we cannot depend on Synaptic.
 Downloading and running dpkg by hand is not a well-mannered procedure, but it is useful.

- And, "hold" this package. This command is important. If you do not set "hold",
 update-manager will upgrade the mdadm package... (and break your effort).
 $ sudo dpkg-hold mdadm

2)
I cannot say for certain about your concern, but I tested it in some cases and
the system was working well.
If you are worried, re-run your mdadm settings with 2.5.6-7ubuntu5.

3)
hmm... This is a difficult question.
The old version of mdadm works well, but I can't understand the reason...
So I can't tell you what you had better do.

Revision history for this message
Davias (davias) wrote :

Dear Fumihito,
first of all, thanks for your fast reply!

1) No, I'm on amd64. I just went to the Ubuntu archive site you suggested and found out that the following exists:

mdadm-udeb_2.6.3+200709292116+4450e59-3ubuntu1_amd64.udeb 13-Dec-2007 01:04 76K
mdadm-udeb_2.6.3+200709292116+4450e59-3ubuntu1_i386.udeb 13-Dec-2007 00:04 73K
mdadm-udeb_2.6.3+200709292116+4450e59-3ubuntu1_powerpc.udeb 13-Dec-2007 01:04 85K
mdadm-udeb_2.6.3+200709292116+4450e59-3ubuntu1_sparc.udeb 13-Dec-2007 01:04 91K

...it seems a fresh new 2.6.3 version of mdadm - maybe it cures the bug ?!?

2) This is the first time I've dealt with mdadm RAID, but from experience on other OSes with SW RAID, I learned (the hard way) that it is safer not to mess around with a driver version different from the one that created the arrays. But if I have no alternatives... I will try.

3) I tried to find details of this new 2.6.3, but found none. Good common sense makes me think that the maintainer of mdadm was aware of the bug and solved it in this new version, making the whole "scripting from grub menu" solution unnecessary...

Suggestions?

Revision history for this message
Davias (davias) wrote :

Searching, I found out that latest release is 2.6.4 - but no option in ubuntu repository (yet?).

ANNOUNCE: mdadm 2.6.4 - A tool for managing Soft RAID under Linux
From: Neil Brown <email address hidden>
To: <email address hidden>
Subject: ANNOUNCE: mdadm 2.6.4 - A tool for managing Soft RAID under Linux
Date: Fri, 19 Oct 2007 16:06:29 +1000
Message-ID: <email address hidden>
Archive-link: Article, Thread

I am pleased to announce the availability of
   mdadm version 2.6.4

It is available at the usual places:
   http://www.cse.unsw.edu.au/~neilb/source/mdadm/

Do any of the following changes apply to our bug?

Changes Prior to 2.6.4 release
    - Make "--create --auto=mdp" work for non-standard device names.
    - Fix restarting of a 'reshape' if it was stopped in the middle.
    - Fix a segfault when using v1 superblock.
    - Make --write-mostly effective when re-adding a device to an array.
    - Various minor fixes

Changes Prior to 2.6.3 release
    - allow --write-behind to be set for --grow.
    - When adding new disk to an array, don't reserve so much bitmap
      space that the disk cannot store the required data. (Needed when
      1.x array was created with older mdadm).
    - When adding a drive that was a little too small, we did not get
      the correct error message.
    - Make sure that if --assemble find an array in the critical region
      of a reshape, and cannot find the critical data to restart the
      reshape, it gives an error message.
    - Fix segfault with '--detail --export' and non-persistent
      superblocks.
    - Various manpage updates.
    - Improved 'raid4' support (--assemble, --monitor)
    - Option parsing fixes w.r.t -a
    - Interpret "--assemble --metadata=1" to allow any version 1.x
      metadata, and be more specific in the "metadata=" message printed
      with --examine --brief
    - Fix spare migration in --monitor.

Changes Prior to 2.6.2 release
    - --fail detached and --remove faulty can be used to fail and
      remove devices that are no longer physically present.
    - --export option for --detail or present information in a format
      that can be processed by udev.
    - fix internal bitmap allocation problems with v1.1, v1.2 metadata.
    - --help now goes to stdout so you can direct it to a pager.
    - Various manpage updates.
    - Make "--grow --add" for linear arrays really work.
    - --auto-detect to trigger in-kernel autodetect.
    - Make return code for "--detail --test" more reliable. Missing
      devices as well as failed devices cause an error.

Revision history for this message
brunus (reg-paolobrunello) wrote :

Davias,
have you tried any of the 2 releases?

brunus

Revision history for this message
Davias (davias) wrote :

Dear brunus,
no, I have not. I was thinking about using Fumihito's procedure with 2.5.6, but then I discovered that 2.6.3 exists in the Ubuntu repository (although the Synaptic package manager does not find any update beyond 2.6.2) and was thinking about installing that. Then I discovered that 2.6.4 is the latest mdadm version, as shown above. I downloaded the source... and stopped.

I do not have enough knowledge to compile & install something as critical as the driver for my RAID (my RAID with data on it...); it's not that I'm scared of it, but this is my production machine and I cannot risk data loss or restore time.

Also, I'm not convinced of the results. If this just leads to another "so-so" procedure like selecting "faulty drive" from the grub menu... I'll wait for a cleaner solution - a stable and safe mdadm release that will make my RAID1 start with only 1 disk WITHOUT me having to do anything, like it should.

Thanks all for your precious thinking.

Dave out

Revision history for this message
dbendlin (diego-bendlin-hotmail) wrote :

Well,

I went back to the roots and installed Debian etch, tried mdadm, and it worked just fine.

Regards,

Diego Bendlin

Revision history for this message
Davias (davias) wrote :

Meaning what, exactly?

You replaced ubuntu with debian?

Revision history for this message
dbendlin (diego-bendlin-hotmail) wrote :

Davias,

What I did is replace an Ubuntu Server 7.10 installation with a Debian etch installation where I set up RAID 1 using two SATA disks, and after disconnecting one of the RAID members the system was still able to boot.

AFAIK Ubuntu is Debian's child, so I wanted to see whether the parent reproduced the error too, just to give you some more information to finally fix this issue in Ubuntu, maybe for 8.04.

Kind Regards,

Diego Bendlin

Revision history for this message
Davias (davias) wrote :

Dear Diego,
thank you for clarifying matters, I'm glad that it works for you.

So now you've got RAID1 running as it is supposed to. Just by changing OS...
Is it that difficult to get it to work on Ubuntu?!?
I mean, it is a serious bug - dependent not on the SW component but on the OS version - and no solution?
I thought Ubuntu was one of the most supported distributions... Do we have to wait for another release?

Revision history for this message
dbendlin (diego-bendlin-hotmail) wrote :

Well,

I'm not a Linux guru, so I don't really understand why this is happening on Ubuntu (I have seen the same error since 7.04). I guess this could be issued as a bug fix to the current release, but that will depend on the Ubuntu development team.

In my opinion Ubuntu is great for desktop usage. I use it as my daily desktop and development workstation, and I really like Ubuntu as a desktop; I haven't found a competitor that is just as easy to install and set up. I know other distros are also great, but normally you need a lot of expertise and configuration time to make them work the way Ubuntu does out of the box.

As for the Ubuntu server release, I must agree that this is a serious bug. For now I wouldn't install an Ubuntu server if it needs to work on top of software RAID (mdadm); after spending 2 weeks trying to make Ubuntu Server work, I finally switched to Debian etch for my server installation.

Kind Regards,

Diego Bendlin

Revision history for this message
Jan Krupa (plnt) wrote :

Hi,

The problem can be worked around by issuing this command in the BusyBox shell when Ubuntu is missing one of the RAID disks:

/sbin/mdadm --assemble --scan

+ reboot

It will remove the missing disk from RAID 1 and allow Ubuntu to boot in degraded mode next time.

I think the root cause of the problem is that mdadm is forced not to start in degraded mode by the "--no-degraded" parameter in /etc/udev/rules.d/85-mdadm.rules. If you remove the "--no-degraded" parameter from the mdadm call in /etc/udev/rules.d/85-mdadm.rules and rerun "sudo update-initramfs -u", Ubuntu doesn't refuse to boot even if one of the disks is missing (after this change, no workarounds are needed). The problem is that it then starts in degraded mode in some cases even if both disks are present.
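In practice, the workaround Jan describes amounts to something like the following (back up the rules file first; the sed expression assumes the option appears literally as " --no-degraded" in the rule):

 sudo cp /etc/udev/rules.d/85-mdadm.rules /etc/udev/rules.d/85-mdadm.rules.orig
 sudo sed -i 's/ --no-degraded//' /etc/udev/rules.d/85-mdadm.rules
 sudo update-initramfs -u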

Tested on Ubuntu/Gutsy.

I appreciate any comments.

Thanks,
Jan

Revision history for this message
Ken (ksemple) wrote :

Hi,

I have been following this post for a couple of weeks and trying to solve this problem as a background job for around a month. So far I have tried many things to overcome this problem and the comment by Plnt seemed to be the most promising.

I tried issuing the --assemble --scan from the BusyBox shell as suggested with no luck, the system still wouldn't boot even though I was able to activate my degraded RAID sets in BusyBox.

I found the same problem when modifying the udev rules. Often the arrays would start degraded even when all the disks were available. I think the solution to this problem lies in modifying the udev rules, maybe we could add some code after the --no-degraded start attempt to start the arrays degraded if they haven't already started.

In my view this is a major problem; there is no point using a RAID1 root disk if you can't boot from a single disk when its mirror fails.

Cheers,
Ken

Revision history for this message
Jan Krupa (plnt) wrote :

Hi Ken,

I wasn't able to boot from the degraded array after running "/sbin/mdadm --assemble --scan" in a few cases when I had other disks in my computer. If I disconnected the other disks and attached just the working one, the system booted without a problem in degraded mode (after running the command mentioned above). I think the reason is that mdadm scans for any RAID devices by their signatures on the disk (because there is no /etc/raidtab accessible) and maybe it finds the signatures in a different order each time.

There is also "--run" parameter in mdadm which can help assembling RAID in degraded mode.

Sorry for non-detailed description but I currently don't have the computer with Ubuntu+RAID1 physically with me so I can't do the tests.

Jan

Revision history for this message
Davias (davias) wrote :

Dear Plnt & Ken,
thanks for providing & trying solutions to this "major problem" as it looks to me too.

But have any of you tried mdadm version 2.6.4, that I found around, as I stated a few posts up?

Regards,
Dave

Revision history for this message
Peter Haight (peterh-sapros) wrote :

I haven't messed around with the problem in Gutsy, but what Daniel said above about the difference between Feisty and Gutsy is not correct. Feisty was also launching mdadm as devices were discovered from udev, and that is exactly the problem. Both are pretty much the same; just the code has moved around some. Unfortunately the box I fixed this on is now in production and I haven't had the chance to set up another test one to port my fix to Gutsy, so I'll explain what's going on, and maybe someone else can fix it. The version of mdadm doesn't have anything to do with this problem. This problem is entirely due to the Ubuntu startup scripts.

What Ubuntu is doing is as each device gets discovered by udev it runs:

mdadm --assemble --scan --no-degraded

If your RAID is made up of say sda1 and sdb1, then when 'sda1' is discovered by Linux, udev runs 'mdadm --assemble --scan --no-degraded'. Mdadm tries to build a RAID device using just 'sda1' because that's the only drive discovered so far. It fails because of the '--no-degraded' flag which tells it to not assemble the RAID unless all of the devices are present. If it didn't include the '--no-degraded' flag, it would assemble the RAID in degraded mode. This would be bad because at this point we don't know if 'sdb1' is missing or it just hasn't been discovered by udev yet.

So, then Linux chugs along and finds 'sdb1', so it calls 'mdadm --assemble --scan --no-degraded' again. This time both parts of the RAID (sda1 and sdb1) are available, so the command succeeds and the RAID device gets assembled.
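For reference, the rule driving this (shipped in /etc/udev/rules.d/85-mdadm.rules, as Jan mentioned above) boils down to something of this shape; the udev match keys shown here are illustrative, not a verbatim copy of the Gutsy rule:

 SUBSYSTEM=="block", ENV{ID_FS_TYPE}=="linux_raid*", \
         RUN+="/sbin/mdadm --assemble --scan --no-degraded"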

This all works great if all the RAID devices are working, but since it always runs mdadm with the '--no-degraded' option, it won't assemble the RAID device if, say, 'sda1' is broken or missing.

My solution was to wait until mounting root failed due to a timeout and then try 'mdadm --assemble --scan' without '--no-degraded' to see if we can assemble a degraded RAID device. Hopefully by the time the root mount has timed out, Linux has discovered all of the disks that it can. This works on my Feisty box, but as I said above, stuff got moved around for Gutsy and I haven't had a chance to build another box to try it out and fix Gutsy. Also, I think my script didn't take into account the scenario where the RAID device isn't root.

Revision history for this message
Ken (ksemple) wrote :

Hi,

Plnt, I am currently re-syncing my RAID set and will then try again with my other drives disconnected. I tried removing my other disks at one stage, but can't recall whether I tried your suggestion in the BusyBox shell at the same time.

Davias, No I haven't tried 2.6.4. Wherever possible I try to use supported Ubuntu packages. This ensures that I have a simple support and upgrade path, and makes management of my machines considerably easier.

Peter; I agree, the problem isn't mdadm, it's the udev scripts (another reason I didn't pursue the mdadm version option). I am new to this, and only in the last week or so have been researching how udev works. How do you detect that mounting root has failed, and how do you hold off running mdadm until this point?

Thanks,
Ken

Revision history for this message
Peter Haight (peterh-sapros) wrote :

I've forgotten about how this works exactly, but if you take a look in:

/usr/share/initramfs-tools/scripts/local

If you look for the comment:
# We've given up, but we'll let the user fix matters if they can

The bit inside the while loop with the panic is the part that gets executed if there is a timeout trying to mount root. Here's what I put in my Feisty version right before that comment.

if [ ! -e "${ROOT}" ]; then
        # Try mdadm in degraded mode in case some drive has failed.
        /scripts/local-top/mdadm
fi

This doesn't work anymore because of the changes to Gutsy. You could just try putting 'mdadm --assemble --scan' there, but that probably won't work. Everything is a little tricky in these scripts because they run before root is mounted, so stuff doesn't always work as you would expect.

Also, you can't just modify these scripts. After you change them, you have to use 'mkinitramfs' to generate the image that contains these scripts that is used during the boot up. I'd put in instructions, but I've forgotten how to do it myself.
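On Ubuntu the easier route is the update-initramfs wrapper around mkinitramfs; running it once after editing the scripts regenerates the image for the currently running kernel:

 # picks up edits under /usr/share/initramfs-tools and /etc/initramfs-tools
 # and rebuilds the matching /boot/initrd.img
 sudo update-initramfs -u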

Revision history for this message
Ken (ksemple) wrote :

Thanks, I'll give it a go. This is still a background task so it may take me a few days.

I discovered that you can't just edit these scripts a couple of weeks ago, when I made some edits and they didn't take. It was Plnt's post which told me how to rebuild the image (sudo update-initramfs -u). This helped me make some progress with udev.

I'll let you know how I go.

Cheers,
Ken

Revision history for this message
Ken (ksemple) wrote :

Plnt, My RAID set finished re-syncing and I tried your suggestion. I removed all drives except one /dev/sda, the first of my RAID set. When I rebooted and was presented with the BusyBox shell I entered "/sbin/mdadm --assemble --scan" and rebooted. Still no luck!

I will persist with modifying the udev scripts as suggested by Peter.

Thanks,
Ken

Revision history for this message
Ken (ksemple) wrote :

Hi,

Thanks everybody for your help. I have now fixed this on my machine using code similar to that suggested by Peter Haight.

Edit "/usr/share/initramfs-tools/scripts/local" and find the following comment "# We've given up, but we'll let the user fix matters if they can".

Just before this comment add the following code:

# The following code was added to allow degraded RAID arrays to start
if [ ! -e "${ROOT}" ] || ! /lib/udev/vol_id "${ROOT}" >/dev/null 2>&1; then
        # Try mdadm and allow degraded arrays to start in case a drive has failed
        log_begin_msg "Attempting to start RAID arrays and allow degraded arrays"
        /sbin/mdadm --assemble --scan
        log_end_msg
fi

Peter's suggestion of just using [ ! -e "${ROOT}" ] as the condition test didn't work, so I used the condition test from the "if" block above this code and it worked fine.

To rebuild the boot image use "sudo update-initramfs -u" as suggested by Plnt. This script calls the "mkinitramfs" script mentioned by Peter and is easier to use, as you don't have to supply the image name and other options.

I have tested this a couple of times, with and without my other drives plugged in, without any problems. Just make sure you have a cron job set up to run "mdadm --monitor --oneshot" so that the system administrator gets an email when an array is running degraded.
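For instance, a crontab entry along these lines would do it (the schedule and file name are only an illustration; --scan makes mdadm check every array listed in mdadm.conf):

 # /etc/cron.d/mdadm-degraded-check: check all arrays hourly and mail root
 0 * * * *  root  /sbin/mdadm --monitor --scan --oneshot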

This has worked on my machine and I think this is a sound solution. Please let me know if it solves your problems also.

This bug has the title "Bug #120375 is not in Ubuntu"; does this mean that it is not considered to be an Ubuntu bug? I believe it is one, since it involves the way the startup scripts are configured (it is definitely not a bug in mdadm). How do we get this escalated to an Ubuntu bug so that it will be solved in future releases?

Good luck,
Ken

Revision history for this message
Johannes Mockenhaupt (mockenh-deactivatedaccount) wrote :

Ken,

Thanks for the patch. I've followed your instructions and tested booting with both physical discs and booting with one disc detached. The second test failed, the system would just stop like it did without the patch. Unfortunately I know next to nothing about udev, initramfs and friends so I can't do much by myself other than test.

Has anybody else tried Ken's solution?

Joe

Revision history for this message
Peter Haight (peterh-sapros) wrote :

Did you wait 3 minutes on the test with one disc detached? I think that by default, there is a three minute timeout before it gets to the place where Ken's patch is.

Revision history for this message
Ken (ksemple) wrote :

Johannes,

I agree with Peter; there is a 180-second delay in the code in an "if" block just before the suggested location for the patch. The comment above this delay says it is to ensure there is enough time for an external device to start, should the root file system be on an external device. I changed this 180-second delay to 30 seconds for my testing.

I think it would be reasonable to put the mdadm patch before this delay as well. The delay is OK if you are watching the system boot, as it will grab your attention and remind you that you may have a problem.
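Note that the delay comes from the ROOTDELAY variable in that script, so it can also be shortened per boot without editing the code by appending rootdelay= to the kernel line in /boot/grub/menu.lst (kernel version and root device below are copied from the earlier menu.lst example; adjust to your own system):

 kernel /boot/vmlinuz-2.6.20-16-generic root=/dev/md1 ro rootdelay=30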

Cheers,
Ken

Revision history for this message
Johannes Mockenhaupt (mockenh-deactivatedaccount) wrote :

I thought I had waited long enough - I had read that comment about the 180s delay - but I hadn't. I just tested it again and booting with a detached disc continues after 3 minutes. Even mail notification worked right away :-) After re-attaching the second drive I was dropped into the BusyBox shell. Just restarting "fixed" that and the machine started and is now resyncing. I think that may be another problem on my machine that's unrelated to this bug. Thanks Ken and Peter for the help!

Joe

Revision history for this message
dbendlin (diego-bendlin-hotmail) wrote :
Download full text (5.6 KiB)

Hello Guys,

Reading all of your posts helped me understand Linux a little bit more; thanks for sharing your knowledge.

As I stated a few posts above, I've tried mdadm on Debian and it works just fine, so I compared the scripts in /usr/share/initramfs-tools/scripts and found out they are not so different from the ones in Ubuntu, but there is still a difference I wanted to share with you.

When you set up mdadm on Debian, an mdadm file is created in /usr/share/initramfs-tools/scripts/local-top (I'll paste its content below). The scripts in this folder get called from the local script file (the one you guys suggest patching). And if you compare the Debian version of the local script with Ubuntu's version, you'll find it is pretty similar. So I guess this could be a better solution, since for example you won't have to wait 180 secs and you don't include "intrusive" code in the local script.

Here goes the debian version of the local script file
[code]
# Local filesystem mounting -*- shell-script -*-

# Parameter: Where to mount the filesystem
mountroot ()
{
        [ "$quiet" != "y" ] && log_begin_msg "Running /scripts/local-top"
        run_scripts /scripts/local-top
        [ "$quiet" != "y" ] && log_end_msg

        # If the root device hasn't shown up yet, give it a little while
        # to deal with removable devices
        if [ ! -e "${ROOT}" ]; then
                log_begin_msg "Waiting for root file system..."

                # Default delay is 180s
                if [ -z "${ROOTDELAY}" ]; then
                        slumber=180
                else
                        slumber=${ROOTDELAY}
                fi
                if [ -x /sbin/usplash_write ]; then
                        /sbin/usplash_write "TIMEOUT ${slumber}" || true
                fi

                slumber=$(( ${slumber} * 10 ))
                while [ ${slumber} -gt 0 ] && [ ! -e "${ROOT}" ]; do
                        /bin/sleep 0.1
                        slumber=$(( ${slumber} - 1 ))
                done

                if [ ${slumber} -gt 0 ]; then
                        log_end_msg 0
                else
                        log_end_msg 1 || true
                fi
                if [ -x /sbin/usplash_write ]; then
                        /sbin/usplash_write "TIMEOUT 15" || true
                fi
        fi

        # We've given up, but we'll let the user fix matters if they can
        while [ ! -e "${ROOT}" ]; do
                echo "  Check root= bootarg cat /proc/cmdline"
                echo "  or missing modules, devices: cat /proc/modules ls /dev"
                panic "ALERT! ${ROOT} does not exist. Dropping to a shell!"
        done

        # Get the root filesystem type if not set
        if [ -z "${ROOTFSTYPE}" ]; then
                eval $(fstype < ${ROOT})
        else
                FSTYPE=${ROOTFSTYPE}
        fi
        if [ "$FSTYPE" = "unknown" ] && [ -x /lib/udev/vol_id ]; then
                FSTYPE=$(/lib/udev/vol_id -t ${ROOT})
                [ -z "$FSTYPE" ] && FSTYPE="unknown"
        fi

        [ "$quiet" != "y" ] && log_begin_msg "Running /scripts/local-premount"
        run_scripts /scripts/local-premount
        [ "$quiet" != "y" ] && log_end_msg

        if [ ${readonly} = y ]; then
                roflag=-r
        else
                roflag=-w
        fi

        # FIXME This has no error checking
        modprobe -q ${FSTYPE}

        # FIXME This has no error checking
        # Mount root
        mount ${roflag} -t ${FSTYPE} ${ROOTFLAGS} ${ROOT} ${rootmnt}

        [ "$quiet" != "y" ] && log_begin_msg "Running /scripts/local-bottom"
        run_scripts /scripts/local-bottom
        [ "$quiet" != "y" ] && log_end_msg
}
[/code]

And here is the content of the mdadm script file located in the /usr/share/initramfs-tools/scripts/local-top folder
[code]
#!/bin/sh
#
# Copyright © 2006 Martin F. Krafft <madduck@debian...

Read more...

Revision history for this message
dbendlin (diego-bendlin-hotmail) wrote :

Sorry for my last long post,

The idea is to leave the local script file as it's shipped by default, and only add the mdadm script file to the /usr/share/initramfs-tools/scripts/local-top directory.

Finally, don't forget to rebuild the boot image by issuing "sudo update-initramfs -u" as suggested by Plnt.

Kind Regards,

Diego Bendlin

Revision history for this message
Ken (ksemple) wrote :

I have had a bit of a look into what Diego has found.

It took me a while, and I nearly gave up until I found this script: "/usr/share/initramfs-tools/hooks/mdadm". This script looks to be a modified version of the one Diego found on his Debian machine. It has the identical header and some similar code.

I would be hesitant to add the Debian script to the local-top folder. If you want a simple file-copy solution and don't want to wait the 180 seconds on the rare occasions you boot with a degraded array, create a script file with my suggested code in it and place it in the "/usr/share/initramfs-tools/scripts/init-top" folder, and it will be called from the "local" script before the time delay.

Ken

Revision history for this message
dbendlin (diego-bendlin-hotmail) wrote :

Guys,

I think this issue can have many workarounds, as has been proven here.
In my opinion, the idea behind sharing our experiences and efforts is to improve Ubuntu so that a future version will handle this automatically for the user. "Linux for human beings", remember?

Analyzing my Debian machines, I have noticed that not only mdadm scripts exist in the local-top folder, but others like lvm as well, so I guess the Debian installer handles this based on the configuration the user chooses at install time. Following this direction, I think a better way to deal with this major issue would be to improve Ubuntu's installer so it can copy a template file to the local-top folder based on the user's input at installation time.

Finally, I hope the Ubuntu developers get to read this topic in order to address this issue for the upcoming 8.04 version of Ubuntu, which is near =)

Kind Regards,

Diego Bendlin

Revision history for this message
Tomas (tvinar-gmail) wrote :

I have run into the same problem, where my disk upgrade path included booting from a temporarily degraded array.

I have tried updating /etc/mdadm/mdadm.conf with the new file system UUIDs
(using mdadm --detail --scan and replacing the corresponding line in /etc/mdadm/mdadm.conf)
and after that running update-initramfs.

Now the system seems to be booting without any problems.

I have also used the -e 0.90 parameter in mdadm when assembling the degraded array (to create an older version of the superblock that is recognized by the kernel), though I am not sure whether this had anything to do with the outcome.
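Roughly, the mdadm.conf refresh described above looks like this (back up the file first; the redirect appends new ARRAY lines, so delete the stale ones by hand afterwards):

 sudo cp /etc/mdadm/mdadm.conf /etc/mdadm/mdadm.conf.orig
 sudo sh -c 'mdadm --detail --scan >> /etc/mdadm/mdadm.conf'
 # edit /etc/mdadm/mdadm.conf, remove the old ARRAY lines, then:
 sudo update-initramfs -u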

Jens (jens.timmerman)
Changed in mdadm:
status: New → Confirmed
Changed in initramfs-tools:
assignee: nobody → kirkland
status: New → Confirmed
Changed in mdadm:
assignee: nobody → kirkland
importance: Undecided → Medium
milestone: none → ubuntu-8.10
Changed in mdadm:
status: Confirmed → Triaged
Kees Cook (kees)
Changed in initramfs-tools:
status: Confirmed → In Progress
Changed in mdadm:
status: Triaged → In Progress
Changed in initramfs-tools:
assignee: nobody → kirkland
status: New → In Progress
assignee: kirkland → nobody
status: In Progress → Confirmed
importance: Undecided → Medium
milestone: none → ubuntu-8.10
Changed in initramfs-tools:
status: In Progress → Fix Released
Changed in mdadm:
status: In Progress → Fix Released
Ace Suares (acesuares)
description: updated
(69 comments hidden)
Revision history for this message
no!chance (ralf-fehlau) wrote :

Full ack to Miguel. Why is somebody using RAID1 on his system? Does he want to have trouble if a disk fails, or does he want a running system and to be informed about a hardware failure? Ubuntu RAID is useless! The "conservative" mode is useless! If I had a 4-disk RAID1 system or a RAID5 with a spare disk and ONE fails, is it useful to stop the boot process???

I have the same issue on my new system. First, I had one disk and decided to upgrade to software RAID. With the second HD, I created a degraded RAID1, copied the contents of the first disk to the second, and wanted to add the first disk to the RAID. Ubuntu drops me into the shell. :-( And in spite of booting with a live CD and adding the first disk to the RAID, the system refused to boot.

Because it is a new system without any data on it, I will do a new installation with debian or suse. For my home server, I will see.

Revision history for this message
no!chance (ralf-fehlau) wrote :

Another thing to mention: which systems are using RAID? .... Right! ... Servers! Such systems are usually maintained remotely and rebooted through an ssh connection. This message and the question below are very, very useful. :-(

> If you abort now, you will be provided with a recovery shell.
> Do you wish to boot the degraded RAID? [y/N]:

The last you will see from this server is ....

"rebooting now"

Revision history for this message
Stas Sușcov (sushkov) wrote :

Can somebody explain: is this bug fixed in the Hardy packages?

A lot of comments, and not a single clear report!!!

If it is not fixed... Is there a patch for the local script in initramfs-tools and mdadm, or is there any rebuilt package with the fixes?

Currently in Hardy i got:
~$ apt-cache policy mdadm
mdadm:
  Installed: 2.6.3+200709292116+4450e59-3ubuntu3
  Candidate: 2.6.3+200709292116+4450e59-3ubuntu3
  Version table:
 *** 2.6.3+200709292116+4450e59-3ubuntu3 0
        500 http://ro.archive.ubuntu.com hardy/main Packages
        100 /var/lib/dpkg/status
stas@baikonur:~$ apt-cache policy initramfs-tools
initramfs-tools:
  Installed: 0.85eubuntu39.2
  Candidate: 0.85eubuntu39.2
  Version table:
 *** 0.85eubuntu39.2 0
        500 http://ro.archive.ubuntu.com hardy-updates/main Packages
        100 /var/lib/dpkg/status
     0.85eubuntu36 0
        500 http://ro.archive.ubuntu.com hardy/main Packages
~$ apt-cache policy mdadm
mdadm:
  Installed: 2.6.3+200709292116+4450e59-3ubuntu3
  Candidate: 2.6.3+200709292116+4450e59-3ubuntu3
  Version table:
 *** 2.6.3+200709292116+4450e59-3ubuntu3 0
        500 http://ro.archive.ubuntu.com hardy/main Packages
        100 /var/lib/dpkg/status

Thank you in advance!

Revision history for this message
Dustin Kirkland  (kirkland) wrote :

This is fixed in Intrepid, not in Hardy.

:-Dustin

Revision history for this message
Ace Suares (acesuares) wrote :

On Thursday 09 October 2008, Stanislav Sushkov wrote:
> Can somebody explain, is this bug fixed in hardy packages?
>
> A lot of comments, and not a single clear report!!!
>
> If it is not fixed... Is there a patch for local script in initramfs-
> tools and mdadm or is there any rebuilt package with the fixes?
>
> Currently in Hardy i got:
> ~$ apt-cache policy mdadm
> mdadm:
> Installed: 2.6.3+200709292116+4450e59-3ubuntu3
> Candidate: 2.6.3+200709292116+4450e59-3ubuntu3
> Version table:
> *** 2.6.3+200709292116+4450e59-3ubuntu3 0
> 500 http://ro.archive.ubuntu.com hardy/main Packages
> 100 /var/lib/dpkg/status
> stas@baikonur:~$ apt-cache policy initramfs-tools
> initramfs-tools:
> Installed: 0.85eubuntu39.2
> Candidate: 0.85eubuntu39.2
> Version table:
> *** 0.85eubuntu39.2 0
> 500 http://ro.archive.ubuntu.com hardy-updates/main Packages
> 100 /var/lib/dpkg/status
> 0.85eubuntu36 0
> 500 http://ro.archive.ubuntu.com hardy/main Packages
> ~$ apt-cache policy mdadm
> mdadm:
> Installed: 2.6.3+200709292116+4450e59-3ubuntu3
> Candidate: 2.6.3+200709292116+4450e59-3ubuntu3
> Version table:
> *** 2.6.3+200709292116+4450e59-3ubuntu3 0
> 500 http://ro.archive.ubuntu.com hardy/main Packages
> 100 /var/lib/dpkg/status
>
> Thank you in advance!

It's not fixed in hardy LTS, which is very strange...

ace

Revision history for this message
Stas Sușcov (sushkov) wrote :

Can you point to a wiki page or a comment in this thread where I'll find
a solution for hardy?

Or I should install Intrepid packages?

Thank you.

On Thu, 2008-10-09 at 15:14 +0000, Dustin Kirkland wrote:
> This is fixed in Intrepid, not in Hardy.
>
> :-Dustin
>
--
() Campania Panglicii în ASCII
/\ http://stas.nerd.ro/ascii/

Revision history for this message
RpR (tom-lecluse) wrote :

Dustin, I love the work that you've put into this, but I have to stand by Ace Suares: for an LTS version it should be fixed.
It prevents me from using Ubuntu for my servers which use software RAID. The ones with hardware RAID could use Ubuntu.

But I'm sticking with Debian for the moment because of this.
If Hardy weren't an LTS version I would understand if you just said upgrade to ...

Revision history for this message
Stas Sușcov (sushkov) wrote :

Just post a file with patch which works for hardy, and that's all folks :)

Revision history for this message
Ace Suares (acesuares) wrote :

On Thursday 09 October 2008, Stanislav Sushkov wrote:
> Just post a file with patch which works for hardy, and that's all folks
>
> :)

No it's not.

There is a procedure for that, so it will be updated automatically. I
followed that procedure, but at some point in it the powers of 'normal'
people are insufficient. We have to find an overlord who will sponsor the
process and move it forward. Even the developer who made the patch for
Ibex cannot move this forward on his own.

Where are the Power Puff Girls when you need them ?

Revision history for this message
Stas Sușcov (sushkov) wrote :

You mean this procedure?

It didn't work for me.
I mean I patched my "local" manually, but it broke my init image
in /boot after update-initramfs...

Maybe the patch really does work, but no one in this thread has reported it
as a solution...

On Thu, 2008-10-09 at 20:37 +0000, Ace Suares wrote:
> On Thursday 09 October 2008, Stanislav Sushkov wrote:
> > Just post a file with patch which works for hardy, and that's all folks
> >
> > :)
>
> No it's not.
>
> There is a procedure for that, so it will be updated automatically. That
> procedure I followed but at some point in the procedure, powers
> of 'normal' people are insufficient. We have to find an overlord who will
> sponsor the process and move it forward. Even the developer who made the
> patch for Ibex, can not move this forward on his own.
>
> Where are the Power Puff Girls when you need them ?
>
--
() Campania Panglicii în ASCII
/\ http://stas.nerd.ro/ascii/

Revision history for this message
Dustin Kirkland  (kirkland) wrote :

Fixing Hardy would mean, at the very least:
 1) porting the patches for:
   * mdadm
   * initramfs-tools
   * grub
   * grub-installer
 2) Rebuilding the installation ISO's.
 3) Obsessively regression testing the new install media.

After Intrepid releases on October 30, 2008, I will spend a few cycles
considering the port to Hardy 8.04.2. No guarantees, but rest assured
that I am a Canonical/Ubuntu Server developer, who runs Hardy+RAID on
my own systems, and have plenty of motivation to see this fixed.

If anyone wants to volunteer to do (1), that would help move things
along. And I certainly hope at the very least some of you are willing
to help with (3).

:-Dustin

p.s. Please understand that your sarcasm is completely unappreciated.

Revision history for this message
Stas Sușcov (sushkov) wrote :

Dustin,
what about those packages for ibex. If I update my hardy with those, so
I risk serious troubles?

Or it is not possible cause of difference between kernels?

Someone did this before?

On Thu, 2008-10-09 at 21:21 +0000, Dustin Kirkland wrote:
> Dustin
--
() Campania Panglicii în ASCII
/\ http://stas.nerd.ro/ascii/

Revision history for this message
Dustin Kirkland  (kirkland) wrote :

Stanislav-

First, the Intrepid packages won't work, due to toolchain (glibc,
klibc) differences. I just tested out of curiosity in a virtual
machine.

Second, under no circumstances would I recommend this as an acceptable
thing to do. If you are beholden to running Hardy, I presume that's
because Hardy is supported as an LTS. Once you start changing the
Hardy LTS packages and replacing them with Intrepid packages, you're
no longer in a supported configuration. Especially when we're talking
about things that are so fundamental to booting your system.

If you would be willing to upgrade these four key packages to
Intrepid, I'd say you would be much better served upgrading to
Intrepid across the board.

:-Dustin

Revision history for this message
Ace Suares (acesuares) wrote :

Dustin,

I am glad you may be spending some time on this bug.

You mention rebuilding ISOs. But why can't it just be an upgrade to the existing installations?
I mean, on an existing system, all we need to do is upgrade?

Also, I am not being sarcastic at all when I say that I cannot understand why there needs to be so much regression testing for the patch that removes the bug, given that introducing the bug was possible in the first place. The bug is not present in Debian, and was not present in Dapper.

Anyway, I am going to install the workaround on all my affected machines because I cannot wait that long. (But then I will run into some trouble when an update finally comes around...)

And I am not going to advise using Ubuntu for servers that use software RAID anymore. I am really disappointed in the way this is going. I am happy it will be fixed in Ibex though. Just keep smiling...

I am also unsubscribing from this bug. I feel that I am becoming unconstructive.

Revision history for this message
Ross Becker (ross-becker) wrote :

Dustin,
I've been doing sysadmin work for 15 years. I chose to try out Ubuntu
for a home RAID server project, and loaded up Hardy as it was an LTS
edition. In my first day working with Ubuntu, I ran into this bug, a bug
where the version of mdadm on Hardy (pretty well out of date) was unable
to resume a RAID reshape operation, and a bug where the ext2resize tools
incorrectly detect the RAID stride. All of these bugs are in BASIC
functionality of the storage management tools and should never have made
it through any sort of QA.

I reported all of them as bugs, and this is the ONLY one which has even
received a developer response after 2 months.

For a Long Term Support edition, that's shameful. Not only that, but for
a bug which in a COMMON situation can prevent boot, your response as to
whether a fix will be backported is "I'll spend a few cycles considering
backporting it".

Your lack of understanding for someone's sarcasm is completely unjustified,
and the level of developer/bugfix support I'm seeing for Ubuntu is
pathetic. With this level of support, using Ubuntu for any sort of
corporate server application would be a really poor decision.

Revision history for this message
brunus (reg-paolobrunello) wrote :

Hello,
I second RpR's post word by word: it's sincerely hard to accept that such a serious bug, not present until 7.04, is still open 14 months and 2 releases later, one of them being an LTS. And this is even more true for the - arguably so called - server edition: it's like putting a TIR truck with no double tyres on the market, and doing it twice.

Dustin,
could you please explain how you run Hardy+RAID: it is still not clear to me after reading the whole thread, sorry.

Thanks,

brunus

Revision history for this message
Steve Langasek (vorlon) wrote :

Unsubscribing ubuntu-sru. Please do not subscribe the SRU team to bugs that don't actually include proposed fixes to previous releases. a) this is not the documented procedure for SRU fixes, b) the SRU team has other things to do that actually benefit Ubuntu users, instead of following a "bug report" that consists of lambasting Ubuntu for a bug that the relevant developers have already agreed should receive attention.

With regard to the last, I've accepted the nomination for hardy based on Dustin's statement that he'll work on this for 8.04.2.

Ross, as for the other bugs you mentioned: the ext2resize package is not part of Ubuntu main, which means it's not part of the set of software supported by Canonical. It's a third-party tool that ext3 upstream has repeatedly recommended against using. The package from Debian is included in the universe repository in the hope that it's useful to someone, but there is no one at all in Ubuntu tending that package - there's no reason to think it was subjected to any QA for it to fail. If you believe this problem makes the package unusable, then I'm happy to escalate bug #256669 and have the ext2resize package removed from subsequent Ubuntu releases.

Changed in initramfs-tools:
status: New → Confirmed
Changed in mdadm:
importance: Undecided → Medium
status: New → Confirmed
Changed in initramfs-tools:
importance: Undecided → Medium
Revision history for this message
agent 8131 (agent-8131) wrote :
Download full text (4.0 KiB)

I think it's time for some tough love. No one would be taking the time to comment on this if they didn't want to see Ubuntu Server be a better product. I personally feel this is a significant issue because it demonstrates Canonical's interest in supporting an LTS release and seriousness about becoming a presence in the server market. I know it can be difficult when people are lambasting a product you've put a lot of time into, believe me, I got more than my fair share of that this week. However, sometimes you have to step back and realize that your product quality has been lower than many people expect and you have to either step up or risk losing more customers. Make no mistake, this bug has lost a lot of sysadmins, some of whom had to fight hard to get Ubuntu onto servers in their workplaces in the first place. I was one of them, and I know a few more personally. I pitched that it made sense to have Ubuntu on the Server because of the benefits of the LTS release, including the longer support time and the large community that contributes to Ubuntu, therefore leading to more bugs being found and resolved. However, I doubt I will be able to propose Ubuntu again until version 10.04. If this bug were to be resolved in 8.04.2 I might at least start pitching it next year, barring any other bugs of this level of severity.

To respond to Steve Langasek, while I understand that a lot of these emails are not terribly useful, this bug is exactly what the SRU team is supposed to be addressing. There have been many proposed fixes in this thread and 8 patches uploaded. Do any of them work correctly? Well that is the question, but it's inaccurate to state that this bug does not contain proposed fixes. Furthermore this fits the SRU criteria of being a high impact bug representing a severe regression from earlier versions and one which also may lead to a loss of user data. When the average user is confronted with the initramfs shell during a drive failure I suspect they have the potential to do serious damage to their file systems in an attempt to fix the problem.

I don't feel it's possible for me to overstate the severity of this bug and how badly sysadmins are going to react when they encounter it or read about it. It is certainly not the kind of bug one can dismiss in an LTS release if LTS is to say anything about quality, and hence suggestions to upgrade to Intrepid, while acceptable to a home user building a server, are not going to be acceptable in the workplace. If this is a market segment that Ubuntu Server caters to, then this issue needs to be addressed. If on the other hand Ubuntu Server is meant merely for enthusiasts with their home file servers, then the solution should be to make sure that goal is clearly articulated.

To keep us focused on the work at hand and to avail myself of the opportunity that having this number of people working to fix this bug represents, I'll say that I've tried a number of solutions on this page but none have been satisfactory. I tried changing the udev rule as suggested above (see Plnt 2007-12-25) but got the same results that have been reported: I can get the system to boot any time the ...

Read more...

Revision history for this message
Dustin Kirkland  (kirkland) wrote :

I'm un-subscribing from this bug as well.

Anyone who believes that berating the developer who has (finally)
fixed this bug in the current release of Ubuntu, and offered to fix it
in the LTS release, is constructive desperately needs to re-read the
Ubuntu Code of Conduct.
 * http://www.ubuntu.com/community/conduct

If you feel that this issue is urgently blocking your organization
from adopting Ubuntu LTS at an enterprise level, I remind you that
contracting commercial support from Canonical or one of the numerous
Ubuntu partners around the globe is always an option.
 * http://www.ubuntu.com/support/paid
 * http://webapps.ubuntu.com/partners/

Dustin

Changed in initramfs-tools:
assignee: kirkland → nobody
Changed in mdadm:
assignee: kirkland → nobody
Revision history for this message
Stas Sușcov (sushkov) wrote :

Dustin, Steve, thank you for your support.
I'll be waiting for the updates on my Hardy, and praying not to get into a
situation where this bug shows up ugly on my servers.

I remain subscribed, as I believe you guys will get the job done soon.

Good luck!

On Fri, 2008-10-10 at 06:01 +0000, Dustin Kirkland wrote:
> I'm un-subscribing from this bug as well.
>
> Anyone who believes that berating the developer who has (finally)
> fixed this bug in the current release of Ubuntu, and offered to fix it
> in the LTS release, is constructive desperately needs to re-read the
> Ubuntu Code of Conduct.
> * http://www.ubuntu.com/community/conduct
>
> If you feel that this issue is urgently blocking your organization
> from adopting Ubuntu LTS at an enterprise level, I remind you that
> contracting commercial support from Canonical or one of the numerous
> Ubuntu partners around the globe is always an option.
> * http://www.ubuntu.com/support/paid
> * http://webapps.ubuntu.com/partners/
>
> Dustin
>
> ** Changed in: initramfs-tools (Ubuntu)
> Assignee: Dustin Kirkland (kirkland) => (unassigned)
>
> ** Changed in: mdadm (Ubuntu)
> Assignee: Dustin Kirkland (kirkland) => (unassigned)
>
--
() Campania Panglicii în ASCII
/\ http://stas.nerd.ro/ascii/

Revision history for this message
agent 8131 (agent-8131) wrote :

No one should be taking any of this personally or feel berated. People should feel free to be honest about their experiences without fear that doing so will drive away people that could be of help in resolving the issues. I've been keeping my mouth shut on this bug for a couple of months and I apologize for being long winded when I finally took the time to post my experiences.

Dustin Kirkland, you did not offer to fix this problem in the LTS release. You stated that you would consider it. And that's great, but without any commitment to fix the problem in 8.04 it would be nice to actually be able to offer up a simple solution to the people that come across this page when trying to resolve this bug.

I will not unsubscribe from this bug because I know that people will be bit by it again, come to me for help, and I wish to have a good solution for them when they do. I re-read the Code of Conduct and note that being collaborative seems to be at odds with disengaging when one finds the process unpleasant. I certainly respect the right of people to do so but I feel it's a loss to actually making progress on this bug. Granted, I've only been subscribed for 2 months; if I had been subscribed for a year I acknowledge I might feel differently.

Since people seem to feel that this hasn't been a friendly exchange I will be happy to buy a drink of choice to:
1) Anyone that comes up with a good solution to this bug for Ubuntu 8.04 and 8.04.1.
2) Anyone who works to get this bug fixed in 8.04.2.
3) Dustin Kirkland and anyone else who makes sure this bug never appears in 8.10.

Revision history for this message
Stas Sușcov (sushkov) wrote :

Subscribing to what agent 8131 said.
Now there are two drinks to get for the guy who helps us!

On Fri, 2008-10-10 at 07:13 +0000, agent 8131 wrote:
> agent 8131
--
() Campania Panglicii în ASCII
/\ http://stas.nerd.ro/ascii/

Revision history for this message
Dustin Kirkland  (kirkland) wrote :

Now that the Intrepid development cycle has wound down, I was finally able to circle back to this.

For those of you interested in seeing this backported to Hardy, see Bug #290885.

I have test packages available in my PPA--standard PPA disclaimers apply. Please refer to Bug #290885 for test instructions.

:-Dustin

Revision history for this message
Nick Barcet (nijaba) wrote :

Has anyone tested the packages in Dustin's PPA? (Are the beers coming?)

It would be really nice to know before we place those in -proposed to follow the proper SRU process for hardy.

Changed in initramfs-tools:
assignee: nobody → kirkland
status: Confirmed → In Progress
Changed in mdadm:
assignee: nobody → kirkland
milestone: none → ubuntu-8.04.2
status: Confirmed → In Progress
Changed in initramfs-tools:
milestone: none → ubuntu-8.04.2
Changed in grub:
assignee: nobody → kirkland
importance: Undecided → Medium
milestone: none → ubuntu-8.04.2
status: New → In Progress
status: New → Fix Released
Revision history for this message
Dustin Kirkland  (kirkland) wrote :

Thanks, Nick.

Please respond with any test results in Bug #290885.

Bug #120375 is hereby reserved for wailing, moaning, fussing, cursing, complaining, lamenting, murmuring, regretting, repining, bewailing, deploring, weeping, mourning, protesting, charging, accusing, disapproving, grumbling, fretting, whining, peeving, quarreling, resenting, dissenting, discontenting, malcontenting, bellyaching, and non-constructive criticisms :-)

:-Dustin

Changed in grub:
status: In Progress → Fix Released
Changed in initramfs-tools:
status: In Progress → Fix Released
Changed in mdadm:
status: In Progress → Fix Released
Revision history for this message
Tapani Rantakokko (trantako) wrote :

It seems that the issue is now fixed in Intrepid, and also backported to Hardy 8.04.2, which was released a few days ago. However, it is unclear to me whether I need to reinstall from the 8.04.2 distribution media, or whether I can fix an existing Hardy installation via software updates.

Revision history for this message
Dustin Kirkland  (kirkland) wrote :

You should not need to reinstall.

To solve this, you need to:
 a) live upgrade all packages (specifically, you need to upgrade grub, mdadm, and initramfs-tools)
 b) install grub to the RAID (for instance, if /dev/md0 provides your /boot directory, you can do "grub-install /dev/md0"), as in the sketch below
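
For example (a minimal sketch; /dev/md0 as the array holding /boot and the use of apt-get are assumptions, adjust them to your own layout):

sudo apt-get update
sudo apt-get install grub mdadm initramfs-tools   # pulls in the fixed versions of the three packages
sudo grub-install /dev/md0                        # re-install grub to the array that provides /boot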

Cheers,
:-Dustin

Revision history for this message
Tapani Rantakokko (trantako) wrote :

Dustin, thank you for your quick answer and tips.

It took me a while to test it, as I have an encrypted RAID 1 array with LVM, and things are not that straightforward with that setup.

So far I have been using one of the tricks described earlier in this thread (i.e. edit /etc/udev/rules.d/85-mdadm.rules to change "--no-degraded" to "-R", then run "sudo update-initramfs -u -k all"). It allows me to boot with only one drive, but has the annoying side effect that one of the partitions often starts in degraded mode, even when both drives are in fact present and working.

I wanted to get rid of that problem, so I did this (roughly as sketched below):
- revert 85-mdadm.rules to how it used to be, i.e. --no-degraded
- sudo update-initramfs -u -k all
- cat /proc/mdstat and check that all drives are online and in sync
- upgrade all packages
- re-install grub to both drives
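
Roughly the same sequence as commands (a sketch; the drive names /dev/sda and /dev/sdb are assumptions, use whatever disks make up your array):

# 1) edit /etc/udev/rules.d/85-mdadm.rules and change "-R" back to "--no-degraded"
sudo nano /etc/udev/rules.d/85-mdadm.rules
# 2) rebuild the initramfs so early boot uses the reverted rule
sudo update-initramfs -u -k all
# 3) confirm every array is online and in sync before going further
cat /proc/mdstat
# 4) upgrade to the fixed packages, then put grub back on both disks
sudo apt-get update && sudo apt-get upgrade
sudo grub-install /dev/sda
sudo grub-install /dev/sdb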

These are my test results:
1. Restart computer with both disks
-> everything works OK

2. Restart computer with only one disk
-> Keeps asking "Enter password to unlock the disk (md1_crypt):" even though I write the correct password

3. Restart computer again with both disks
-> everything works OK

So at first it seemed that the fix does not work at all, as Ubuntu starts only when both disks are present. Then I ran some more tests:

4. Restart computer with only one disk
-> Keeps asking "Enter password to unlock the disk (md1_crypt):" even though I write the correct password
-> Now press CTRL+ALT+F1, and see these messages:
Starting up ...
Loading, please wait...
Setting up cryptographic volume md1_crypt (based on /dev/md1)
cryptsetup: cryptsetup failed, bad password or options?
cryptsetup: cryptsetup failed, bad password or options?
-> After waiting some minutes, I got dropped into the busybox
-> Something seems to be going wrong with encryption

5. Restart computer with only one disk, without "quiet splash" boot parameters in /boot/grub/menu.lst
-> Got these messages:
Command failed: Not a block device
cryptsetup: cryptsetup failed, bad password or options?
... other stuff ...
Command failed: Not a block device
cryptsetup: cryptsetup failed, bad password or options?
Command failed: Not a block device
cryptsetup: cryptsetup failed, bad password or options?
cryptsetup: maximum number of tries exceeded
Done.
Begin: Waiting for root file system... ...
-> After waiting a few minutes, I get the question of whether I want to start the system with a degraded array. However, it does not matter what I answer, as the system cannot start: the encryption has already given up trying. I don't know what it was trying to read as a password, because I did not type anything.

6. Restart computer with only one disk, with "quiet splash bootdegraded=true" boot parameters in /boot/grub/menu.lst
-> Keeps asking "Enter password to unlock the disk (md1_crypt):" even though I write the correct password
-> Now press CTRL+ALT+F1, and see these messages:
Starting up ...
Loading, please wait...
Setting up cryptographic volume md1_crypt (based on /dev/md1)
cryptsetup: cryptsetup failed, bad password or options?

Summary:
The fix does not seem to work if you have encrypted your RAID disks. To be more specific: after a long wait it does a...

Read more...

Revision history for this message
Tapani Rantakokko (trantako) wrote :

Ok, I'm answering myself: there is a workaround for getting it to work with LUKS encryption. You can run "sudo dpkg-reconfigure mdadm" and enable automatic startup with a degraded RAID array if you want, or watch the screen and be quick enough to answer "Yes" when asked whether to start degraded. Nevertheless, you need to wait again until you're dropped to BusyBox. Then do this:

# to enter the passphrase. md1 and the md1_crypt are the same values
# you had to put in /target/etc/crypttab at the end of the install
cryptsetup luksOpen /dev/md1 md1_crypt

# (type your LUKS password, as requested)

# continue to boot!
<hit CTRL+D>

I found the instructions from here: http://ubuntuforums.org/archive/index.php/t-524513.html

Now, if only someone could give a hint on how to make this automatic, so that there would be no need to type anything. Waiting a few minutes is OK, though.

Nevertheless, I'm pretty happy now that I can keep the "--no-degraded" parameter in 85-mdadm.rules, yet still get the system up in case a disk fails. In the rare case of an actual disk failure, typing a one-liner can be tolerated. Thank you to everyone who has worked on this issue and helped get it solved in Hardy.

Revision history for this message
Dustin Kirkland  (kirkland) wrote :

Tapani-

I very much appreciate the detail with which you have constructed your report, as well as your follow-up, which provides a hint as to how one might fix this issue. Thank you! But you are describing a different issue, which is deeper and more involved.

Please, please, please open a new bug report against mdadm ;-)

:-Dustin

Revision history for this message
Tapani Rantakokko (trantako) wrote :

Degraded RAID 1 array with encryption, Bug #324997.

Revision history for this message
kilroy (channelsconf) wrote :

Is this bug still present? I've installed a fresh 8.04.2 amd64 in VirtualBox 2.1.2 with three SATA hard drives. When I switch off a drive, the kernel loads, but at the end the RAID arrays are stopped and the system is unusable.

root@ubuntu:~# dpkg-reconfigure mdadm
"Do you want to boot your system if your RAID becomes degraded?" -> "YES"

root@ubuntu:~# uname -a
Linux ubuntu 2.6.24-23-server #1 SMP Mon Jan 26 01:36:05 UTC 2009 x86_64 GNU/Linux

root@ubuntu:~# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=8.04
DISTRIB_CODENAME=hardy
DISTRIB_DESCRIPTION="Ubuntu 8.04.2"

root@ubuntu:~# dpkg -l mdadm grub initramfs-tools
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Installed/Config-f/Unpacked/Failed-cfg/Half-inst/t-aWait/T-pend
|/ Err?=(none)/Hold/Reinst-required/X=both-problems (Status,Err: uppercase=bad)
||/ Name Version Description
+++-===========================-===========================-======================================================================
ii grub 0.97-29ubuntu21.1 GRand Unified Bootloader
ii initramfs-tools 0.85eubuntu39.3 tools for generating an initramfs
ii mdadm 2.6.3+200709292116+4450e59- tool to administer Linux MD arrays (software RAID)
root@ubuntu:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sda5[0] sdc5[2] sdb5[1]
      5124480 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md2 : active raid5 sda2[0] sdc2[2] sdb2[1]
      995840 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md0 : active raid1 sda1[0] sdc1[2] sdb1[1]
      80192 blocks [3/3] [UUU]

root@ubuntu:~# mount | grep ^/
/dev/md1 on / type ext3 (rw,relatime,errors=remount-ro)
/sys on /sys type sysfs (rw,noexec,nosuid,nodev)
/dev/md0 on /boot type ext2 (rw,relatime)

This would be a real show stopper for the 8.04 LTS...

Revision history for this message
Tobias McNulty (tmcnulty1982) wrote :

Yeah, isn't 8.04 an LTS edition?

My 8.04 Desktop machine suffers from this problem. I think I got it fixed by adding the command to assemble the arrays in degraded mode to the initramfs 'local' script (roughly as sketched below), but finding out what needed to be done was a real PITA and I'm sure others will continue to hit this issue.
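
For reference, a minimal sketch of that kind of hook (the file name, the PREREQ value and the use of "mdadm --assemble --scan --run" are assumptions, not the exact change described above):

#!/bin/sh
# /etc/initramfs-tools/scripts/local-top/force-degraded-raid (hypothetical name)
set -eu

PREREQ="mdadm"

prereqs()
{
        echo "$PREREQ"
}

case ${1:-} in
  prereqs)
    prereqs
    exit 0
    ;;
esac

# Try to start any arrays that did not come up with all of their members,
# so a degraded RAID1 root can still be mounted.
mdadm --assemble --scan --run || true

exit 0

Afterwards the initramfs has to be regenerated ("sudo update-initramfs -u") so the hook is actually included.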

Can't we get a fix in 8.04?

Revision history for this message
Dustin Kirkland  (kirkland) wrote :

This bug has been fixed, and backported to hardy-updates.

:-Dustin

Changed in initramfs-tools:
status: Confirmed → Fix Released
Changed in grub (Ubuntu Hardy):
assignee: Dustin Kirkland (kirkland) → nobody
Changed in initramfs-tools (Ubuntu Hardy):
assignee: Dustin Kirkland (kirkland) → nobody
Changed in mdadm (Ubuntu Hardy):
assignee: Dustin Kirkland (kirkland) → nobody
Revision history for this message
Hosed (liveonaware) wrote :

Hi, I have Ubuntu 9.04 with two 36 GB Seagate Cheetah drives in a RAID 1 (mirroring) setup. When I boot Ubuntu 9.04 with one of the hard disks unplugged, it fails to boot and drops me to an initramfs shell. What's the deal with that? Does it mean that RAID 1 is NOT WORKING with Ubuntu 9.04?

Any help is greatly appreciated. Thanks.

Revision history for this message
Emanuele Olivetti (emanuele-relativita) wrote :

Hosed wrote:
> Hi, I have Ubuntu 9.04. I have Two (2) 36gb Seagate Cheetah with Raid 1
> - Mirroring Setup. When I boot to Ubuntu 9.04 with One (1) of the hard
> disks unplugged, it fails to boot, it drops me to a shell with
> initramfs. What's the deal with that? Does it mean that Raid1 is NOT
> WORKING with Ubuntu 9.04?
>
> Any help is greatly appreciated. Thanks.
>

Hi,

I had a similar issue on Friday: a disk failed on a RAID 1 Ubuntu 9.10 system. After shutdown I removed the failed disk and added a new one. While booting I got an initramfs shell, as in your case.

In my case, the message on screen said that the system had asked whether to boot with a degraded array (y/N), and that I had not answered the question quickly enough, so after the timeout it dropped me to the initramfs shell. To be honest, I never saw that question, because the splash screen covered it.

So I just rebooted, pressed 'y' a few times once the splash screen appeared, and then the degraded boot started as expected. After that I had to manually partition the new disk (I have two different RAID 1 arrays, one for /boot and one for /) and add each partition to its array:

sfdisk -d /dev/sda | sfdisk /dev/sdb # partition sdb as sda
mdadm /dev/md0 -a /dev/sdb1 # add sdb1 to the raid1 system md0 (/boot)
mdadm /dev/md1 -a /dev/sdb2 # add sdb2 to the raid1 system md1 (/)

WARNING: this worked for my setup, so don't take these as general instructions!
If you have a single mirrored array, then you may only need to add sdb1.

After this the new disk sdb started syncing as expected.
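
A quick way to follow the rebuild (not from the original comment, just a common check):

watch -n 5 cat /proc/mdstat   # the resync progress shows up as a percentage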

HTH,

Emanuele

Revision history for this message
Hosed (liveonaware) wrote :

Thank you Emanuele. I haven't actually tried what you're suggesting yet, but I will definitely do it if I have time, or if my hard disk crashes for real. I have a question: I subscribed to this bug, but when I look at my ACCOUNT -> BUGS -> LIST SUBSCRIBED BUGS, it's not there, and I really want to be updated on this! Is this a bug? Help! :)

Revision history for this message
Francis Mak (francis-franfran) wrote :

My server was on 8.04 LTS, with a RAID 1 setup of two mirrored hard disks.
I had tested it: if I unplugged one of the disks, the RAID 1 still worked fine.

Sadly, after upgrading my system to 10.04.1 LTS, if I unplug one of the disks and boot, the RAID 1 becomes inactive.

#mdadm --detail /dev/md0
mdadm: md device /dev/md0 does not appear to be active.

# cat /proc/mdstat
md0 : inactive sdb1[0](S)

I have removed some of the messages to save space here. In the above case, I unplugged sdc and the RAID 1 is not working anymore; it treated sdb as a spare disk.

If I plug sdc back in, the RAID 1 comes back to normal. It doesn't make any sense.
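
For an array stuck in that inactive state, one common manual recovery looks like this (a hedged sketch; /dev/md0 and /dev/sdb1 are taken from the output above, and whether it applies to this exact setup is an assumption):

# stop the half-assembled array, then re-assemble it and allow it to run degraded
mdadm --stop /dev/md0
mdadm --assemble /dev/md0 --run /dev/sdb1
cat /proc/mdstat   # md0 should now be active with one of its two members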

Honestly, I am not a real expert in system administration. I set up the software RAID simply to protect my data in case of a hard disk failure. Now I need to pray for both disks to keep working in order to keep this RAID 1 alive... really frustrating.

May I know what I need to do to fix this problem?

Thank you very much!

Revision history for this message
François Marier (fmarier) wrote :

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1003309 and https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/251164 had the info I needed to fix my problems with encrypted RAID1 not booting in degraded mode.

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

This is a very old LP bug, but for completeness it is worth mentioning here that some patches were recently merged into initramfs-tools and cryptsetup which allow a good experience when booting a LUKS-encrypted rootfs on top of a degraded RAID1 array; for details, please check: https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1879980

Cheers,

Guilherme

Displaying the first 40 and the last 40 of 149 comments.