Critical Dapper bug: RAID1, reconnect secondary drive, no attempt made to resolve drives

Bug #58892 reported by Timothy Miller
Affects: Ubuntu
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: none listed

Bug Description

I have two identical drives. I set up software RAID1 across them and installed Dapper on the array.

After installing, I shut down and disconnected power from the secondary disk. Everything booted just fine.

I shut down again and reconnected the secondary drive. When I booted up, absolutely nothing was done to resolve the two drives. Clearly, they would be in an inconsistent state. Moreover, if the newly connected disk were blank, that would be an even bigger problem. This is a critical bug, because it makes it impossible to restore a RAID1 array after a drive failure.

Timothy Miller (theosib) wrote :

I would suggest that, after logging in, the user should see a dialog that reports that it thinks a new drive has been added that looks like one they'd like to add to the array and ask them if they want to hot-add it.
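A minimal sketch of the detection step such a dialog would need, assuming the array status is read from text in the format of /proc/mdstat. The function name `degraded_arrays` and the sample text are illustrative assumptions, not part of any Ubuntu tool.

```python
import re


def degraded_arrays(mdstat_text):
    """Return {array_name: (active, expected)} for arrays missing members.

    Assumes /proc/mdstat-style text, where each array's status line
    carries an "[expected/active]" pair such as "[2/1] [U_]".
    """
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\S*)\s*:", line)
        if header:
            current = header.group(1)
            continue
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if current and counts:
            expected, active = int(counts.group(1)), int(counts.group(2))
            if active < expected:
                degraded[current] = (active, expected)
            current = None
    return degraded


# Hypothetical sample: md0 is running on one of its two members.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sda1[0]
      104320 blocks [2/1] [U_]

md1 : active raid1 sda2[0] sdb2[1]
      9767424 blocks [2/2] [UU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a live system the text would come from reading /proc/mdstat. Putting the returned drive back into the array would still be a separate manual step, for example with mdadm's --add option.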

Micah Cowan (micahcowan) wrote :

Thanks for your report. Your idea might get more attention and have the possibility of being implemented if you submit a specification for it. First check whether the idea is already registered <https://launchpad.net/ubuntu/+specs>, and if so, contact the specification's drafter about your ideas. Otherwise, you can start writing a spec yourself. <https://wiki.ubuntu.com/FeatureSpecifications>

Timothy Miller (theosib) wrote :

I would love to take your suggestion, but unfortunately, I'm just too busy. With a day job, my own startup (Traversal Technology), a research assistantship, other responsibilities regarding my Ph.D., and my own open source project (Open Graphics), I just have too many things on my plate. I chose Ubuntu so that I would be able to AVOID sysadmin-related things such as manually configuring software RAID. The main problem is that I just don't know enough to write this spec, and I don't have the time to learn it. For the moment, I will just have to accept that Ubuntu is awesome for the desktop but not quite ready yet for servers.

Thank you for your time.

Dustin Kirkland (kirkland) wrote :

I'm marking this bug as confirmed, although the behavior has improved to some degree in Intrepid.

Here are my observations...

 1) Boot with diskA and diskB attached, the md is actively sync'd. (EXPECTED, OKAY)
 2) Disconnect diskB, the md will (optionally) come up in a degraded mode, using only diskA (EXPECTED, OKAY)
 3) Re-attach diskB. The md will come up in degraded mode, using only diskA. This is (EXPECTED, OKAY), because the state of diskB is unknown. You can manually add diskB back to the array, and once it has re-synced, it will be an active member of the array on subsequent boots.
 4) Disconnect diskA, attach diskB. The md will (optionally) come up in degraded mode (fixes for several other bugs addressed this in Ubuntu Intrepid). (EXPECTED, OKAY)
 5) Now, we have booted each of diskA and diskB as the only member of the array. Each of them "thinks" that it is the only active member of a degraded array. What is the desired behavior now? Clearly the two disks are out of sync and cannot be booted together in the same array. Currently, in Intrepid, the disk "last touched" is booted. I *think* this is the best behavior, but perhaps something more interactive would be preferable from the system administrator's standpoint.
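The "last touched" rule in step 5 can be sketched as follows. This is only an illustration under stated assumptions, not necessarily how md or the Intrepid boot scripts decide: it assumes each member's superblock update time and event counter have already been obtained (for example from the "Update Time" and "Events" fields that `mdadm --examine` prints). `MemberInfo`, `pick_freshest`, and the device names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MemberInfo:
    device: str         # hypothetical device name, e.g. "/dev/sda1"
    events: int         # event counter recorded in the member's superblock
    update_time: float  # last superblock update, seconds since the epoch


def pick_freshest(members):
    """Return the member updated most recently (events breaks ties)."""
    return max(members, key=lambda m: (m.update_time, m.events))


def stale_members(members):
    """Return every member other than the freshest one."""
    freshest = pick_freshest(members)
    return [m for m in members if m is not freshest]


if __name__ == "__main__":
    disk_a = MemberInfo("/dev/sda1", events=40, update_time=1000.0)
    disk_b = MemberInfo("/dev/sdb1", events=35, update_time=2000.0)
    print(pick_freshest([disk_a, disk_b]).device)
    print([m.device for m in stale_members([disk_a, disk_b])])
```

A more interactive variant, as suggested above, could show the stale members to the administrator and ask before any resync overwrites them.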

Timothy-

Is there any way that you can test this behavior in Intrepid? I would be very curious to get your take on the new behavior.

:-Dustin

Timothy Miller (theosib) wrote :

I want to thank you very much, Dustin Kirkland, for all of the work I have seen you put into the various RAID issues I and others had observed with Ubuntu. It's people like you who polish the "little things" that turn a good Linux experience into a fantastic Ubuntu experience.

Although I have recently installed Hoary on multiple computers for friends, my current needs do not require Ubuntu. I'm running Mac OS X on the desktop and Gentoo Linux on a server. From the looks of it, (a) all of your various changes are spot-on, and (b) you appear to have everything under control, and I think I can trust that you've done it right. I'm afraid I won't have an immediate opportunity to test Intrepid; I apologize for that.

In any case, many thanks again. I'm sure I and many others will benefit from your improvements in the future.

Dustin Kirkland (kirkland) wrote : Re: [Bug 58892] Re: Critical Dapper bug: RAID1, reconnect secondary drive, no attempt made to resolve drives

On Tue, Oct 7, 2008 at 11:42 AM, Timothy Miller <email address hidden> wrote:
> I want to thank you very much, Dustin Kirkland, for all of the work I
> have seen you put into the various RAID issues I and others had observed
> with Ubuntu. It's people like you who polish the "little things" that
> turn a good Linux experience into a fantastic Ubuntu experience.

Thank you for the kind words.

> Although I have recently installed Hoary on multiple computers for
> friends, my current needs do not require Ubuntu. I'm running Mac OS X
> on the desktop and Gentoo Linux on a server. From the looks of it, (a)
> all of your various changes are spot-on, and (b) you appear to have
> everything under control, and I think I can trust that you've done it
> right. I'm afraid I won't have an immediate opportunity to test
> Intrepid; I apologize for that.

Understood. Thank you for the original bug reports. As you
expressed, I too was much surprised and disappointed to see the
incomplete RAID support in Ubuntu when I made my switch to the
distribution. My apologies that these problems were not solved
sooner.

I do believe that the vast majority of the problems you reported with
respect to software RAID in Ubuntu have been solved in Intrepid. I'm
trying to weave through them now, retesting and closing them
appropriately.

:-Dustin
