Deploy fails with RAID 5 and Bcache

Bug #1512857 reported by Blake Rouse
This bug affects 1 person
Affects  Status        Importance  Assigned to  Milestone
MAAS     Invalid       Critical    Unassigned
curtin   Fix Released  Critical    Unassigned

Bug Description

I selected 3 disks in the WebUI and added them to a RAID 5 device. That device was also set up to be formatted with ext4 and mounted at "/srv/data".

The deploy fails at the "mdadm --create" command.

Get:1 http://security.ubuntu.com trusty-security InRelease [64.4 kB]
Ign http://archive.ubuntu.com trusty InRelease
Get:2 http://archive.ubuntu.com trusty-updates InRelease [64.4 kB]
Hit http://archive.ubuntu.com trusty Release.gpg
Hit http://archive.ubuntu.com trusty Release
Get:3 http://security.ubuntu.com trusty-security/main amd64 Packages [357 kB]
Get:4 http://archive.ubuntu.com trusty-updates/main amd64 Packages [639 kB]
Get:5 http://security.ubuntu.com trusty-security/universe amd64 Packages [117 kB]
Get:6 http://archive.ubuntu.com trusty-updates/universe amd64 Packages [326 kB]
Hit http://archive.ubuntu.com trusty/main amd64 Packages
Hit http://archive.ubuntu.com trusty/universe amd64 Packages
Fetched 1568 kB in 0s (3810 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
The following extra packages will be installed:
  libdevmapper-event1.02.1 liblzo2-2 libreadline5 postfix ssl-cert watershed
Suggested packages:
  thin-provisioning-tools procmail postfix-mysql postfix-pgsql postfix-ldap
  postfix-pcre sasl2-bin dovecot-common postfix-cdb mail-reader postfix-doc
  openssl-blacklist xfsdump acl attr quota
Recommended packages:
  default-mta mail-transport-agent
The following NEW packages will be installed:
  bcache-tools btrfs-tools libdevmapper-event1.02.1 liblzo2-2 libreadline5
  lvm2 mdadm postfix ssl-cert watershed xfsprogs
0 upgraded, 11 newly installed, 0 to remove and 178 not upgraded.
Need to get 2991 kB of archives.
After this operation, 12.2 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu/ trusty/main libdevmapper-event1.02.1 amd64 2:1.02.77-6ubuntu2 [10.8 kB]
Get:2 http://archive.ubuntu.com/ubuntu/ trusty-updates/main liblzo2-2 amd64 2.06-1.2ubuntu1.1 [46.1 kB]
Get:3 http://archive.ubuntu.com/ubuntu/ trusty/main libreadline5 amd64 5.2+dfsg-2 [130 kB]
Get:4 http://archive.ubuntu.com/ubuntu/ trusty-updates/main bcache-tools amd64 1.0.7-1~14.04.1 [17.4 kB]
Get:5 http://archive.ubuntu.com/ubuntu/ trusty/main btrfs-tools amd64 3.12-1 [335 kB]
Get:6 http://archive.ubuntu.com/ubuntu/ trusty/main watershed amd64 7 [11.4 kB]
Get:7 http://archive.ubuntu.com/ubuntu/ trusty/main lvm2 amd64 2.02.98-6ubuntu2 [470 kB]
Get:8 http://archive.ubuntu.com/ubuntu/ trusty-updates/main mdadm amd64 3.2.5-5ubuntu4.2 [361 kB]
Get:9 http://archive.ubuntu.com/ubuntu/ trusty/main ssl-cert all 1.0.33 [16.6 kB]
Get:10 http://archive.ubuntu.com/ubuntu/ trusty-updates/main postfix amd64 2.11.0-1ubuntu1 [1084 kB]
Get:11 http://archive.ubuntu.com/ubuntu/ trusty/main xfsprogs amd64 3.1.9ubuntu2 [508 kB]
Preconfiguring packages ...
Fetched 2991 kB in 0s (58.9 MB/s)
Selecting previously unselected package libdevmapper-event1.02.1:amd64.
(Reading database ... 56007 files and directories currently installed.)
Preparing to unpack .../libdevmapper-event1.02.1_2%3a1.02.77-6ubuntu2_amd64.deb ...
Unpacking libdevmapper-event1.02.1:amd64 (2:1.02.77-6ubuntu2) ...
Selecting previously unselected package liblzo2-2:amd64.
Preparing to unpack .../liblzo2-2_2.06-1.2ubuntu1.1_amd64.deb ...
Unpacking liblzo2-2:amd64 (2.06-1.2ubuntu1.1) ...
Selecting previously unselected package libreadline5:amd64.
Preparing to unpack .../libreadline5_5.2+dfsg-2_amd64.deb ...
Unpacking libreadline5:amd64 (5.2+dfsg-2) ...
Selecting previously unselected package bcache-tools.
Preparing to unpack .../bcache-tools_1.0.7-1~14.04.1_amd64.deb ...
Unpacking bcache-tools (1.0.7-1~14.04.1) ...
Selecting previously unselected package btrfs-tools.
Preparing to unpack .../btrfs-tools_3.12-1_amd64.deb ...
Unpacking btrfs-tools (3.12-1) ...
Selecting previously unselected package watershed.
Preparing to unpack .../archives/watershed_7_amd64.deb ...
Unpacking watershed (7) ...
Selecting previously unselected package lvm2.
Preparing to unpack .../lvm2_2.02.98-6ubuntu2_amd64.deb ...
Unpacking lvm2 (2.02.98-6ubuntu2) ...
Selecting previously unselected package mdadm.
Preparing to unpack .../mdadm_3.2.5-5ubuntu4.2_amd64.deb ...
Unpacking mdadm (3.2.5-5ubuntu4.2) ...
Selecting previously unselected package ssl-cert.
Preparing to unpack .../ssl-cert_1.0.33_all.deb ...
Unpacking ssl-cert (1.0.33) ...
Selecting previously unselected package postfix.
Preparing to unpack .../postfix_2.11.0-1ubuntu1_amd64.deb ...
Unpacking postfix (2.11.0-1ubuntu1) ...
Selecting previously unselected package xfsprogs.
Preparing to unpack .../xfsprogs_3.1.9ubuntu2_amd64.deb ...
Unpacking xfsprogs (3.1.9ubuntu2) ...
Processing triggers for man-db (2.6.7.1-1) ...
Processing triggers for ureadahead (0.100.0-16) ...
Processing triggers for ufw (0.34~rc-0ubuntu2) ...
Setting up libdevmapper-event1.02.1:amd64 (2:1.02.77-6ubuntu2) ...
Setting up liblzo2-2:amd64 (2.06-1.2ubuntu1.1) ...
Setting up libreadline5:amd64 (5.2+dfsg-2) ...
Setting up bcache-tools (1.0.7-1~14.04.1) ...
update-initramfs: deferring update (trigger activated)
Setting up btrfs-tools (3.12-1) ...
update-initramfs: deferring update (trigger activated)
Setting up watershed (7) ...
update-initramfs: deferring update (trigger activated)
Setting up lvm2 (2.02.98-6ubuntu2) ...
update-initramfs: deferring update (trigger activated)
Setting up mdadm (3.2.5-5ubuntu4.2) ...
Generating mdadm.conf... done.
 Removing any system startup links for /etc/init.d/mdadm-raid ...
update-initramfs: deferring update (trigger activated)
/usr/sbin/grub-probe: error: failed to get canonical path of `overlayroot'.
invoke-rc.d: policy-rc.d denied execution of start.
Setting up ssl-cert (1.0.33) ...
hostname: Name or service not known
make-ssl-cert: Could not get FQDN, using "maas1".
make-ssl-cert: You may want to fix your /etc/hosts and/or DNS setup and run
make-ssl-cert: make-ssl-cert generate-default-snakeoil --force-overwrite
make-ssl-cert: again.
Setting up postfix (2.11.0-1ubuntu1) ...
Adding group `postfix' (GID 112) ...
Done.
Adding system user `postfix' (UID 106) ...
Adding new user `postfix' (UID 106) with group `postfix' ...
Not creating home directory `/var/spool/postfix'.
Creating /etc/postfix/dynamicmaps.cf
Adding tcp map entry to /etc/postfix/dynamicmaps.cf
Adding sqlite map entry to /etc/postfix/dynamicmaps.cf
Adding group `postdrop' (GID 113) ...
Done.
setting myhostname: maas1
setting alias maps
setting alias database
mailname is not a fully qualified domain name. Not changing /etc/mailname.
setting destinations: localdomain, localhost, localhost.localdomain, localhost
setting relayhost:
setting mynetworks: 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128
setting mailbox_size_limit: 0
setting recipient_delimiter: +
setting inet_interfaces: all
setting inet_protocols: all
/etc/aliases does not exist, creating it.
WARNING: /etc/aliases exists, but does not have a root alias.

Postfix is now set up with a default configuration. If you need to make
changes, edit
/etc/postfix/main.cf (and others) as needed. To view Postfix configuration
values, see postconf(1).

After modifying main.cf, be sure to run '/etc/init.d/postfix reload'.

Running newaliases
invoke-rc.d: policy-rc.d denied execution of restart.
Setting up xfsprogs (3.1.9ubuntu2) ...
Processing triggers for libc-bin (2.19-0ubuntu6.3) ...
Processing triggers for initramfs-tools (0.103ubuntu4.2) ...
update-initramfs: Generating /boot/initrd.img-3.13.0-35-generic
cryptsetup: WARNING: failed to detect canonical device of /media/root-ro/
cryptsetup: WARNING: could not determine root device from /etc/fstab
W: mdadm: /etc/mdadm/mdadm.conf defines no arrays.
Processing triggers for ureadahead (0.100.0-16) ...
Processing triggers for ufw (0.34~rc-0ubuntu2) ...
mdadm: No arrays found in config file or automatically
Error: /dev/sdb: unrecognised disk label
mdadm: No arrays found in config file or automatically
Error: /dev/sdb: unrecognised disk label
Error: /dev/sdc: unrecognised disk label
mdadm: No arrays found in config file or automatically
Error: /dev/sdc: unrecognised disk label
Error: /dev/sdd: unrecognised disk label
mdadm: No arrays found in config file or automatically
Error: /dev/sdd: unrecognised disk label
Error: /dev/sde: unrecognised disk label
mdadm: No arrays found in config file or automatically
Error: /dev/sde: unrecognised disk label
Error: /dev/sdc: unrecognised disk label
Error: /dev/sdd: unrecognised disk label
Error: /dev/sde: unrecognised disk label
mdadm: Unrecognised md component device - /dev/sdc
mdadm: Unrecognised md component device - /dev/sdd
mdadm: Unrecognised md component device - /dev/sde
mdadm: unexpected failure opening /dev/md0
yes: standard output: Broken pipe
yes: write error
An error occured handling 'md0': ProcessExecutionError - Unexpected error while running command.
Command: yes | mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdc /dev/sdd /dev/sde
Exit code: 1
Reason: -
Stdout: ''
Stderr: ''
Unexpected error while running command.
Command: yes | mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdc /dev/sdd /dev/sde
Exit code: 1
Reason: -
Stdout: ''
Stderr: ''
Installation failed with exception: Unexpected error while running command.
Command: ['curtin', 'block-meta', 'custom']
Exit code: 3
Reason: -
Stdout: 'Get:1 http://security.ubuntu.com trusty-security InRelease [64.4 kB]\nIgn http://archive.ubuntu.com trusty InRelease\nGet:2 http://archive.ubuntu.com trusty-updates InRelease [64.4 kB]\nHit http://archive.ubuntu.com trusty Release.gpg\nHit http://archive.ubuntu.com trusty Release\nGet:3 http://security.ubuntu.com trusty-security/main amd64 Packages [357 kB]\nGet:4 http://archive.ubuntu.com trusty-updates/main amd64 Packages [639 kB]\nGet:5 http://security.ubuntu.com trusty-security/universe amd64 Packages [117 kB]\nGet:6 http://archive.ubuntu.com trusty-updates/universe amd64 Packages [326 kB]\nHit http://archive.ubuntu.com trusty/main amd64 Packages\nHit http://archive.ubuntu.com trusty/universe amd64 Packages\nFetched 1568 kB in 0s (3810 kB/s)\nReading package lists...\nReading package lists...\nBuilding dependency tree...\nReading state information...\nThe following extra packages will be installed:\n libdevmapper-event1.02.1 liblzo2-2 libreadline5 postfix ssl-cert watershed\nSuggested packages:\n thin-provisioning-tools procmail postfix-mysql postfix-pgsql postfix-ldap\n postfix-pcre sasl2-bin dovecot-common postfix-cdb mail-reader postfix-doc\n openssl-blacklist xfsdump acl attr quota\nRecommended packages:\n default-mta mail-transport-agent\nThe following NEW packages will be installed:\n bcache-tools btrfs-tools libdevmapper-event1.02.1 liblzo2-2 libreadline5\n lvm2 mdadm postfix ssl-cert watershed xfsprogs\n0 upgraded, 11 newly installed, 0 to remove and 178 not upgraded.\nNeed to get 2991 kB of archives.\nAfter this operation, 12.2 MB of additional disk space will be used.\nGet:1 http://archive.ubuntu.com/ubuntu/ trusty/main libdevmapper-event1.02.1 amd64 2:1.02.77-6ubuntu2 [10.8 kB]\nGet:2 http://archive.ubuntu.com/ubuntu/ trusty-updates/main liblzo2-2 amd64 2.06-1.2ubuntu1.1 [46.1 kB]\nGet:3 http://archive.ubuntu.com/ubuntu/ trusty/main libreadline5 amd64 5.2+dfsg-2 [130 kB]\nGet:4 http://archive.ubuntu.com/ubuntu/ trusty-updates/main 
bcache-tools amd64 1.0.7-1~14.04.1 [17.4 kB]\nGet:5 http://archive.ubuntu.com/ubuntu/ trusty/main btrfs-tools amd64 3.12-1 [335 kB]\nGet:6 http://archive.ubuntu.com/ubuntu/ trusty/main watershed amd64 7 [11.4 kB]\nGet:7 http://archive.ubuntu.com/ubuntu/ trusty/main lvm2 amd64 2.02.98-6ubuntu2 [470 kB]\nGet:8 http://archive.ubuntu.com/ubuntu/ trusty-updates/main mdadm amd64 3.2.5-5ubuntu4.2 [361 kB]\nGet:9 http://archive.ubuntu.com/ubuntu/ trusty/main ssl-cert all 1.0.33 [16.6 kB]\nGet:10 http://archive.ubuntu.com/ubuntu/ trusty-updates/main postfix amd64 2.11.0-1ubuntu1 [1084 kB]\nGet:11 http://archive.ubuntu.com/ubuntu/ trusty/main xfsprogs amd64 3.1.9ubuntu2 [508 kB]\nPreconfiguring packages ...\nFetched 2991 kB in 0s (58.9 MB/s)\nSelecting previously unselected package libdevmapper-event1.02.1:amd64.\n(Reading database ... 56007 files and directories currently installed.)\nPreparing to unpack .../libdevmapper-event1.02.1_2%3a1.02.77-6ubuntu2_amd64.deb ...\nUnpacking libdevmapper-event1.02.1:amd64 (2:1.02.77-6ubuntu2) ...\nSelecting previously unselected package liblzo2-2:amd64.\nPreparing to unpack .../liblzo2-2_2.06-1.2ubuntu1.1_amd64.deb ...\nUnpacking liblzo2-2:amd64 (2.06-1.2ubuntu1.1) ...\nSelecting previously unselected package libreadline5:amd64.\nPreparing to unpack .../libreadline5_5.2+dfsg-2_amd64.deb ...\nUnpacking libreadline5:amd64 (5.2+dfsg-2) ...\nSelecting previously unselected package bcache-tools.\nPreparing to unpack .../bcache-tools_1.0.7-1~14.04.1_amd64.deb ...\nUnpacking bcache-tools (1.0.7-1~14.04.1) ...\nSelecting previously unselected package btrfs-tools.\nPreparing to unpack .../btrfs-tools_3.12-1_amd64.deb ...\nUnpacking btrfs-tools (3.12-1) ...\nSelecting previously unselected package watershed.\nPreparing to unpack .../archives/watershed_7_amd64.deb ...\nUnpacking watershed (7) ...\nSelecting previously unselected package lvm2.\nPreparing to unpack .../lvm2_2.02.98-6ubuntu2_amd64.deb ...\nUnpacking lvm2 (2.02.98-6ubuntu2) 
...\nSelecting previously unselected package mdadm.\nPreparing to unpack .../mdadm_3.2.5-5ubuntu4.2_amd64.deb ...\nUnpacking mdadm (3.2.5-5ubuntu4.2) ...\nSelecting previously unselected package ssl-cert.\nPreparing to unpack .../ssl-cert_1.0.33_all.deb ...\nUnpacking ssl-cert (1.0.33) ...\nSelecting previously unselected package postfix.\nPreparing to unpack .../postfix_2.11.0-1ubuntu1_amd64.deb ...\nUnpacking postfix (2.11.0-1ubuntu1) ...\nSelecting previously unselected package xfsprogs.\nPreparing to unpack .../xfsprogs_3.1.9ubuntu2_amd64.deb ...\nUnpacking xfsprogs (3.1.9ubuntu2) ...\nProcessing triggers for man-db (2.6.7.1-1) ...\nProcessing triggers for ureadahead (0.100.0-16) ...\nProcessing triggers for ufw (0.34~rc-0ubuntu2) ...\nSetting up libdevmapper-event1.02.1:amd64 (2:1.02.77-6ubuntu2) ...\nSetting up liblzo2-2:amd64 (2.06-1.2ubuntu1.1) ...\nSetting up libreadline5:amd64 (5.2+dfsg-2) ...\nSetting up bcache-tools (1.0.7-1~14.04.1) ...\nupdate-initramfs: deferring update (trigger activated)\nSetting up btrfs-tools (3.12-1) ...\nupdate-initramfs: deferring update (trigger activated)\nSetting up watershed (7) ...\nupdate-initramfs: deferring update (trigger activated)\nSetting up lvm2 (2.02.98-6ubuntu2) ...\nupdate-initramfs: deferring update (trigger activated)\nSetting up mdadm (3.2.5-5ubuntu4.2) ...\nGenerating mdadm.conf... 
done.\n Removing any system startup links for /etc/init.d/mdadm-raid ...\nupdate-initramfs: deferring update (trigger activated)\n/usr/sbin/grub-probe: error: failed to get canonical path of `overlayroot\'.\ninvoke-rc.d: policy-rc.d denied execution of start.\nSetting up ssl-cert (1.0.33) ...\nhostname: Name or service not known\nmake-ssl-cert: Could not get FQDN, using "maas1".\nmake-ssl-cert: You may want to fix your /etc/hosts and/or DNS setup and run\nmake-ssl-cert: make-ssl-cert generate-default-snakeoil --force-overwrite\nmake-ssl-cert: again.\nSetting up postfix (2.11.0-1ubuntu1) ...\nAdding group `postfix\' (GID 112) ...\nDone.\nAdding system user `postfix\' (UID 106) ...\nAdding new user `postfix\' (UID 106) with group `postfix\' ...\nNot creating home directory `/var/spool/postfix\'.\nCreating /etc/postfix/dynamicmaps.cf\nAdding tcp map entry to /etc/postfix/dynamicmaps.cf\nAdding sqlite map entry to /etc/postfix/dynamicmaps.cf\nAdding group `postdrop\' (GID 113) ...\nDone.\nsetting myhostname: maas1\nsetting alias maps\nsetting alias database\nmailname is not a fully qualified domain name. Not changing /etc/mailname.\nsetting destinations: localdomain, localhost, localhost.localdomain, localhost\nsetting relayhost: \nsetting mynetworks: 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128\nsetting mailbox_size_limit: 0\nsetting recipient_delimiter: +\nsetting inet_interfaces: all\nsetting inet_protocols: all\n/etc/aliases does not exist, creating it.\nWARNING: /etc/aliases exists, but does not have a root alias.\n\nPostfix is now set up with a default configuration. If you need to make \nchanges, edit\n/etc/postfix/main.cf (and others) as needed. 
To view Postfix configuration\nvalues, see postconf(1).\n\nAfter modifying main.cf, be sure to run \'/etc/init.d/postfix reload\'.\n\nRunning newaliases\ninvoke-rc.d: policy-rc.d denied execution of restart.\nSetting up xfsprogs (3.1.9ubuntu2) ...\nProcessing triggers for libc-bin (2.19-0ubuntu6.3) ...\nProcessing triggers for initramfs-tools (0.103ubuntu4.2) ...\nupdate-initramfs: Generating /boot/initrd.img-3.13.0-35-generic\ncryptsetup: WARNING: failed to detect canonical device of /media/root-ro/\ncryptsetup: WARNING: could not determine root device from /etc/fstab\nW: mdadm: /etc/mdadm/mdadm.conf defines no arrays.\nProcessing triggers for ureadahead (0.100.0-16) ...\nProcessing triggers for ufw (0.34~rc-0ubuntu2) ...\nmdadm: No arrays found in config file or automatically\nError: /dev/sdb: unrecognised disk label\nmdadm: No arrays found in config file or automatically\nError: /dev/sdb: unrecognised disk label\nError: /dev/sdc: unrecognised disk label\nmdadm: No arrays found in config file or automatically\nError: /dev/sdc: unrecognised disk label\nError: /dev/sdd: unrecognised disk label\nmdadm: No arrays found in config file or automatically\nError: /dev/sdd: unrecognised disk label\nError: /dev/sde: unrecognised disk label\nmdadm: No arrays found in config file or automatically\nError: /dev/sde: unrecognised disk label\nError: /dev/sdc: unrecognised disk label\nError: /dev/sdd: unrecognised disk label\nError: /dev/sde: unrecognised disk label\nmdadm: Unrecognised md component device - /dev/sdc\nmdadm: Unrecognised md component device - /dev/sdd\nmdadm: Unrecognised md component device - /dev/sde\nmdadm: unexpected failure opening /dev/md0\nyes: standard output: Broken pipe\nyes: write error\nAn error occured handling \'md0\': ProcessExecutionError - Unexpected error while running command.\nCommand: yes | mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdc /dev/sdd /dev/sde\nExit code: 1\nReason: -\nStdout: \'\'\nStderr: \'\'\nUnexpected error while 
running command.\nCommand: yes | mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdc /dev/sdd /dev/sde\nExit code: 1\nReason: -\nStdout: \'\'\nStderr: \'\'\n'
Stderr: ''

Tags: storage

Changed in curtin:
status: New → Confirmed
importance: Undecided → Critical
Ryan Harper (raharper) wrote :

Was this RAID 5 on root, then? Can you attach the storage configuration passed to curtin?

Ryan Harper (raharper) wrote :

Need storage config file

Changed in curtin:
status: Confirmed → Incomplete
Blake Rouse (blake-rouse) wrote :

Looking into it more, it only fails when I have a bcache created as well. If I remove the bcache it deploys fine. If I add the bcache it fails. I tried multiple times in different orders to make sure the ordering had nothing to do with it, and it always fails the same way.

Configuration without Bcache and deploys: http://paste.ubuntu.com/13105043/
Configuration with Bcache and fails to deploy: http://paste.ubuntu.com/13104998/

summary: - RAID 5 deployment on 3 disks fails
+ Deploy fails with RAID 5 and Bcache
Changed in curtin:
status: Incomplete → Confirmed
Blake Rouse (blake-rouse) wrote :

I was deploying Trusty from the MAAS releases stream.

Changed in maas:
status: Triaged → Invalid
Ryan Harper (raharper) wrote :

I've created a vmtest for this configuration and have not been able to reproduce it so far. However, I suspect that it has something to do with disks that hold previous data possibly not getting cleared properly. Both bcache and mdadm keep sticky metadata on the devices; when that metadata is rediscovered, it claims the underlying device and prevents access to it.

I'm going to attempt to introduce some dirty disks which already have been used as bcache and raid devices so we can validate that curtin is wiping/cleaning them at the appropriate time.
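
As an illustration of what clearing such sticky metadata can involve, here is a minimal sketch that only builds the command lines for one device. It is not curtin's actual wiping code, and the helper name `wipe_commands` is hypothetical. It assumes the standard `mdadm --zero-superblock` and `wipefs --all` invocations, and that any md array or bcache device still holding the disk has already been stopped.

```python
def wipe_commands(device):
    """Return command lines that clear md and other signatures on a device.

    Hypothetical helper for illustration; it only builds the commands
    and does not run anything.
    """
    return [
        # Remove an md superblock left over from earlier RAID membership.
        ["mdadm", "--zero-superblock", device],
        # Erase any remaining signatures that libblkid recognises.
        ["wipefs", "--all", device],
    ]


if __name__ == "__main__":
    for dev in ("/dev/sdc", "/dev/sdd", "/dev/sde"):
        for cmd in wipe_commands(dev):
            print(" ".join(cmd))
```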

Changed in curtin:
status: Confirmed → In Progress
Ryan Harper (raharper) wrote :

I've attempted many variations of injecting existing metadata onto the disks (making them raid members, lvm members, bcache devices, etc.) and curtin does its job of properly wiping metadata from the partitions and devices prior to using them for something new. As a result, I'm not able to recreate this under wily or trusty, so I must be missing something in the configuration.

Was this trusty instance running with an HWE kernel? If so, which one?
If you have a setup to recreate this, I'd be happy to help debug so I can determine what the delta is between your environment and our test setup.

Changed in curtin:
status: In Progress → Incomplete
Blake Rouse (blake-rouse) wrote :

No, I was deploying trusty with the default kernel. I was using trusty from the releases stream for MAAS; is that the same stream you are using?

It seemed to be an issue where the bcache was created before the raid. Are you creating the bcache before the RAID device?

Ryan Harper (raharper) wrote : Re: [Bug 1512857] Re: Deploy fails with RAID 5 and Bcache

OK.

The devices are created in order per the storage configuration, so no, I'm not, and neither does curtin in general.

When I was attempting to break it, I did create bcache devices on all of the drives that were to become part of the raid. Curtin properly removes any existing metadata and then creates the devices in the order mentioned.

I also forced creation of a bcache device, leaving it active, and then ran the mdadm command to create the raid. The error message from mdadm is not the same as the one in the bug; specifically, mdadm names the device it cannot access/open in the error message, versus the 'unexpected failure opening /dev/md0' in the bug.

I asked about hwe kernels as I was thinking maybe one of those didn't have raid built in to the kernel. I've seen other users report mdadm issues with a similar error message; all of those cases were related to not having the md module loaded. Ubuntu kernels have md support compiled in, so I don't think that's the issue, but I've not verified the hwe kernel variants on top of a trusty rootfs.

If you can recreate the issue, I'd be happy to hop on and debug the system in question.

On Thu, Nov 12, 2015 at 10:35 AM, Blake Rouse <email address hidden>
wrote:

> No, I was deploying trusty with the default kernel. I was using
> trusty from the releases stream for MAAS; is that the same stream you
> are using?
>
> It seemed to be an issue where the bcache was created before the raid.
> Are you creating the bcache before the RAID device?

Ryan Harper (raharper) wrote :

Actually, raid isn't built in; there are raid1, raid10, and raid456 modules. I'm testing whether I get that message when the module isn't loaded; maybe the module somehow wasn't loaded in your case (or was loaded too late)?

No, mdadm will load the modules for you. If I move them out of the way, mdadm fails with:

mdadm: RUN_ARRAY failed: Invalid argument

Ryan Harper (raharper) wrote :

And I've reproduced it! I unloaded the raid modules, re-ran the curtin install, and got the same error message.

It appears that the raid modules are not loaded after all, so something isn't happening soon/fast enough.

root@ubuntu:/curtin# lsmod | grep raid
async_raid6_recov 12984 0
async_tx 13509 4 async_pq,async_xor,async_memcpy,async_raid6_recov
raid6_pq 97812 3 async_pq,btrfs,async_raid6_recov

Let's see if I can reliably reproduce it now.
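
A check like the `lsmod | grep raid` above can be scripted. The following is a minimal sketch, not part of curtin: the function names are hypothetical, and the module list is taken from the `modprobe -vr` reproducer later in this thread. It parses `lsmod` text and reports which raid personality modules are missing.

```python
RAID_MODULES = ("raid0", "raid1", "raid10", "raid456")


def loaded_modules(lsmod_text):
    """Return the set of module names from the first column of lsmod output."""
    names = set()
    for line in lsmod_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header.
        if not fields or fields[0] == "Module":
            continue
        names.add(fields[0])
    return names


def missing_raid_modules(lsmod_text, wanted=RAID_MODULES):
    """Return the wanted raid modules that do not appear in lsmod output."""
    loaded = loaded_modules(lsmod_text)
    return [name for name in wanted if name not in loaded]


if __name__ == "__main__":
    import subprocess
    text = subprocess.check_output(["lsmod"]).decode()
    print(missing_raid_modules(text))
```

Fed the three lines of output shown above, it reports all four personalities as missing, since `raid6_pq` and `async_raid6_recov` are helper modules rather than raid personalities.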

Changed in curtin:
status: Incomplete → In Progress
Ryan Harper (raharper) wrote :

It doesn't appear to be a module load race; I can reproduce it even when preloading the modules in curtin prior to the create.
I'm exploring pausing udev events right before creating the raid, and then restoring them after creation.

Ryan Harper (raharper) wrote :

A reliable reproducer is to:

1) run the install but don't reboot when completed
2) mdadm --stop /dev/md0
3) modprobe -vr raid0 raid1 raid456 raid10
4) re-run the install

Roughly every other run trips up.

Ryan Harper (raharper) wrote :

Using the reproducer from #12, I've tested the following sequence:

udevadm settle
udevadm control --stop-exec-queue
<mdadm create>
udevadm control --start-exec-queue
udevadm settle

The above effectively "pauses" udev events on the block device while we create our /dev/mdX device
then we unpause and let any queued events take place.

With that in place, I've successfully installed this configuration 10/10 times in a row. I'm preparing a branch
with this fix including a vmtest which executes this configuration. If we're still racy here then we will see
it there, albeit intermittently.

It's not clear to me yet if mdadm itself needs a bug filed w.r.t. learning to "wait" on the creation of the device node
under /dev or if the race is somewhere else.
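
The settle / stop-exec-queue / create / start-exec-queue / settle sequence above can be sketched in Python as a context manager. This is an illustration under stated assumptions, not the code from the curtin branch: the names `udev_paused` and `create_raid` are hypothetical, the `run` parameter exists only so the command order can be checked without touching real devices, and `mdadm --run` stands in for the `yes |` pipe seen in the log.

```python
import subprocess
from contextlib import contextmanager


def _run(cmd):
    """Run one command and raise if it exits non-zero."""
    subprocess.check_call(cmd)


@contextmanager
def udev_paused(run=_run):
    """Hold udev's event execution queue while the body executes."""
    run(["udevadm", "settle"])
    run(["udevadm", "control", "--stop-exec-queue"])
    try:
        yield
    finally:
        # Always resume the queue, even if the body raised.
        run(["udevadm", "control", "--start-exec-queue"])
        run(["udevadm", "settle"])


def create_raid(md_name, level, devices, run=_run):
    """Create an md array while udev event execution is paused."""
    cmd = ["mdadm", "--create", md_name, "--run",
           "--level=%s" % level,
           "--raid-devices=%d" % len(devices)] + list(devices)
    with udev_paused(run):
        run(cmd)
```

Calling `create_raid("/dev/md0", 5, ["/dev/sdc", "/dev/sdd", "/dev/sde"])` would issue the five commands in the order listed above. The `finally` clause matters: if `mdadm` fails, the queue is still restarted, so udev is not left paused.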

Scott Moser (smoser) wrote :

fix committed in 304. 309 or later should be good.

Changed in curtin:
status: In Progress → Fix Committed
Scott Moser (smoser) wrote :

fix went in at 305 actually.

Changed in maas:
milestone: 1.9.0 → none
Scott Moser (smoser) wrote : Fixed in Curtin 17.1

This bug is believed to be fixed in curtin 17.1. If this is still a problem for you, please make a comment and set the state back to New.

Thank you.

Changed in curtin:
status: Fix Committed → Fix Released