Deploy with RAID as / fails if there is a newer kernel available

Bug #1569549 reported by Andreas Hasenack
Affects: curtin
Status: Fix Released
Importance: High
Assigned to: Unassigned

Bug Description

maas 1.9.1+bzr4543-0ubuntu1~trusty1

Full installation log, as given by maas, attached.

I configured a node to use RAID1 for /, and did a deployment. It failed. Relevant lines:
run-parts: executing /etc/kernel/postinst.d/zz-update-grub 3.13.0-85-generic /boot/vmlinuz-3.13.0-85-generic
Generating grub configuration file ...
grub-probe: error: disk `md0' not found.
run-parts: /etc/kernel/postinst.d/zz-update-grub exited with return code 1
(...)
E: Sub-process /usr/bin/dpkg returned an error code (1)
Unexpected error while running command.
Command: ['chroot', '/tmp/tmp1Ds3mQ/target', 'eatmydata', 'apt-get', '--quiet', '--assume-yes', '--option=Dpkg::options::=--force-unsafe-io', '--option=Dpkg::Options::=--force-confold', 'install', 'linux-generic']
Exit code: 100

I have deployed this node before in this same configuration with no issues.

This is where I'm making a guess: it failed this time because there was a kernel update available. Had there been none (i.e., the daily image was more up-to-date), then there would be no new kernel installed and the error would not have happened.

I'm not sure how to test the "good" case here, other than waiting a day or two for the image to be updated and trying again. To test the failure case, it should suffice to configure maas to use released images (not daily) and deploy with this RAID configuration.

Andreas Hasenack (ahasenack) wrote :
tags: added: kanban-cross-team landscape
description: updated
Blake Rouse (blake-rouse) wrote :

Please include the output of the following commands:

maas [my-maas-session] node get-curtin-config [node-system-id]
maas [my-maas-session] block-devices read [node-system-id]

This is a curtin issue dealing with a newer kernel. MAAS has already handed the node to curtin to install. Curtin needs to handle this correctly.

Changed in curtin:
status: New → Incomplete
no longer affects: maas
Andreas Hasenack (ahasenack) wrote :
Andreas Hasenack (ahasenack) wrote :
Changed in curtin:
status: Incomplete → New
Andreas Hasenack (ahasenack) wrote :

The last two files I attached were obtained today, so the image is still out-of-date regarding the new kernel.

Ryan Harper (raharper) wrote :

Target release is Trusty ?

Andreas Hasenack (ahasenack) wrote : Re: [Bug 1569549] Re: Deploy with RAID as / fails if there is a newer kernel available

Yes, the node was on trusty, if that's what you are asking.
On Apr 13, 2016 11:00, "Ryan Harper" <email address hidden> wrote:

> Target release is Trusty ?

Ryan Harper (raharper) wrote :

I can reproduce this issue in our vmtests.

The key trigger is:

kernel:
  mapping: {}
  package: linux-generic

Without this section, curtin doesn't look for (or find) the new kernel. We've not had this section in our vmtest harness, but we will from now on, especially since maas is using it.
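
To make the trigger concrete, here is a minimal Python sketch of the condition described above. The helper name requests_kernel_package is hypothetical and is not curtin's actual code; it only assumes the config has been loaded into a dict shaped like the YAML fragment in this comment.

```python
def requests_kernel_package(config):
    """Return the kernel package named in the config, or None.

    Hypothetical helper for illustration only; not part of curtin.
    """
    kernel = config.get("kernel") or {}
    return kernel.get("package") or None


# Dict equivalent of the YAML fragment that triggers the failure.
with_section = {"kernel": {"mapping": {}, "package": "linux-generic"}}
# A config without the section, like the old vmtest harness.
without_section = {}
```

With with_section the helper returns the package name, so a kernel install would be attempted; with without_section it returns None and the problem stays hidden.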

Changed in curtin:
importance: Undecided → High
status: New → Confirmed
Ryan Harper (raharper) wrote :

Looks like the kernel package section has curtin install 'linux-generic' before we've installed mdadm in target.
smoser notes that this would happen without the section if we were using an HWE kernel for the target as well.

The error results from grub-probe expecting to use mdadm --examine <device> to extract information for devices.map.
The error message from grub-probe is less than helpful, since the real error was that /sbin/mdadm was not present
when grub-probe was run.
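
Because the grub-probe message hides the real cause, a direct check for the binary is more informative. A minimal sketch, assuming the target root is mounted at some path (the function name mdadm_present and the target argument are illustrative, not taken from curtin):

```python
import os


def mdadm_present(target):
    """Return True if an executable sbin/mdadm exists under the target root.

    Illustrative check only; 'target' is the mount point of the
    installed system, e.g. the chroot directory used by the installer.
    """
    path = os.path.join(target, "sbin", "mdadm")
    return os.path.isfile(path) and os.access(path, os.X_OK)
```

Running such a check before the kernel install would turn the misleading "disk `md0' not found" into a clear "mdadm is missing in the target" diagnosis.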

The fix involves moving install_missing_packages() sooner in curthooks.
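
The ordering problem can be sketched with stand-in functions. This is a toy model under stated assumptions, not the curthooks code: only the name install_missing_packages comes from this comment, while run_hooks, install_kernel and the installed set are hypothetical.

```python
installed = set()


def install_missing_packages():
    # Stand-in: the real step would install mdadm into the target.
    installed.add("mdadm")


def install_kernel():
    # Stand-in for the kernel postinst, which runs update-grub and
    # therefore grub-probe; it fails when mdadm is not yet installed.
    if "mdadm" not in installed:
        raise RuntimeError("grub-probe: error: disk `md0' not found.")


def run_hooks(steps):
    """Run the (name, function) steps in order; return the names run."""
    ran = []
    for name, func in steps:
        func()
        ran.append(name)
    return ran


# Order after the fix: packages first, then the kernel.
fixed_order = [
    ("install_missing_packages", install_missing_packages),
    ("install_kernel", install_kernel),
]
# Order that reproduces the failure.
broken_order = list(reversed(fixed_order))
```

Calling run_hooks(fixed_order) completes, while run_hooks(broken_order) on a fresh installed set raises at the kernel step, mirroring the failure in the log.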

I'll put up a branch with the fix and missing test-case coverage.

Ryan Harper (raharper)
tags: added: curtin-sru
Changed in curtin:
status: Confirmed → Fix Released