update-grub runs and fails in containers

Reported by Serge Hallyn on 2012-10-02
This bug affects 9 people
Affects                 Importance  Assigned to
Auto Upgrade Testing    Undecided   Martin Pitt
grub2 (Ubuntu)          High        Colin Watson
  Precise               High        Colin Watson
  Quantal               High        Colin Watson
lxc (Ubuntu)            High        Unassigned
  Precise               High        Serge Hallyn
  Quantal               High        Unassigned

Bug Description

[Impact] GRUB upgrades fail in containers.
[Test Case] Upgrade the grub-pc package in a container.
[Regression Potential] In itself, this postinst fix should be quite safe. It's possible it won't solve the whole problem - e.g. linux-image-* upgrades calling update-grub - but I wanted to backport just what was in quantal/raring rather than getting creative in an SRU.

[XXX edit - removed the SRU justification for lxc part. The proposed solution
was not safe, and was undone in a later commit. devtmpfs cannot be mounted
in a container, because changes under the container's /dev are then
reflected in the host's /dev.]

If grub is installed in a container (as happens, for instance, with the ubuntu-cloud template) then an update of grub or linux-image will cause update-grub to be run. It tries, finds it can't access the root device, fails, and causes the update to fail.

It would be better for update-grub to detect that it is in a container and simply exit 0, so that the apt-get can succeed. I'm attaching a debdiff which does that.

Serge Hallyn (serge-hallyn) wrote :
Changed in grub2 (Ubuntu):
status: New → Confirmed
Serge Hallyn (serge-hallyn) wrote :

(Marking this confirmed as it has been seen by at least 3 people)

To reproduce this easily:

   lxc-create -t ubuntu-cloud -n cloud1
   lxc-start -n cloud1
   (login as ubuntu/ubuntu)
      sudo apt-get update && sudo apt-get dist-upgrade # assuming linux-image is slated to be updated.

Changed in grub2 (Ubuntu):
importance: Undecided → High
tags: added: patch
Colin Watson (cjwatson) wrote :

I think this is very much a temporary fix, and it would be better to make update-grub able to generate a basic grub.cfg, that perhaps might need you to set GRUB_MODULES in order to work properly but would at least be in principle usable. However, it's OK for 12.10.

Changed in lxc (Ubuntu):
importance: Undecided → High
status: New → Confirmed
Serge Hallyn (serge-hallyn) wrote :

After talking to cjwatson on irc, there is another option. update-grub is failing because /dev/disk/ is not set up in the container. So we could add an upstart job which runs only in containers, and manually sets up /dev/disk/. This also might help other software which relies on /dev/disk/. However, as grub is not installed in ubuntu containers by default (but is installed in ubuntu-cloud containers) we can't blindly re-use grub-probe. If we use grub-probe only if it is available, then we are helping fewer cases. If we detect the root device by hand, I fear we are duplicating too much fragile code.

Perhaps the best option is simply an upstart job like the below (but smarter and safer):

start on starting mountall
task
script
  # Nothing to do if grub-probe is not installed.
  type grub-probe > /dev/null 2>&1 || { stop; exit 0; }
  mkdir -p /dev/disk/by-label /dev/disk/by-uuid
  rootdev=`grub-probe -t device /`
  # do_mknod is a placeholder for creating the node named by $rootdev.
  do_mknod "$rootdev"
  rootlabel=`grub-probe -t fs_label /`
  # $rootdev is an absolute path, so link to it directly.
  ln -s "$rootdev" "/dev/disk/by-label/$rootlabel"
  rootuuid=`grub-probe -t fs_uuid /`
  ln -s "$rootdev" "/dev/disk/by-uuid/$rootuuid"
end script

Serge Hallyn (serge-hallyn) wrote :

As hrw pointed out on #ubuntu-devel, a third option is to simply add

devtmpfs dev devtmpfs defaults 0 0

to /var/lib/lxc/$c/fstab

Serge Hallyn (serge-hallyn) wrote :

I've pushed a patch to ubuntu:lxc which will cause devtmpfs to be mounted in ubuntu containers.

There remains the problem that grub-install then queries the user about installing grub on /dev/vda1 (or whatever is hosting the container fs).

Changed in upstart (Ubuntu):
status: New → Invalid
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lxc - 0.8.0~rc1-4ubuntu37

---------------
lxc (0.8.0~rc1-4ubuntu37) quantal; urgency=low

  * update 0222-debian-dhcp3-package: use dhcp3-client, not server!
  * 0224-ubuntu-templates-devtmpfs: mount devtmpfs in ubuntu containers.
    (LP: #1060404)
 -- Serge Hallyn <email address hidden> Thu, 04 Oct 2012 12:06:02 -0500

Changed in lxc (Ubuntu):
status: Confirmed → Fix Released
Colin Watson (cjwatson) on 2012-10-05
Changed in grub2 (Ubuntu):
status: Confirmed → Fix Committed
assignee: nobody → Colin Watson (cjwatson)
Changed in lxc (Ubuntu Precise):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Serge Hallyn (serge-hallyn)
description: updated
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.00-7ubuntu3

---------------
grub2 (2.00-7ubuntu3) quantal; urgency=low

  * If the postinst is running in a container, skip grub-install and all its
    associated questions (LP: #1060404).
  * Merge UEFI secure boot tweaks from Fedora:
    - Don't error on insmod on UEFI/SB, but also don't do any insmodding.
    - Add sleep to the list of modules in the signed image.
  * Move Ubuntu modifications to the Fedora linuxefi patch into separate
    patches, to ease maintenance.
  * Implement secure boot handling policy as outlined by Steve Langasek:
    - Make the linux module call linuxefi when necessary, simplifying
      configuration. Add the linux module to the signed image.
    - If secure boot is enabled and the kernel is signed, linux will call
      linuxefi to hand over to it without calling ExitBootServices.
    - Otherwise, linux will fall through to previous code, call
      ExitBootServices itself, and boot the kernel normally.
    - Change linuxefi to return GRUB_ERR_ACCESS_DENIED rather than
      GRUB_ERR_INVALID_COMMAND in the case of an invalid signature, to make
      it easier to implement different handling of unsigned kernels in
      future if necessary.
  * Build two images for signing: one with prefix /EFI/BOOT for use on
    removable media, and one with prefix /EFI/ubuntu (and with the lvm,
    mdraid09, and mdraid1x modules added) for use on fixed disks. Setup
    mostly borrowed from Fedora.
  * Generate configuration for signed UEFI kernels if available.
 -- Colin Watson <email address hidden> Sun, 07 Oct 2012 11:36:29 +0100

Changed in grub2 (Ubuntu Quantal):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in grub2 (Ubuntu Precise):
status: New → Confirmed
Changed in upstart (Ubuntu Precise):
status: New → Confirmed
Andreas Hasenack (ahasenack) wrote :

Any idea if this will be fixed for precise, or is it too risky?

description: updated
Serge Hallyn (serge-hallyn) wrote :

@Andreas,

it needs to be fixed in grub. However the fix in raring should be SRUable.

Changed in grub2 (Ubuntu Precise):
importance: Undecided → High
no longer affects: upstart (Ubuntu)
no longer affects: upstart (Ubuntu Precise)
no longer affects: upstart (Ubuntu Quantal)
Andreas Hasenack (ahasenack) wrote :

Right, grub, and for now my workaround is to remove grub-pc if I'm in a container.

Colin Watson (cjwatson) on 2013-01-08
description: updated
Changed in grub2 (Ubuntu Precise):
status: Confirmed → In Progress
assignee: nobody → Colin Watson (cjwatson)
Henning Eggers (henninge) wrote :

What is the work-around that I might be missing? Trying out juju with LXC by following https://juju.ubuntu.com/docs/getting-started.html will make you run directly into this bug. Quite a let-down as everything else runs so smoothly with juju ...

Serge Hallyn (serge-hallyn) wrote :

@Henning,

At one point we thought a workaround was to have devtmpfs mounted in the container. However that is a bad thing to do as changes in the container's /dev will be reflected in the host's /dev.

So the only workaround right now (until grub is fixed in precise) is to check (with 'mount') which device / is mounted from in the container, and then create that device in the container.
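
The manual workaround above could be sketched roughly as follows. This is only an illustration under assumptions that are not from this thread: it reads /proc/self/mountinfo instead of parsing `mount` output, the function name root_mknod_cmd is made up, and it only prints the mknod command rather than running it, so the result can be checked before anything is created under /dev.

```shell
#!/bin/sh
# Hypothetical helper: print the mknod command that would recreate the
# device node backing "/". Reads /proc/self/mountinfo by default; a file
# argument may be given instead (useful for dry runs).
root_mknod_cmd() {
    awk '$5 == "/" {
        split($3, n, ":")                 # field 3 is "major:minor"
        for (i = 7; i <= NF; i++)         # skip optional fields up to "-"
            if ($i == "-") { src = $(i + 2); break }
        cmd = sprintf("mknod %s b %s %s", src, n[1], n[2])
    }
    END { if (cmd != "") print cmd }      # last match wins if "/" is overmounted
    ' "${1:-/proc/self/mountinfo}"
}
```

Running root_mknod_cmd inside the container and reviewing the printed line before executing it as root is probably the safest way to use this. It only makes sense when the reported source is a block device path such as /dev/vda1.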

Hello Serge, or anyone else affected,

Accepted grub2 into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/grub2/1.99-21ubuntu3.8 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in grub2 (Ubuntu Precise):
status: In Progress → Fix Committed
tags: added: verification-needed
Serge Hallyn (serge-hallyn) wrote :

Confirmed in precise, thanks.

tags: added: verification-done
removed: verification-needed

I can't reproduce fix on precise. I've used ppa:juju/0.6 (r608), grub-pc is ubuntu3.8, juju-origin: proposed, but I still get the same error.

On Tue, Jan 22, 2013 at 03:32:12AM -0000, Sebastian Hoffmann wrote:
> I can't reproduce fix on precise. I've used ppa:juju/0.6 (r608), grub-pc
> is ubuntu3.8, juju-origin: proposed, but I still get the same error.

Can I please have a transcript of the error you're seeing? There are
(at least) two possible failure modes here.

Andreas Hasenack (ahasenack) wrote :

I installed the version from precise-updates on a precise lxc, which didn't have grub-pc before. It worked for me. Here is the output:
"""
(...)
Processing triggers for ureadahead ...
Setting up libfreetype6 (2.4.8-1ubuntu2.1) ...
Setting up gettext-base (0.18.1.1-5ubuntu3) ...
Setting up libfuse2 (2.8.6-2ubuntu2) ...
Setting up grub-common (1.99-21ubuntu3.8) ...
Setting up grub2-common (1.99-21ubuntu3.8) ...
Setting up grub-pc-bin (1.99-21ubuntu3.8) ...
Setting up os-prober (1.51ubuntu3) ...
Setting up grub-pc (1.99-21ubuntu3.8) ...

Creating config file /etc/default/grub with new version
lxc
Setting up grub-gfxpayload-lists (0.6) ...
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place
root@precise:~#
"""

That lone "lxc" comes from grub's postinst calling "running-in-container". I think it could have its output redirected to /dev/null, unless we want to restrict this test to "lxc":

root@precise:~# running-in-container
lxc
root@precise:~#
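
A guard of the kind discussed here could look roughly like this. It is a sketch, not the code from the grub2 packaging: the names in_container, maybe_update_grub and DETECT_CMD are made up for illustration, and it assumes that running-in-container exits 0 inside a container and non-zero otherwise. Output is discarded so the container type is not echoed into the apt output.

```shell
#!/bin/sh
# DETECT_CMD is a hypothetical override so the guard can be exercised
# on machines without running-in-container.
in_container() {
    detect=${DETECT_CMD:-running-in-container}
    # If the detection tool is missing, assume we are not in a container.
    command -v "$detect" >/dev/null 2>&1 || return 1
    # Discard output (e.g. "lxc"); only the exit status matters.
    "$detect" >/dev/null 2>&1
}

maybe_update_grub() {
    if in_container; then
        echo "container detected: skipping update-grub"
    else
        echo "not in a container: update-grub would run here"
    fi
}
```

Testing only the exit status keeps the check independent of which container type is reported.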

Colin Watson (cjwatson) wrote :

Andreas: Thanks. I already noticed the running-in-container output and
fixed it in Debian experimental (which will be merged into raring at
some point); it probably isn't worth revising precise for in the absence
of another pressing need to touch those lines of code, though.

Andreas Hasenack (ahasenack) wrote :

Agreed.

Andreas Hasenack (ahasenack) wrote :

Output when NOT running on LXC (a linode machine in this case):
(...)
Preparing to replace grub-pc 1.99-21ubuntu3.7 (using .../grub-pc_1.99-21ubuntu3.8_i386.deb) ...
Unpacking replacement grub-pc ...
Preparing to replace grub-pc-bin 1.99-21ubuntu3.7 (using .../grub-pc-bin_1.99-21ubuntu3.8_i386.deb) ...
Unpacking replacement grub-pc-bin ...
Preparing to replace grub2-common 1.99-21ubuntu3.7 (using .../grub2-common_1.99-21ubuntu3.8_i386.deb) ...
Unpacking replacement grub2-common ...
Preparing to replace grub-common 1.99-21ubuntu3.7 (using .../grub-common_1.99-21ubuntu3.8_i386.deb) ...
Unpacking replacement grub-common ...
Processing triggers for man-db ...
Processing triggers for install-info ...
Processing triggers for ureadahead ...
Setting up grub-common (1.99-21ubuntu3.8) ...
Setting up grub2-common (1.99-21ubuntu3.8) ...
Setting up grub-pc-bin (1.99-21ubuntu3.8) ...
Setting up grub-pc (1.99-21ubuntu3.8) ...
Generating grub.cfg ...
Found linux image: /boot/vmlinuz-3.2.0-36-virtual
Found initrd image: /boot/initrd.img-3.2.0-36-virtual
Found linux image: /boot/vmlinuz-3.2.0-35-virtual
Found initrd image: /boot/initrd.img-3.2.0-35-virtual
done
root@ls3-precise:~# echo $?
0
root@ls3-precise:~# which running-in-container
/bin/running-in-container
root@ls3-precise:~# running-in-container
root@ls3-precise:~# echo $?
1
root@ls3-precise:~#

Hello Serge, or anyone else affected,

Accepted grub2 into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/grub2/1.99-21ubuntu3.9 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: removed: verification-done
tags: added: verification-needed

I can confirm this fix working on precise for me.

Colin Watson (cjwatson) on 2013-01-23
tags: added: verification-done
removed: verification-needed
Justin van Heerde (jvanheerde) wrote :

Ran into this bug trying out juju, the posted fix for precise (1.99-21ubuntu3.9) works for me.

Bruno Girin (brunogirin) wrote :

The posted fix works for me too: running Juju on Raring with precise charms using juju-origin: proposed.

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 1.99-21ubuntu3.9

---------------
grub2 (1.99-21ubuntu3.9) precise; urgency=low

  * Also skip update-grub when running in a container (LP: #1060404).
 -- Colin Watson <email address hidden> Tue, 22 Jan 2013 14:43:32 +0000

Changed in grub2 (Ubuntu Precise):
status: Fix Committed → Fix Released
Changed in lxc (Ubuntu Precise):
status: Triaged → Invalid
Martin Pitt (pitti) wrote :

For the record, this hasn't fully been fixed in Quantal: Quantal's /etc/kernel/postinst.d/zz-update-grub does NOT have the container check as introduced in precise's grub2 (1.99-21ubuntu3.9) SRU. This breaks upgrades from quantal to saucy in containers:

  https://jenkins.qa.ubuntu.com/job/upgrade-ubuntu-quantal-saucy-desktop-amd64/27/

(I filed bug 1279658 about this and then found this bug when analyzing it). At this point I'm not sure whether it's still worth SRUing this to quantal, or whether we just apply a workaround in our testing.

Martin Pitt (pitti) on 2014-02-13
Changed in auto-upgrade-testing:
status: New → In Progress
assignee: nobody → Martin Pitt (pitti)
Martin Pitt (pitti) wrote :

I applied a workaround for trusty in http://bazaar.launchpad.net/~auto-upgrade-testing-dev/auto-upgrade-testing/trunk/revision/92 . Not pretty, but it'll do.

Changed in auto-upgrade-testing:
status: In Progress → Fix Released
Martin Pitt (pitti) wrote :

Err, I meant "for quantal", not "for trusty".
