Patch to Natty 2.6.37-virtual breaks non-EC2 users

Bug #684875 reported by Alex Bligh
This bug affects 1 person
Affects              Status        Importance  Assigned to   Milestone
cloud-init (Ubuntu)  Invalid       Medium      Unassigned
  Natty              Invalid       Medium      Unassigned
linux (Ubuntu)       Fix Released  High        Stefan Bader
  Natty              Fix Released  High        Stefan Bader

Bug Description

Binary package hint: linux-image-2.6.37-7-virtual

This patch:
 http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-natty.git;a=commitdiff;h=38096c28f13d0c2dd08584ff834da6d81306c7b3
breaks the kernel for users running on machines other than EC2.

There are two problems. Firstly, the patch does something totally unexpected: it renames a device that anyone using the mainstream kernel will see as /dev/xvdX. This breaks existing boot scripts (including all of ours at Flexiant/FlexiScale) that presume Xen-based kernels (which we patch ourselves) will use /dev/xvdX.

However, the problem particularly manifests itself on Xen versions prior to 3.4, which do not support the PCI unplug functionality. These still have the emulated devices plugged in, so /dev/sdX already exists. You then get a kernel bug (see attached) and can't use the PV drivers at all, because the driver tries to allocate /sys/class/sda when that already exists (the emulated driver is using it). Without this patch, everything would work fine.

dmesg attached.

Please either revert the patch, or make it dependent on a command line option, or make it dependent on the relevant device being unplugged. My view is this should be reverted: it's supporting one particular xen user who is doing something non-standard, and breaking things for everyone else.

Revision history for this message
Alex Bligh (ubuntu-alex-org) wrote :
Revision history for this message
Scott Moser (smoser) wrote :

I think we can safely drop this now.
In EC2 for Maverick and newer, we're booting with pv-grub and specifying root=LABEL=uec-rootfs. In the future, we may use root=UUID=... instead.
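
To illustrate the difference between the root= forms mentioned here, the following is a minimal sketch that classifies the root= argument of a kernel command line as a label, a UUID, or a plain device name. The function name root_spec is hypothetical, and the sketch assumes that the last root= argument on the line is the one that counts.

```python
def root_spec(cmdline):
    """Return (kind, value) for the root= argument, or None if there is none.

    kind is "label", "uuid" or "device". Assumption: when root= appears
    more than once, the last occurrence is the one that matters.
    """
    spec = None
    for arg in cmdline.split():
        if arg.startswith("root="):
            spec = arg[len("root="):]
    if spec is None:
        return None
    # LABEL= and UUID= refer to the filesystem, so they survive device renames.
    for tag in ("LABEL", "UUID"):
        if spec.startswith(tag + "="):
            return (tag.lower(), spec[len(tag) + 1:])
    # Anything else is treated as a kernel device name such as sda1.
    return ("device", spec)


if __name__ == "__main__":
    # Inspect the command line of the running kernel (Linux only).
    with open("/proc/cmdline") as handle:
        print(root_spec(handle.read()))
```

A "device" result is the case that depends on the kernel's naming scheme; "label" and "uuid" results do not.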

Changed in linux (Ubuntu):
importance: Undecided → High
milestone: none → natty-alpha-2
status: New → Confirmed
Revision history for this message
Scott Moser (smoser) wrote :

This change to the kernel may require a change in cloud-init to deal with the metadata block device mapping. We already do something like this for Eucalyptus, where we notice that the metadata service says 'sdX' but devices are named 'vdX' (using virtio).
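
A minimal sketch of the kind of remapping described above, not cloud-init's actual code: given the name reported by the metadata service, try the same suffix under the other common prefixes and return the first device that exists. The helper names and the list of prefixes are assumptions made for illustration.

```python
import os

# Prefixes a disk may appear under, depending on the driver in use.
# This candidate list is an assumption for illustration.
PREFIXES = ("sd", "xvd", "vd")


def candidate_names(metadata_name):
    """Return the possible kernel names for a metadata device name.

    The reported name comes first, followed by the same suffix under the
    other prefixes. A leading "/dev/" is ignored, and names with an
    unrecognised prefix are returned unchanged.
    """
    short = metadata_name
    if short.startswith("/dev/"):
        short = short[len("/dev/"):]
    # Check longer prefixes first so "xvda" is not mistaken for a "vd" name.
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if short.startswith(prefix):
            suffix = short[len(prefix):]
            others = [p + suffix for p in PREFIXES if p != prefix]
            return [short] + others
    return [short]


def resolve_device(metadata_name, exists=os.path.exists):
    """Return the first candidate present under /dev, or None."""
    for name in candidate_names(metadata_name):
        path = "/dev/" + name
        if exists(path):
            return path
    return None
```

The exists parameter is there only so the lookup can be exercised without real device nodes; by default it checks the filesystem.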

Changed in cloud-init (Ubuntu):
importance: Undecided → Medium
status: New → Triaged
Revision history for this message
Scott Moser (smoser) wrote :

This hacked patch is in place because (for some reason unknown to me) EC2 specifies 'root=sda1' on the kernel command line when pv-grub is not used as the loader.

Revision history for this message
Alex Bligh (ubuntu-alex-org) wrote :

> EC2 specifies 'root=sda1' on the kernel command line.

EC2 should fix that then, as it's plain wrong.

Revision history for this message
Alex Bligh (ubuntu-alex-org) wrote :

Though a compromise solution would be to register as sda only if the unplug of the original sda device succeeded / is going to be tried. Otherwise it's just going to cause a kernel bug.

I think xen_unplug_emulated_devices() is called sufficiently early you could choose the name when the driver is init'ed, so something like the attached patch (completely untested, may not even compile).

But even so, if you rename the xen block device, you will be running /dev/sdX with a non-standard block major and block minor number (you are not changing the block major / minor numbers). I can't help but think that's a recipe for disaster.

Scott Moser (smoser)
tags: added: ec2-images
tags: added: kernel-series-unknown
Revision history for this message
Alex Bligh (ubuntu-alex-org) wrote :

My understanding is that the patch currently applies to all kernel variants, so has the potential to cause problems for:
* Anyone running Xen versions pre 3.4
* Anyone running any version of Xen hoping for stable device naming between Ubuntu kernels and any others (e.g. mainline, Debian, the kernels provided by Xen/Citrix, other PV-driver-enabled kernels they run, etc.)

Revision history for this message
Alex Bligh (ubuntu-alex-org) wrote :

xen-devel thread is here:
  http://www.gossamer-threads.com/lists/xen/devel/192003

I've been asked to point out there are really two problems:

1. If the emulated devices (i.e. the "real" sda) are not unplugged, there is a device name clash. The emulated devices cannot be unplugged on Xen 3.3 (because it doesn't support unplugging), but unless you pass xen_emul_unplug=unnecessary, the kernel won't actually allow the PV drivers, so you lose PV support. On 3.4 and onwards you might pass xen_emul_unplug=unnecessary anyway to get device mapping consistent with the Xen-supplied 2.6.18 kernel, in which case you will get the device name clash.

2. Even if the unplugging works, you then get inconsistent device mapping, because mainline, 2.6.18 and everything else all expect to see virtual devices under /dev/xvda, not /dev/sda. So although you won't get the clash (i.e. the failure to register the device), the device will have an unexpected name, which can and will break stuff.

Revision history for this message
Stefan Bader (smb) wrote :

Just want to make sure that it is ok (for the possible dependency in cloud-init) to go forward and ask for the kernel patch to be reverted in natty before actually doing so.

Changed in linux (Ubuntu Natty):
assignee: nobody → Stefan Bader (stefan-bader-canonical)
Revision history for this message
Scott Moser (smoser) wrote :

Stefan, assuming the kernel boots, I'll deal with cloud-init fallout.
I think jjohansen had some issues with booting though.

Revision history for this message
John Johansen (jjohansen) wrote :

So I have experimented with this a bit and so far I haven't gotten an instance to boot without the patch. It should work, so I just need to tinker with it more.

Revision history for this message
Stefan Bader (smb) wrote :

I did a small test this morning with ami-e4c9388d:

IMAGE ami-e4c9388d 099720109477/ebs/ubuntu-images-testing/ubuntu-natty-daily-amd64-server-20110104 099720109477 available public x86_64 machine aki-427d952b ebs
BLOCKDEVICEMAPPING /dev/sda1 snap-9fb802f2 8

Originally running 2.6.37-11 and showing sda and sdb in /proc/partitions. I did a build from the current master-next in Natty:

# uname -a
Linux ip-10-117-77-50 2.6.37-12-virtual #26~lp684875v1 SMP Wed Jan 5 09:14:12 UTC 2011 x86_64 GNU/Linux

# cat /proc/partitions
major minor #blocks name

 202 1 8388608 xvda1
 202 16 440366080 xvdb

This works with or without the xen_emul_unplug=unnecessary switch (expected, as m1.large is not HVM). So we seem to be OK with just reverting the patch, though we should check whether we now need the argument for our cluster instances.
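
The check done by eye above can be sketched in code. This is a minimal example that parses /proc/partitions-style text and lists any devices still using sd names; the function names are hypothetical, and the sample text is the output quoted above.

```python
SAMPLE = """\
major minor #blocks name

 202 1 8388608 xvda1
 202 16 440366080 xvdb
"""


def parse_partitions(text):
    """Parse /proc/partitions content into (major, minor, blocks, name) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and blank lines: data rows have four fields,
        # the first of which is numeric.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        major, minor, blocks, name = fields
        entries.append((int(major), int(minor), int(blocks), name))
    return entries


def sd_named(entries):
    """Return the names of entries that still use the sd naming scheme."""
    return [name for _major, _minor, _blocks, name in entries
            if name.startswith("sd")]


if __name__ == "__main__":
    entries = parse_partitions(SAMPLE)
    for major, minor, blocks, name in entries:
        print(name, major, minor, blocks)
    print("sd-named devices:", sd_named(entries))
```

On a live system the same functions can be fed the contents of /proc/partitions instead of SAMPLE.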

Revision history for this message
Scott Moser (smoser) wrote : Re: [Bug 684875] Re: Patch to Natty 2.6.37-virtual breaks non-EC2 users

On Wed, 5 Jan 2011, Stefan Bader wrote:

> This works with or without the xen_emul_unplug=unnecessary switch
> (expected as m1.large is not HVM). So we seem to be ok with just
> reverting the patch. Though we should check whether we now need the
> argument for our cluster instances.

If you've got a deb I can test easily by launching a cluster instance.
If you don't have one easily at hand, just go ahead and get the commit in
and we can deal with fallout on cluster (I'd like to remove that argument
on cluster anyway).

Scott

Revision history for this message
Stefan Bader (smb) wrote :

I went ahead and asked for it to be included in the next Natty kernel (I assumed that we would rather have it reverted and go on from there). This should give us at least something to look at during the sprint.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.37-12.26

---------------
linux (2.6.37-12.26) natty; urgency=low

  [ Andy Whitcroft ]

  * rebase to v2.6.37-rc8
  * [Config] armel -- reenable omap flavour
  * [Config] disable CONFIG_MACH_OMAP3517EVM to fix FTBS on armel omap
  * [Config] disable CONFIG_GPIO_VX855 to fix FTBS on omap armel
  * [Config] disable CONFIG_WESTBRIDGE_ASTORIA to fix FTBS on omap armel
  * [Config] disable CONFIG_TI_DAVINCI_EMAC to fix FTBS on omap armel
  * rebase to mainline 989d873fc5b6a96695b97738dea8d9f02a60f8ab
  * [Config] track missing modules
  * rebase to v2.6.37 final

  [ Chase Douglas ]

  * SAUCE: (drop after 2.6.37) HID: magicmouse: Don't report REL_{X, Y} for
    Magic Trackpad

  [ Stefan Bader ]

  * Revert "SAUCE: blkfront: default to sd devices"
    - LP: #684875

  [ Tim Gardner ]

  * Revert "SAUCE: (no-up) libata: Ignore HPA by default."
    - LP: #380138
  * [Config] Added autofs4.ko to -virtual flavour
    - LP: #692917

  [ Upstream Kernel Changes ]

  * Add support for Intellimouse Mode in ALPS touchpad on Dell E2 series
    Laptops
    - LP: #632884

  [ Upstream Kernel Changes ]

  * rebase to v2.6.37-rc8
  * rebase to mainline 989d873fc5b6a96695b97738dea8d9f02a60f8ab
  * rebase to v2.6.37 final
 -- Andy Whitcroft <email address hidden> Thu, 23 Dec 2010 18:34:13 +0000

Changed in linux (Ubuntu Natty):
status: Confirmed → Fix Released
Revision history for this message
Alex Bligh (ubuntu-alex-org) wrote :

I have tested this on Xen 3.3.1 in HVM mode and now correctly get /dev/xvda etc.

Revision history for this message
Stefan Bader (smb) wrote :

Some notes on this which came up while I was looking at some other issues. Currently the Natty images seem to contain two entries in /etc/fstab that point to /dev/sd* devices. At first the entry for swap was scaring me a bit, as it did not have a nobootwait and my reboots were not coming back. But it seems that was another issue with some old kernel version which caused the reboot command issued from within the instance to fail. I just tested again with a recent kernel and that seems to be OK (naturally without getting any swap mounted).

So for getting the mounts right, there may be some additional scripting needed, but at least it does not seem to cause any fatal boot failure when the names change.

Revision history for this message
Alex Bligh (ubuntu-alex-org) wrote :

Further notes:

1. Non-Ubuntu-specific: to get HVM devices to work on Xen pre 3.4.something, you need to use xen_emul_unplug=unnecessary or perhaps xen_emul_unplug=unnecessary,all on the command line. Otherwise Xen's non-support of PCI unplug means that failure to unplug the emulated devices stops the HVM devices initialising.

2. It is desirable that if both devices come up, /dev/disk/by-uuid/... maps to /dev/xvdX, not /dev/sdX, if mount by UUID is to work. I haven't yet checked this. It requires module init order to be right. I'm not sure you can work around this by blacklisting sd_mod, as I think sd is built in. I have some patches somewhere to allow early init of old-style Xen block devices, which I might be able to dig out.

3. Historical experience tells us that having two NICs (emulated and not) with the same MAC address confuses udev fatally. In general, in a virtual environment you don't want udev renaming NICs anyway, so startup scripts should remove this.

Revision history for this message
Scott Moser (smoser) wrote :

cloud-init writes /etc/fstab entries with LABEL= for the root device.
For ephemeral devices other than swap, it adds 'nobootwait', so these devices would not hang up boot if they changed names.
For swap devices (I just verified), mountall does not wait.
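
The rules above can be sketched as a small check over /etc/fstab. This is an illustration only: the function name is hypothetical, and the decision about which entries would hold up boot simply follows the behaviour described in this comment (nobootwait entries and swap entries are not waited for).

```python
def hardcoded_device_entries(fstab_text):
    """Find fstab entries that name a /dev/sd* or /dev/xvd* device directly.

    Returns (spec, mount_point, may_block_boot) tuples. Following the
    description above, an entry is assumed not to block boot if it has the
    'nobootwait' option or if it is a swap entry.
    """
    found = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # Not a complete entry; ignore it in this sketch.
        spec, mount_point, fstype, options = fields[:4]
        if not spec.startswith(("/dev/sd", "/dev/xvd")):
            continue  # LABEL=, UUID= and other specs survive renames.
        opts = options.split(",")
        may_block = fstype != "swap" and "nobootwait" not in opts
        found.append((spec, mount_point, may_block))
    return found


if __name__ == "__main__":
    with open("/etc/fstab") as handle:
        for spec, mount_point, may_block in hardcoded_device_entries(handle.read()):
            print(spec, mount_point, "may block boot" if may_block else "ok")
```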

Changed in cloud-init (Ubuntu Natty):
status: Triaged → Invalid
Revision history for this message
Colin Watson (cjwatson) wrote :

I don't suppose there's any chance of fixing this for Maverick as well? I believe that this is the cause of the verification failure in bug 720558, and I think that any attempt to work around this in GRUB would probably do more harm than good.

Comment #2 on this bug suggests that cloud-init should no longer require the /dev/sd* names in Maverick, but I don't know what effect renaming them in a stable release would have on people. However, I'm not sure how people would be using Maverick in Xen anyway, as bug 720558 renders installation using d-i impossible, and it sounds like this bug is pretty fatal for many other use cases.

Revision history for this message
Colin Watson (cjwatson) wrote :

Scott Moser has explained to me why this is probably too invasive to fix in Maverick (hardcoded device names in /etc/fstab). I think I have an idea for how to work around this in Maverick's grub package by observing that an earlier comment in this bug indicates that the xen-blkfront /dev/sd* devices have a non-standard major number; I should be able to use that to detect a Xen block device.
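
The detection idea can be sketched as follows. This is not the GRUB change itself, only an illustration in Python with hypothetical function names: a block device whose major number is 202 (the major shown for the xvd devices in the /proc/partitions output earlier in this bug) is treated as a Xen block device, whatever its name.

```python
import os
import stat

# Major number of the xvd devices in the /proc/partitions output quoted
# earlier in this bug; SCSI disks named sdX normally use major 8.
XEN_VBD_MAJOR = 202


def is_xen_rdev(mode, rdev):
    """Return True if (mode, rdev) describe a block device with the Xen major."""
    return stat.S_ISBLK(mode) and os.major(rdev) == XEN_VBD_MAJOR


def is_xen_block_device(path):
    """Return True if path is a block device node with the Xen major number."""
    st = os.stat(path)
    return is_xen_rdev(st.st_mode, st.st_rdev)


def is_renamed_xen_disk(path):
    """Return True for a /dev/sd* node that is really a Xen block device."""
    name = os.path.basename(path)
    return name.startswith("sd") and is_xen_block_device(path)
```

Splitting out is_xen_rdev keeps the decision testable without real device nodes; is_renamed_xen_disk("/dev/sda") would then flag the renamed case discussed in this bug.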

Revision history for this message
Alex Bligh (ubuntu-alex-org) wrote :

We have Maverick running in Xen quite extensively. We use debootstrap images with normal grub (not pvgrub), i.e. we are passing a full HD image to Xen (and I know we aren't the only ones to do this). We do however modify /etc/fstab etc., and aren't using -virtual (I think we use -server) precisely because of this sort of problem.

It would be nice to have it in Maverick, but (data point with sample size 1) for us the most important releases to work are the latest LTS (Lucid) and the latest non-LTS (Natty).
