Hyper-V: PV Drivers for Ubuntu guests running on Hyper-V lose root device to ata_piix

Bug #929545 reported by Mark Baker on 2012-02-09
This bug affects 2 people
Affects: linux (Ubuntu)
Importance: Medium
Assigned to: Andy Whitcroft

Bug Description

Microsoft PV drivers are unable to access the root device because the native driver takes over the root device before our PV drivers are loaded. The strategies Microsoft currently uses on other distributions rely on modprobe rules, but these will not work on Ubuntu because the native (ata*) driver is built directly into the kernel. Here are the options Microsoft would like to propose for getting around this problem:

 1. Build the relevant Hyper-V storage driver into the kernel so that
    we can fix the initialization ordering and ensure that our PV drivers
    control the root device when Ubuntu is running on a Hyper-V host.
    When not running on a Hyper-V host, our drivers will not
    successfully initialize and so will not have any effect. This would
    simultaneously address a need for fast boot while also addressing
    the performance issues on the root device when hosted on Hyper-V.
 2. Modify the ata driver to recognize that when hosted on Hyper-V, it
    should not control the disks.

Given that our PV block driver currently does not support CD/DVD
devices, I suspect we will end up with some combination of the two
approaches listed. Let me know your preference. If you want to go this
route, we can certainly help with the engineering work.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 929545

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete

Hi Brad - I am logging this bug on behalf of Microsoft. I can't reproduce it myself, as I only run Ubuntu and do not have a Windows Server 2008 machine with Hyper-V. I can ask them to provide a log, but the bug is confirmed anyway.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Brad Figg (brad-figg) wrote :

@mark,

Which kernel version(s) are they looking for the fixes for this issue?

On 09/02/12 17:43, Brad Figg wrote:
> @mark,
>
> Which kernel version(s) are they looking for the fixes for this issue?
>
The upcoming 12.04 kernel, nothing before that.

/Mark

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key precise

Mark - Option 1 isn't viable since the relevant HV storage drivers are still in staging. Option 2, plus adding the boot-essential HV drivers to the initramfs, is probably the best way to go. Does Microsoft have a suggested method for determining whether the ata_piix driver is attempting to load on an HV hypervisor? Even if we build in the HV bus manager, there do not appear to be any relevant interfaces advertised in include/linux/hyperv.h

Thank you for taking the time to file a bug report on this issue.

However, given the number of bugs that the Kernel Team receives during any development cycle it is impossible for us to review them all. Therefore, we occasionally resort to using automated bots to request further testing. This is such a request.

We have noted that there is a newer version of the development kernel than the one you last tested when this issue was found. Please test again with the newer kernel and indicate in the bug if this issue still exists or not.

You can update to the latest development kernel by simply running the following commands in a terminal window:

    sudo apt-get update
    sudo apt-get upgrade

If the bug still exists, change the bug status from Incomplete to Confirmed. If the bug no longer exists, change the bug status from Incomplete to Fix Released.

If you want this bot to quit automatically requesting kernel tests, add a tag named: bot-stop-nagging.

 Thank you for your help, we really do appreciate it.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-request-3.2.0-15.24

Hi Tim,

On 09/02/12 21:58, Tim Gardner wrote:
> Mark - Option 1 isn't viable since the relevant HV storage drivers are
> still in staging. Option 2, plus adding the boot-essential HV
> drivers to the initramfs, is probably the best way to go.

I think Andy is already working on this, see
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/917135

> Does Microsoft
> have a suggested method for determining if the ata_piix driver is
> attempting to load on an HV hypervisor ? Even if we build in the HV bus
> manager, there does not appear to be any relevant interfaces advertised
> in include/linux/hyperv.h

I can get back to them but probably best to update the bug and I'll get
them to respond.

Thanks

Mark


From Microsoft:

Hyper-V detection code has been integrated with the kernel for some time now:

Look at arch/x86/kernel/cpu/mshyperv.c. You can detect the presence of Hyper-V by examining the exported variable

/Mark
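As an illustration of the general idea only (this is not the kernel's code, and it does not name the variable referred to above): Hyper-V guests can be recognised by the 12-byte hypervisor vendor signature "Microsoft Hv" returned in EBX, ECX and EDX by CPUID leaf 0x40000000. The sketch below assumes the three register values have already been obtained by some other means; the function names are hypothetical.

```python
import struct

# Vendor signature reported by Hyper-V in CPUID leaf 0x40000000.
HYPERV_SIGNATURE = b"Microsoft Hv"


def hypervisor_signature(ebx: int, ecx: int, edx: int) -> bytes:
    """Pack three 32-bit register values (little-endian) into 12 bytes."""
    return struct.pack("<III", ebx, ecx, edx)


def is_hyperv(ebx: int, ecx: int, edx: int) -> bool:
    """Return True when the packed registers spell the Hyper-V signature."""
    return hypervisor_signature(ebx, ecx, edx) == HYPERV_SIGNATURE
```

Python cannot execute CPUID directly, so this only demonstrates the comparison step; a driver would perform the equivalent check inside the kernel.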

Andy Whitcroft (apw) on 2012-02-10
summary: - PV Drivers for Ubuntu guests running on Hyper-V unable to control root
- device
+ Hyper-V: PV Drivers for Ubuntu guests running on Hyper-V lose root
+ device to ata_piix
Changed in linux (Ubuntu):
assignee: nobody → Andy Whitcroft (apw)
Andy Whitcroft (apw) on 2012-02-10
tags: added: bot-quit-nagging
removed: kernel-request-3.2.0-15.24 precise
Andy Whitcroft (apw) wrote :

Ok. I have attempted to add Hyper-V detection to the ata_piix driver. By default this defers handling of the devices to the Hyper-V drivers when the hypervisor is detected. I have also added an override so that this can be suppressed from the kernel command line. Could someone with a Hyper-V based cloud test the kernels below and report whether they work? Please include a dmesg from a successful boot so I can confirm the detection has recorded itself correctly. It would also be helpful to get a boot with ata_piix.prefer_ms_hyperv=0; again, a dmesg would be useful. Kernels are at the URL below:

    http://people.canonical.com/~apw/lp929545-precise/

Kernels should be synced shortly. Thanks.
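When reporting test results, it may help to record whether the override was actually present on the booted command line. The following is a minimal sketch that looks for the ata_piix.prefer_ms_hyperv parameter in a command-line string such as the contents of /proc/cmdline. The function name is hypothetical, and keeping the last occurrence when the parameter is repeated is an assumption.

```python
from typing import Optional

PARAM = "ata_piix.prefer_ms_hyperv"


def prefer_ms_hyperv_override(cmdline: str) -> Optional[str]:
    """Return the value given for the override, or None if it is absent."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == PARAM and sep:
            # Assumption: if the parameter is repeated, keep the last value.
            value = val
    return value
```

On a running guest this could be called as prefer_ms_hyperv_override(open("/proc/cmdline").read()). A result of None means the override was not given, so the driver's default behaviour applies.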

Mike Sterling (mike-sterling) wrote :

I've tested this, but it doesn't appear to work as expected. I installed today's precise-server ISO and then installed the new linux-image via dpkg -i. Checking /sys/block/sda/device/driver/0:0:0:0/vendor reported the driver as ATA instead of Msft as we would expect.
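The vendor check described in this comment can be scripted. The sketch below classifies the vendor string using only the two values mentioned here (ATA and Msft); the function names and the returned labels are hypothetical, and the path to the sysfs vendor file is passed in by the caller rather than assumed.

```python
from pathlib import Path


def classify_vendor(raw: str) -> str:
    """Map a sysfs vendor string to the driver family it suggests."""
    vendor = raw.strip()  # sysfs pads the field and appends a newline
    if vendor == "Msft":
        return "hyperv-pv"   # disk presented by the Hyper-V PV driver
    if vendor == "ATA":
        return "native-ata"  # disk claimed by the native ATA driver
    return "unknown"


def controlling_driver(vendor_file: str) -> str:
    """Read a sysfs vendor file and classify its contents."""
    return classify_vendor(Path(vendor_file).read_text())
```

A result of "native-ata" for the root disk would correspond to the failure reported in this comment.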

Andy Whitcroft (apw) wrote :

Hmm, looks like the patch isn't quite right. I'll spin a replacement test kernel shortly.

Andy Whitcroft (apw) wrote :

Ok, I've updated the test kernels, hopefully fixing the issue. Could we re-test as before? Thanks.

Mike Sterling (mike-sterling) wrote :

That didn't seem to work either - it was unable to mount the root filesystem. If there are steps that I can do within the shell to further diagnose, let me know.

Andy Whitcroft (apw) wrote :

@Mike -- I suspect that actually means it worked. But of course, because of the other bug, you don't have the drivers in your initramfs. I'll have to get an updated initramfs package for you as well.

Andy Whitcroft (apw) wrote :

Ok, I've put together an updated initramfs-tools which should include the correct hv_* modules, which will hopefully allow the right drivers to be loaded. Could you install the initramfs-tools from the URL below? You will also need to ensure the initramfs is rebuilt, and then retest as before:

    http://people.canonical.com/~apw/lp917135-precise/

Thanks.

Hi,

The intention here is to boot every disk volume using the virtual Hyper-V SCSI adapter? That would be different than the procedure for setting up VM guest disks (on Windows systems only?) as described in Microsoft's Hyper-V docs, but maybe that is the goal.

from http://technet.microsoft.com/en-us/library/dd183729%28WS.10%29.aspx

>You can select either integrated device electronics (IDE) or SCSI devices on virtual machines:
> IDE devices. Hyper-V uses emulated devices with IDE controllers. You can have up to two IDE controllers with two disks on each controller. The startup disk (sometimes referred to as the boot disk) must be attached to one of the IDE devices. The startup disk can be either a virtual hard disk or a physical disk. Although a virtual machine must use an IDE device as the startup disk to start the guest operating system, you have many options to choose from when selecting the physical device that will provide the storage for the IDE device. For example, you can use any of the types of physical storage identified in the introduction section.
> SCSI devices. Each virtual machine supports up to 256 SCSI disks (four SCSI controllers with each controller supporting up to 64 disks). SCSI controllers use a type of device developed specifically for use with virtual machines and use the virtual machine bus to communicate. The virtual machine bus must be available when the guest operating system is started. Therefore, virtual hard disks attached to SCSI controllers cannot be used as startup disks.
...
>Note
>Although the I/O performance of physical SCSI and IDE devices can differ significantly, this is not true for the virtualized SCSI and IDE devices in Hyper-V. Hyper-V IDE and SCSI devices both offer equally fast I/O performance when integration services are installed in the guest operating system.

I can try this early 12.04 code out too on a Hyper-V host. I am using Ubuntu 10.04 as a guest OS quite a bit on Hyper-V hosts.

Regards,
Tim Miller Dyck

Andy Whitcroft (apw) wrote :

@Tim -- I can't claim to have ever booted in this environment. The request to switch the boot disks over to the paravirtualised drivers was made as the performance over IDE is poor. It has been suggested by those in the know so I am assuming that it should work, though as I say I do not have access to anything to test myself.

Mike Sterling (mike-sterling) wrote :

This doesn't change what the VM boots from - we still require that the system boot from an IDE disk. What this does enable is our hv_storvsc driver to control the root device, which uses hv_vmbus to communicate with dom0. hv_storvsc and hv_blkvsc were merged upstream a while back, and hv_storvsc is a generic block device that handles both IDE and SCSI traffic.

Andy Whitcroft (apw) wrote :

@Mike -- I think we both mean the same thing; I am just looking at things from the inside. Did the new kernel and initramfs-tools combination work for you?

Mike Sterling (mike-sterling) wrote :

Andy, what's the correct process to try both the kernel-image and the updated initramfs-tools? I installed the kernel-image, the initramfs-tools, and the initramfs-tools-bin package at once using dpkg -i, but I'm still getting the same hang as before.

Andy Whitcroft (apw) wrote :

@Mike -- I think I would expect that to work in that combination. Could we:

1) install all three and then confirm that the initramfs image at least contains the required drivers, using something like the below (remember to sub in the right kernel version):

    zcat /boot/initrd.img-KERNELVERSION | cpio -it | grep hv_

2) can we get a dmesg from the failing boot if at all possible

3) can we try modprobing the required modules from the initramfs prompt (I think this is just hv_storvsc) and see if the drives are then detected (in dmesg) if so you could then try exiting from the shell which should retry the mount.
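Step 1 can be turned into a small check on the listing printed by the zcat and cpio pipeline. This is a sketch under stated assumptions: the required set defaults to hv_vmbus and hv_storvsc (the thread only says hv_storvsc is probably the one needed, and that it communicates over hv_vmbus), and the function names are hypothetical.

```python
import posixpath


def module_names(listing: str) -> set:
    """Extract module names (without .ko) from a cpio -it style listing."""
    names = set()
    for line in listing.splitlines():
        base = posixpath.basename(line.strip())
        if base.endswith(".ko"):
            names.add(base[:-3])
    return names


def missing_modules(listing: str, required=("hv_vmbus", "hv_storvsc")) -> list:
    """Return the required modules that do not appear in the listing."""
    present = module_names(listing)
    return [name for name in required if name not in present]
```

An empty result means every required module is in the image; any name returned would need to be added before the initramfs can bring up the root device through the PV driver.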

Andy Whitcroft (apw) wrote :

@Mike -- if you are able to get onto freenode irc, then you might also like to come find me there, we might be able to diagnose this quicker there.

Mike Sterling (mike-sterling) wrote :

Ah-ha.

mike@ubuntu:~$ sudo zcat /boot/initrd.img-3.2.0-16-generic | cpio -it | grep hv_

lib/modules/3.2.0-16-generic/kernel/drivers/hv/hv_vmbus.ko
lib/modules/3.2.0-16-generic/kernel/drivers/hv/hv_utils.ko
lib/modules/3.2.0-16-generic/kernel/drivers/staging/hv/hv_mouse.ko
lib/modules/3.2.0-16-generic/kernel/drivers/staging/hv/hv_netvsc.ko
79943 blocks
mike@ubuntu:~$

We're missing hv_storvsc in the initrd.

Mike Sterling (mike-sterling) wrote :

Just an update as to where we are on this bug:

With the updated packages from apw, I've confirmed that hv_storvsc is present in the initrd from -pre6 combined with the early 3.2.0-16. However, after a reboot, the system barfs a bunch of errors about rejecting I/O due to an offline device. Output from boot is available here:

http://paste.ubuntu.com/841283/

It's unclear if this is an issue with the hv_storvsc driver or something else, however.

Andy Whitcroft (apw) wrote :

So at least some of the disk is readable as we were able to see and report the partition table:

[ 6.665440] scsi 0:0:0:0: Direct-Access Msft Virtual Disk 1.0 PQ: 0 ANSI: 4
[ 6.672178] sd 0:0:0:0: [sda] 167772160 512-byte logical blocks: (85.8 GB/80.0 GiB)
[ 6.678902] sd 0:0:0:0: Attached scsi generic sg0 type 0
[ 6.682081] sd 0:0:0:0: [sda] Write Protect is off
[ 6.686096] input: Microsoft Vmbus HID-compliant Mouse as /devices/virtual/input/input2
[ 6.691830] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 6.702828] sda: sda1 sda2 < sda5 >
[ 6.706727] scsi1 : storvsc_host_t
[ 6.710982] sd 0:0:0:0: [sda] Attached SCSI disk

Though the issues seem to start where we start a scsi scan.
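The SCSI inquiry line in the excerpt above already shows which driver presented the disk (vendor Msft). A rough sketch for pulling that vendor out of a saved dmesg follows. The regular expression is an assumption based only on the single inquiry line quoted here, so it will misparse vendors that contain spaces; the function name is hypothetical.

```python
import re

# Matches lines such as:
#   scsi 0:0:0:0: Direct-Access Msft Virtual Disk 1.0 PQ: 0 ANSI: 4
INQUIRY = re.compile(r"scsi (\d+:\d+:\d+:\d+): (\S+)\s+(\S+)\s+(.*?)\s+PQ:")


def scsi_vendors(dmesg: str) -> dict:
    """Map each SCSI address found in dmesg text to its reported vendor."""
    vendors = {}
    for line in dmesg.splitlines():
        match = INQUIRY.search(line)
        if match:
            vendors[match.group(1)] = match.group(3)
    return vendors
```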

Mike Sterling (mike-sterling) wrote :

After some internal investigation, this is a known issue in the build of hv_storvsc that is present in the Ubuntu kernel sources, including 3.2.0-16. The latest version of the driver, available in linux-next after Feb 14 (at its new location in /drivers/scsi/storvsc_drv.c) has that resolved, as well as other cleanup issues from the community review.

In fact, I would strongly suggest that the latest hv* drivers from linux-next be used if at all possible. The snapshot being used as the base for the Precise kernel only has the hv_vmbus driver out of staging. With the release of the 3.3 kernel, the hv_netvsc and hv_mouse drivers were approved for exit, and we've already received word that the hv_storvsc driver will exit in 3.4. The quality of code after going through the community review is leaps and bounds better than what is present in 3.2, and should be ideal for a -LTS release.

To accomplish this, you can either pull from linux-next and replace the files in /drivers/staging/hv (which would require no changes to the build system), or pull the appropriate files out of the various folders (as well as the Kconfigs) and place them in the locations as of linux-next:

drivers/hv, drivers/net/hyperv/, drivers/hid/hid-hyperv.c, drivers/scsi/storvsc_drv.c

Please let me know how you'd like to proceed on this.

-M

Andy Whitcroft (apw) wrote :

We have pulled together a backport of what is currently sitting in linux-next, roughly what will hit 3.4, and combined it with the fixes to the ata_piix driver identified above. @Mike, could you test that for us in combination with the initramfs-tools bits you already have? The kernel images are the newest ones in the same place as before (see above). Thanks.

Mike Sterling (mike-sterling) wrote :

I've confirmed that the combination of linux-image (from above) and the initramfs-tools / initramfs-tools-bin from http://people.canonical.com/~apw/lp917135-precise/ results in a system booting and using hv_storvsc to handle the root device instead of ata_piix.

-M

Nick Barcet (nijaba) on 2012-02-16
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Andy Whitcroft (apw) on 2012-02-16
Changed in linux (Ubuntu):
status: Confirmed → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.2.0-17.26

---------------
linux (3.2.0-17.26) precise; urgency=low

  [ Andy Whitcroft ]

  * [Config] clean up the human consumable package descriptions
  * [Config] fix generic flavour description
  * [Config] clean up linux-tools package descriptions
    - LP: #593107
  * deviations -- note the source of the Hyper-V updates
  * SAUCE: ata_piix: defer to the Hyper-V drivers by default
    - LP: #929545

  [ Eugeni Dodonov ]

  * SAUCE: drm/i915: do not enable RC6p on Sandy Bridge

  [ Kees Cook ]

  * SAUCE: (drop after 3.3) security: create task_free security callback
  * SAUCE: (drop after 3.3) security: Yama LSM
  * SAUCE: (drop after 3.3) Yama: add PR_SET_PTRACER_ANY
  * SAUCE: Yama: add link restrictions
  * SAUCE: security: unconditionally chain to Yama LSM

  [ Leann Ogasawara ]

  * Drop ndiswrapper

  [ Robert Hooker ]

  * SAUCE: drm/i915: Enable RC6 by default on sandybridge.

  [ Tim Gardner ]

  * SAUCE: ipheth: Add iPhone 4S
    - LP: #900802
  * dropped hv_mouse
  * [Config] CONFIG_X86_NUMACHIP=y

  [ Upstream Kernel Changes ]

  * Staging: hv: vmbus: Support building the vmbus driver as part of the
    kernel
  * hv: Add Kconfig menu entry
  * Drivers: hv: Fix a memory leak
  * Drivers: hv: Make the vmbus driver unloadable
  * Drivers: hv: Get rid of an unnecessary check in hv.c
  * Staging: hv: mousevsc: Make boolean states boolean
  * Staging: hv: mousevsc: Inline the code for mousevsc_on_device_add()
  * Staging: hv: mousevsc: Inline the code for reportdesc_callback()
  * Staging: hv: mousevsc: Cleanup mousevsc_on_channel_callback()
  * Staging: hv: mousevsc: Add a new line to a debug string
  * Staging: hv: mousevsc: Get rid of unnecessary include files
  * Staging: hv: mousevsc: Address some style issues
  * Staging: hv: mousevsc: Add a check to prevent memory corruption
  * Staging: hv: mousevsc: Use the KBUILD_MODNAME macro
  * Staging: hv: storvsc: Use mempools to allocate struct
    storvsc_cmd_request
  * Staging: hv: storvsc: Cleanup error handling in the probe function
  * Staging: hv: storvsc: Fixup the error when processing SET_WINDOW
    command
  * Staging: hv: storvsc: Fix error handling storvsc_host_reset()
  * Staging: hv: storvsc: Use the accessor function shost_priv()
  * Staging: hv: storvsc: Use the unlocked version queuecommand
  * Staging: hv: storvsc: use the macro KBUILD_MODNAME
  * Staging: hv: storvsc: Get rid of an unnecessary forward declaration
  * Staging: hv: storvsc: Upgrade the vmstor protocol version
  * Staging: hv: storvsc: Support hot add of scsi disks
  * Staging: hv: storvsc: Support hot-removing of scsi devices
  * staging: hv: Use kmemdup rather than duplicating its implementation
  * staging: hv: move hv_netvsc out of staging area
  * Staging: hv: mousevsc: Properly add the hid device
  * Staging: hv: storvsc: Disable clustering
  * Staging: hv: storvsc: Cleanup storvsc_device_alloc()
  * Staging: hv: storvsc: Fix a bug in storvsc_command_completion()
  * Staging: hv: storvsc: Fix a bug in copy_from_bounce_buffer()
  * Staging: hv: storvsc: Implement per device memory pools
  * Staging: hv: remove hv_mouse driver as it's now in the hid ...


Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Andy Whitcroft (apw) wrote :

Ok, this fix was incomplete, as CD-ROM/DVD devices are not supported via the Hyper-V paravirt drivers. An updated fix is now applied.

Changed in linux (Ubuntu):
status: Fix Released → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.2.0-18.29

---------------
linux (3.2.0-18.29) precise; urgency=low

  [ Andy Whitcroft ]

  * [Config] restore build-% shortcut
  * SAUCE: ata_piix: defer disks to the Hyper-V drivers by default
    - LP: #929545, #942316

  [ Eugeni Dodonov ]

  * SAUCE: drm: give up on edid retries when i2c bus is not responding
    - LP: #855124

  [ Seth Forshee ]

  * SAUCE: (drop after 3.3) platform/x86: Add driver for Apple gmux device
    - LP: #925544

  [ Upstream Kernel Changes ]

  * bsg: fix sysfs link remove warning
    - LP: #946928
  * regset: Prevent null pointer reference on readonly regsets
    - LP: #949905
    - CVE-2012-1097
  * regset: Return -EFAULT, not -EIO, on host-side memory fault
    - LP: #949905
    - CVE-2012-1097

  [ Wu Fengguang ]

  * SAUCE: (drop after 3.4) ALSA: hda - add id for Atom Cedar Trail HDMI
    codec
 -- Leann Ogasawara <email address hidden> Fri, 09 Mar 2012 07:56:11 -0800

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel for Precise in -proposed solves the problem (3.2.0-35.55). Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-precise' to 'verification-done-precise'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-precise
Luis Henriques (henrix) wrote :

After an IRC chat with Ben Howard, I'm tagging this as verified in Precise.

tags: added: verification-done-precise
removed: verification-needed-precise

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
