MAAS fails to boot Hyper-V Generation 2 virtual machines

Bug #1519836 reported by Gabriel Samfira on 2015-11-25
This bug affects 4 people
Affects                  Importance   Assigned to
MAAS                     Undecided    Unassigned
grub2 (Ubuntu)           High         Mathieu Trudel-Lapierre
  Xenial                 High         Mathieu Trudel-Lapierre
grub2-signed (Ubuntu)    High         Unassigned
  Xenial                 High         Mathieu Trudel-Lapierre

Bug Description

[Impact]
When trying to deploy a "Generation 2" virtual machine on Hyper-V, grub fails to fetch the Linux kernel and initrd from MAAS. The operation times out immediately because Generation 2 VMs lack the Programmable Interval Timer (PIT). The current version of grub still requires the PIT to exist in order to calculate time.

[Test case]
Attempt to boot a Generation 2 Hyper-V system from MAAS.

[Regression potential]
Since this changes the way timers are picked and used in grub, anything that depends on timers (timeouts for various features, the timeout for the GRUB menu, waiting for keyboard input to get into the menu) may be affected. Any wrong behavior in keyboard input validation for getting into the grub menu on boot should be considered a regression caused by this patch.

---

There is a patch that uses the EFI SetTimer() available here:

http://savannah.gnu.org/bugs/?42944

and an alternative in the discussion here:

https://lists.gnu.org/archive/html/grub-devel/2014-10/msg00016.html

that uses pmtimer instead. I am aware that grub is a critical package. What is the official/proper way to fix this issue? Can a patched grubnetx64.efi be packaged with MAAS? Do we have to wait for this fix to merge?

Gabriel

Andres Rodriguez (andreserl) wrote :

Hi Gabriel,

MAAS never officially supported booting Hyper-V VMs, but thank you for letting us know that this was the case.

That being said, this is not a bug in MAAS but a bug in grub. I'll retarget this appropriately.

Thanks.

Changed in maas:
status: New → Invalid
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in grub2 (Ubuntu):
status: New → Confirmed
Joshua R. Poulson (jrp) wrote :

We had discussed the above change with Canonical before. Please integrate the grub2 fix even though it has not been accepted upstream as it definitely affects Ubuntu on Hyper-V and Azure.

tags: added: kernel-da-key

Any news on this?

Thanks.

Chris Valean (cvalean) wrote :

Thanks Adam, we're checking on this.

Please note that the tsc_pit.c file is part of beta3 only; it wasn't present in beta2.
So even if this works, it will probably only help if the currently supported releases also get grub2 beta3, if there are plans for adding it.

Steve Langasek (vorlon) wrote :

Per the bug log, awaiting testing of the proposed fix on Hyper-V.

Changed in grub2 (Ubuntu):
status: Confirmed → Incomplete
Chris Valean (cvalean) wrote :

The suggested patch from comment #5 has been merged in the 2.02 beta3 code.

We've built grub2 from sources and used the compiled grubx64.efi in MAAS 1.9 to attempt to boot a Gen2 VM.

However, the boot process failed to load the kernel and perform a PXE boot, exactly because of the timer not allowing for the transfers to occur.
So from our tests this patch doesn't resolve the problem.

The original patch mentioned has been used numerous times with Ubuntu, as we even have a wiki article published at http://wiki.cloudbase.it/hyperv-uefi-grub as a reference for MAAS deployments if Generation 2 VMs are to be used.

Joshua R. Poulson (jrp) on 2016-03-30
Changed in grub2 (Ubuntu):
status: Incomplete → Confirmed
Changed in grub2 (Ubuntu):
status: Confirmed → In Progress
importance: Undecided → High
assignee: nobody → Mathieu Trudel-Lapierre (cyphermox)

I've uploaded the suggested patch from Rigoberto Corujo and it indeed seems to improve the timer situation to some degree even outside of the PXE case (for example, it seems like EFI grub better handles checking the modifier key here to get to the grub menu).

Still, I'm concerned about the impact of this if EFI SetTimer timers are known to sometimes hang. When you built grub2 from sources, were you building grub2 beta3 as a whole from the git tree, or just taking the patches and applying them?

I agree that commit a03c1034 is most likely insufficient to fix the issue if the PIT isn't available on Gen2 Hyper-V, but I think we should do due diligence and check against all of master (if it hasn't been done already), especially for the following patch already in grub2 git, which seems promising:

http://git.savannah.gnu.org/cgit/grub.git/commit/?id=d43a5ee65143f384357fbfdcace4258e3537c214

This specifically calls out fixes for Hyper-V, and while it doesn't attempt to make use of SetTimer, it does use Stall() from the EFI Boot Services to achieve a similar effect.

I should have been clearer -- I uploaded a grub package *to my PPA* with Rigoberto's patch and noticed these changes locally, in booting from disk on a Thinkpad T450 -- admittedly it's not the same as PXE on Hyper-V, but it does highlight that there may be side-effects to the suggested patch, hence why I'd like to know whether the commit identified above works before doing an upload to the archive with a patch that got refused on the mailing list.

Chris Valean (cvalean) wrote :

We built beta2 from sources around comments #6-#8, on the understanding that commit a03c1034, and now d43a5ee65143f384357fbfdcace4258e3537c214, are part of beta3.
However, if the currently supported releases don't get a grub update to beta3, they would still fail.

I can verify beta3. Could you please check what the plan is for handling the version of grub delivered?

I'm not sure what you mean there.

My plan to handle this would be to backport the relevant commits to the necessary releases (right now, that basically means d43a5ee); as well as to upload the fix to the current development release. Later, we're likely to pull beta3 in the development release, but it won't be the case for the current supported releases, which will remain at beta2 and will have to use backported patches.

I uploaded grub2 with three patches relating to timers in EFI (do either PIT, or pmtimer, and if all else fails, EFI Stall()). I expect it should fix the issues on Hyper-V, could you please test it once it's available?

Thanks!
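
The fallback order described in this comment (PIT, then pmtimer, then EFI Stall()) can be illustrated with a small shell sketch. This is not GRUB code: the function name and the source labels `pit`, `pmtimer` and `efi_stall` are hypothetical, and the order simply follows the wording of the comment rather than the order the patched grub2 actually probes.

```shell
#!/bin/sh
# Illustrative only: print the first usable timer source, trying them in the
# order named in the comment (PIT, then pmtimer, then EFI Stall()).
# The labels and the "available sources" argument list are hypothetical.
pick_timer_source() {
    available=" $* "
    for candidate in pit pmtimer efi_stall; do
        case "$available" in
            *" $candidate "*)
                printf '%s\n' "$candidate"
                return 0
                ;;
        esac
    done
    return 1   # no usable source: nothing to calibrate delays against
}
```

Under these assumptions, a Generation 2 VM without a PIT would be modelled as `pick_timer_source pmtimer efi_stall`, which selects `pmtimer` instead of failing outright.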

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.02~beta2-36ubuntu6

---------------
grub2 (2.02~beta2-36ubuntu6) yakkety; urgency=medium

  * Fix booting on Hyper-V gen 2 VMs due to the lack of PIT there; we can deal
    with this by using other timers when PIT aren't available. (LP: #1519836)
    - debian/patches/git_tsc_use_alt_delay_sources_d43a5ee6.patch
    - debian/patches/git_split_pmtimer_wait_tsc_d9a3bfea.patch
    - debian/patches/git_fix_tsc_calibration_pit_a03c1034.patch

 -- Mathieu Trudel-Lapierre <email address hidden> Fri, 13 May 2016 12:28:38 -0400

Changed in grub2 (Ubuntu):
status: In Progress → Fix Released
Changed in grub2-signed (Ubuntu):
status: New → Fix Released
importance: Undecided → High
Changed in grub2 (Ubuntu Xenial):
importance: Undecided → High
Changed in grub2-signed (Ubuntu Xenial):
importance: Undecided → High
Changed in grub2 (Ubuntu Xenial):
status: New → In Progress
Changed in grub2-signed (Ubuntu Xenial):
status: New → In Progress
Changed in grub2 (Ubuntu Xenial):
assignee: nobody → Mathieu Trudel-Lapierre (cyphermox)
Changed in grub2-signed (Ubuntu Xenial):
assignee: nobody → Mathieu Trudel-Lapierre (cyphermox)
description: updated
Adrian Vladu (avladu) wrote :

Hello,

I have tried the grub2 efi version from here:

http://archive.ubuntu.com/ubuntu/dists/yakkety/main/uefi/grub2-amd64/2.02~beta2-36ubuntu11/grubnetx64.efi

and with it, Hyper-V Gen 2 VMs failed to boot from PXE.

On the other hand, on a yakkety system, I have manually built grub2 with the following commands:

    sudo apt-get source grub-efi-amd64
    cd grub2-2.02~beta2/
    sudo apt-get install fakeroot
    dpkg-buildpackage -b # Wait for like an hour

I have used the grub file from debian/grub2-images/2.02~beta2-36ubuntu11/grubnetx64.efi and the PXE boot was successful.

It seems the yakkety builds on archive.ubuntu.com are not working. Can someone take a look into it?

Thank you,
Adrian Vladu

Adrian Vladu (avladu) wrote :

On a second run, the archive build at http://archive.ubuntu.com/ubuntu/dists/yakkety/main/uefi/grub2-amd64/2.02~beta2-36ubuntu11/grubnetx64.efi also works on MAAS 2.0. I will come back with more results, as I want to reproduce the scenario on a clean MAAS installation.

Thank you,
Adrian Vladu

Hello Gabriel, or anyone else affected,

Accepted grub2 into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2/2.02~beta2-36ubuntu3.4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!
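
For testers, enabling -proposed for a single package could look roughly like the sketch below. The helper name, the default mirror URL, the component list and the target file name are assumptions made for illustration; the wiki page linked above remains the authoritative procedure.

```shell
#!/bin/sh
# Sketch: build the APT source line for a release's -proposed pocket.
# The helper name, default mirror and component list are assumptions.
proposed_source_line() {
    release="$1"
    mirror="${2:-http://archive.ubuntu.com/ubuntu/}"
    printf 'deb %s %s-proposed main restricted universe multiverse\n' \
        "$mirror" "$release"
}

# Example use (left commented out: it needs root and changes APT configuration):
#   proposed_source_line xenial | sudo tee /etc/apt/sources.list.d/xenial-proposed.list
#   sudo apt-get update
#   sudo apt-get install -t xenial-proposed grub-efi-amd64
```

Installing with `-t xenial-proposed` pulls only the named package (and what it needs) from the pocket; removing the list file afterwards avoids picking up other proposed updates on later upgrades.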

Changed in grub2 (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed

Can someone verify this SRU for xenial? I don't have Hyper-V to test with...

Steve Langasek (vorlon) wrote :

Hello Gabriel, or anyone else affected,

Accepted grub2 into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2/2.02~beta2-36ubuntu3.5 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Joshua R. Poulson (jrp) wrote :

I have tested the 16.04 proposed branch with the following settings:
GRUB_HIDDEN_TIMEOUT=false
GRUB_TIMEOUT=5

Booted watching via VM connect and saw the grub menu countdown five seconds as expected, which corrects the PIT problem reported.

Marking verification-done
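
To repeat this check, the two settings above go into the GRUB defaults file (normally /etc/default/grub on Ubuntu), followed by `sudo update-grub`. The helper below is a minimal sketch for setting such keys; its name is hypothetical, and it assumes simple values that contain no `|` or `&` characters.

```shell
#!/bin/sh
# Sketch: set KEY=VALUE in a GRUB defaults file, replacing an existing
# assignment or appending a new one. The function name is hypothetical.
# Limitation: VALUE must not contain '|' or '&' (both are special to sed here).
set_grub_default() {
    file="$1"
    key="$2"
    value="$3"
    if grep -q "^${key}=" "$file"; then
        # Rewrite the existing line via a temporary file (portable, no sed -i).
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "${file}.tmp" &&
            mv "${file}.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example use (left commented out: it needs root and edits system configuration):
#   set_grub_default /etc/default/grub GRUB_HIDDEN_TIMEOUT false
#   set_grub_default /etc/default/grub GRUB_TIMEOUT 5
#   update-grub
```

With a working timer source, the menu should then count down the configured five seconds, as reported above.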

tags: added: verification-done
removed: verification-needed
Steve Langasek (vorlon) wrote :

Hello Gabriel, or anyone else affected,

Accepted grub2 into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2/2.02~beta2-36ubuntu3.6 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: removed: verification-done
tags: added: verification-needed

An upload of grub2-signed to xenial-proposed has been rejected from the upload queue for the following reason: "wrong build-depends".

Hello Gabriel, or anyone else affected,

Accepted grub2-signed into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2-signed/1.66.6 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in grub2-signed (Ubuntu Xenial):
status: In Progress → Fix Committed

grub2 had already been verified to work for Hyper-V with the included patchset by Joshua R. Poulson in comment #20, so I am marking verification-done again.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.02~beta2-36ubuntu3.6

---------------
grub2 (2.02~beta2-36ubuntu3.6) xenial; urgency=medium

  * Fix support for IPv6 PXE booting under UEFI: (LP: #1229458)
    - grub_add_grub_env_set_net_property.patch: add grub_env_set_net_property.
    - misc-fix-invalid-char-strtol.patch: fix strto*l methods invalid chars.
    - net_read_bracketed_ipv6_addr.patch: read bracketed IPv6 addresses.
    - bootp_new_net_bootp6_command.patch: add new bootp6 commands.
    - efinet_uefi_ipv6_pxe_support.patch: teach efinet to allow bootp6.
    - bootp_process_dhcpack_http_boot.patch: process DHCPACK, support HTTP.
    - efinet_set_network_from_uefi_devpath.patch: configure network from the
      devpath provided by the UEFI firmware.
    - efinet_set_dns_from_uefi_proto.patch: set DNS nameservers and search
      domains from the UEFI protocol.
  * Fix booting on Hyper-V gen 2 VMs due to the lack of PIT there; we can deal
    with this by using other timers when PIT aren't available. (LP: #1519836)
    - debian/patches/git_tsc_use_alt_delay_sources_d43a5ee6.patch
    - debian/patches/git_split_pmtimer_wait_tsc_d9a3bfea.patch
    - debian/patches/git_fix_tsc_calibration_pit_a03c1034.patch

grub2 (2.02~beta2-36ubuntu3.3) xenial; urgency=medium

  * debian/patches/ip6_send_router_solicitation_7c4b6b7b.patch: handle long
    RA intervals by explicitly sending a SOLICIT.
  * debian/patches/ip6_fix_routing_eb9f401f.patch: fix IPv6 routing; we should
    be able to talk to things outside of link-local addresses; to do this,
    allow specifying a gateway and interface. (LP: #1229458)

 -- Mathieu Trudel-Lapierre <email address hidden> Thu, 15 Sep 2016 13:56:55 -0400

Changed in grub2 (Ubuntu Xenial):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for grub2 has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2-signed - 1.66.6

---------------
grub2-signed (1.66.6) xenial; urgency=medium

  * Rebuild against grub2 2.02~beta2-36ubuntu3.6. (LP: #1229458, #1519836)

 -- Mathieu Trudel-Lapierre <email address hidden> Mon, 19 Dec 2016 21:12:45 -0500

Changed in grub2-signed (Ubuntu Xenial):
status: Fix Committed → Fix Released