UEFI network boot hangs at grub for adapter 82599ES 10-Gigabit SFI/SFP+
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Invalid | High | Unassigned |
maas-images | Fix Released | High | Unassigned |
python-tx-tftp | Invalid | Undecided | Unassigned |
grub2 (Ubuntu) | Fix Released | High | Mathieu Trudel-Lapierre |
Trusty | Confirmed | High | Mathieu Trudel-Lapierre |
Xenial | Fix Released | High | Mathieu Trudel-Lapierre |
Yakkety | Won't Fix | High | Mathieu Trudel-Lapierre |
grub2-signed (Ubuntu) | Confirmed | Undecided | Unassigned |
Trusty | Confirmed | Undecided | Unassigned |
Xenial | Fix Released | Medium | Mathieu Trudel-Lapierre |
Yakkety | Won't Fix | Undecided | Unassigned |
Bug Description
[Impact]
MAAS commissioning may fail when deploying Xenial images or using grubx64.efi from Xenial, due to hardware particularities of some Intel 82599-based network cards. Network cards from other manufacturers may be affected as well. The main failure mode appears to be an infinite re-send of some packets because of an unexpected response from the network hardware.
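To illustrate the failure mode in general terms, here is a minimal sketch of a request loop that caps the number of re-sends, so that an unexpected reply leads to a timeout rather than an endless retransmission. This is not GRUB's code; the function name `request_with_retries` and its callback parameters are hypothetical and exist only for this illustration.

```python
def request_with_retries(send, receive, is_expected, max_attempts=5):
    """Send a request until an expected reply arrives, or give up.

    Hypothetical helper for illustration only:
      send()             transmits the request once
      receive()          returns a reply, or None if nothing arrived in time
      is_expected(reply) decides whether the reply is acceptable
    """
    for attempt in range(1, max_attempts + 1):
        send()
        reply = receive()
        if reply is not None and is_expected(reply):
            return reply, attempt
        # Unexpected or missing reply: fall through and re-send,
        # but only until the attempt budget is used up.
    raise TimeoutError(f"no expected reply after {max_attempts} attempts")
```

Without the `max_attempts` bound, a peer that keeps answering with something `is_expected` rejects would keep this loop running forever, which matches the hang described above.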
[Test case]
1) Attempt to netboot in UEFI mode on a system with a "82599ES 10-Gigabit SFI/SFP+" network adapter.
2) Validate that netbooting happens correctly, passing control over to the kernel as configured in grub.cfg.
3) Validate that netbooting another system, not using an Intel 82599 adapter, behaves normally when booting in UEFI mode.
4) Validate that netbooting another system, not using an Intel 82599 adapter, behaves normally when booting in LEGACY mode.
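When running the test case, it can help to confirm separately that the boot server answers TFTP read requests for the loader, so that a hang can be attributed to the client side. The sketch below builds a read request as defined in RFC 1350 and parses the first reply. The function names (`build_rrq`, `parse_reply`, `probe`) are made up for this example, and the server address in the usage note is a placeholder.

```python
import socket
import struct

# TFTP opcodes from RFC 1350.
OP_RRQ, OP_DATA, OP_ERROR = 1, 3, 5


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a read request: opcode 1, filename, NUL, mode, NUL."""
    return (
        struct.pack("!H", OP_RRQ)
        + filename.encode("ascii") + b"\0"
        + mode.encode("ascii") + b"\0"
    )


def parse_reply(packet: bytes):
    """Return ("data", block_number, payload) or ("error", code, message)."""
    if len(packet) < 4:
        raise ValueError("TFTP packet too short")
    opcode, value = struct.unpack("!HH", packet[:4])
    if opcode == OP_DATA:
        return ("data", value, packet[4:])
    if opcode == OP_ERROR:
        message = packet[4:].rstrip(b"\0").decode("ascii", "replace")
        return ("error", value, message)
    raise ValueError(f"unexpected TFTP opcode {opcode}")


def probe(server: str, filename: str, timeout: float = 5.0):
    """Request `filename` and return the parsed first reply.

    Only the first packet is read and it is not acknowledged, so the
    server will give up on the transfer by itself. Raises socket.timeout
    if the server stays silent.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (server, 69))
        packet, _addr = sock.recvfrom(4 + 512)
        return parse_reply(packet)
```

For example, `probe("192.0.2.10", "grubx64.efi")` run from another host on the same network should return a `("data", 1, ...)` tuple if the server is serving the file. This only checks the server side; it says nothing about the firmware or the adapter.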
[Regression potential]
As this affects networking in EFI mode, any failure to netboot using EFI should be considered a possible regression. Systems may fail to receive data from the network boot server and terminate the process with a timeout. Other possible failure scenarios are receiving incomplete data over the network, or data corruption.
----
I am using MAAS to commission and install machines. When I attempt to commission a machine with a "82599ES 10-Gigabit SFI/SFP+" network adapter, the following happens:
1) TFTP Request — bootx64.efi
2) TFTP Request — /grubx64.efi
3) Console hangs at grub prompt
If I go into the BIOS and force the adapter above into legacy mode, the machine is able to network boot and run through the commissioning process.
1) TFTP Request — ubuntu/
2) TFTP Request — ubuntu/
3) TFTP Request — ifcpu64.c32
4) PXE Request — power off
5) TFTP Request — pxelinux.
6) TFTP Request — pxelinux.
7) TFTP Request — pxelinux.0
Also, if I disconnect the cable from the adapter above and connect a cable to the integrated "I210 Gigabit" adapter, which is configured for UEFI mode, the machine is able to network boot grubx64.efi and run through the commissioning process.
~$ dpkg -l '*maas*'|cat
Desired=
| Status=
|/ Err?=(none)
||/ Name Version Architecture Description
+++-===
ii maas 1.7.2+bzr3355-
ii maas-cli 1.7.2+bzr3355-
ii maas-cluster-
ii maas-common 1.7.2+bzr3355-
ii maas-dhcp 1.7.2+bzr3355-
ii maas-dns 1.7.2+bzr3355-
ii maas-proxy 1.7.2+bzr3355-
ii maas-region-
ii maas-region-
ii python-django-maas 1.7.2+bzr3355-
ii python-maas-client 1.7.2+bzr3355-
ii python-
~$
Changed in grub2-signed (Ubuntu Xenial):
assignee: nobody → Mathieu Trudel-Lapierre (cyphermox)
importance: Undecided → Medium
status: New → In Progress
description: updated
tags: added: cpe-onsite
description: updated
Changed in grub2 (Ubuntu Yakkety):
status: New → Won't Fix
Changed in grub2-signed (Ubuntu Yakkety):
status: New → Won't Fix
description: updated
tags: added: id-5ab2aac1fcfcb094be6eb2e1
Changed in maas-images:
status: Triaged → Fix Released
Changed in maas:
milestone: next → none
It seems that grubnetx64.efi is hanging with that interface: it should next request the grub/grub.cfg file, but that never happens. This feels like a grubnetx64.efi issue, so I am targeting the bug to that as well.
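The expected request order described in this report (bootx64.efi, then grubx64.efi, then grub/grub.cfg) can be checked against the list of TFTP requests seen for a node. The following is a small sketch with a hypothetical helper, `first_missing_request`; it assumes the requested file names have already been extracted from the server log or the MAAS event list.

```python
# Order taken from the requests described in this report.
EXPECTED_SEQUENCE = ["bootx64.efi", "grubx64.efi", "grub/grub.cfg"]


def first_missing_request(requested, expected=EXPECTED_SEQUENCE):
    """Return the first expected file that was not requested in order.

    Returns None when every expected file was requested in sequence.
    Leading slashes are ignored, so "/grubx64.efi" matches "grubx64.efi".
    """
    remaining = iter(name.lstrip("/") for name in requested)
    for wanted in expected:
        # Membership tests on an iterator consume it up to the match,
        # which enforces the ordering of the requests.
        if wanted not in remaining:
            return wanted
    return None


# The two requests observed on the affected machine:
print(first_missing_request(["bootx64.efi", "/grubx64.efi"]))
```

With the two requests listed for the affected adapter, the helper reports grub/grub.cfg as the first missing step, which is the point where the console hangs.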