TFTP timeout when booting from grub that was PXE loaded

Bug #1508893 reported by Raghuram Kota
This bug affects 4 people
Affects         Status        Importance  Assigned to   Milestone
grub2 (Ubuntu)  Fix Released  High        dann frazier
Trusty          Fix Released  High        dann frazier
Vivid           Fix Released  High        dann frazier
Wily            Fix Released  High        dann frazier

Bug Description

[Impact]
PXE booting of UEFI systems is very slow, to the point that some systems time out.

[Test Case]
PXE boot a UEFI-based system (d-i or MAAS) and monitor the time it takes for GRUB to download the kernel/initrd. tcpdump will show TFTP timeouts, and it can take on the order of minutes to begin running the kernel.
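
One way to watch for the timeouts during this test is to capture the client's UDP traffic on the TFTP server. The sketch below only assembles a tcpdump command line and does not run it; the interface name and client address are placeholders, not values taken from this report.

```shell
# Print (without running) a tcpdump command that watches one client's UDP
# traffic. TFTP requests go to UDP port 69, but the data transfer itself
# moves to other ports, so the filter matches on the client host instead.
# $1: capture interface on the TFTP server (placeholder, defaults to eth0)
# $2: IP address of the booting client (placeholder)
tftp_capture_cmd() {
    iface="${1:-eth0}"
    client="$2"
    printf 'tcpdump -n -i %s udp and host %s\n' "$iface" "$client"
}

# Example: show the command for a client at 192.0.2.10 on eth0.
tftp_capture_cmd eth0 192.0.2.10
```

Run the printed command as root on the server before selecting the boot entry, then compare the capture before and after installing the updated GRUB.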

[Regression Risk]
The fix is restricted to UEFI-based systems. For those systems, it could lead to a regression if Managed Network Protocol is required to remain active while GRUB is performing the network boot.

Raghuram Kota (rkota) wrote :

Ming Lei has identified that this issue is fixed with the following upstream grub patch:

commit 49426e9fd2e562c73a4f1206f32eff9e424a1a73
Author: Andrei Borzenkov <email address hidden>
Date: Thu May 7 20:37:17 2015 +0300

    efinet: open Simple Network Protocol exclusively

The next step is to backport this patch into Wily and probably also Trusty.

dann frazier (dannf) wrote :

I applied just this patch to wily's grub and tested on a thunder platform w/ AMI UEFI, and it appears to have introduced a regression. Where a MAAS boot would work successfully before, it now triggers a Synchronous Exception.

The attached script will create a grub netboot image the same way MAAS does.

Ming Lei (tom-leiming) wrote :

Today I applied this patch (efinet: open Simple Network Protocol exclusively) to grub on wily, and it looks like it
does fix the issue on mustang/merlin.

Dann, could you build an upstream grub and test it on thunder to see whether the synchronous exception occurs there too?

Ming Lei (tom-leiming) wrote :

Dann, I just ran a quick test on cvm0, and the grub.efi built from wily plus the patch works fine. My build
command line follows.

./autogen.sh
./configure --host=x86_64-linux-gnu --target=aarch64-linux-gnu --build=x86_64-linux-gnu --with-platform=efi --prefix=/tmp/grub64-efi_installed-wily

make -j 24
make -j8 install

cd /tmp/grub64-efi_installed-wily
modules="boot chain configfile configfile efinet ext2 fat gettext help hfsplus linux loadenv lsefi normal normal ntfs ntfscomp part_gpt part_msdos part_msdos read search search_fs_file search_fs_uuid search_label terminal terminfo tftp"

bin/grub-mkimage -v -o grub.efi -O arm64-efi -p "bootw" $modules
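
The module list above names configfile, normal and part_msdos more than once. The repeats are most likely harmless (an assumption, not something verified here), but a list like this can be tidied before it is handed to grub-mkimage. A minimal sketch; dedupe_words is a hypothetical helper, not part of GRUB:

```shell
# Collapse repeated words in a space-separated list, keeping the first
# occurrence of each word and the original order.
dedupe_words() {
    # $1 is deliberately unquoted so the shell splits it into words;
    # this assumes the words contain no glob characters.
    printf '%s\n' $1 | awk '!seen[$0]++' | paste -sd ' ' -
}

# Shortened example list; substitute the full module list as needed.
modules="boot chain configfile configfile efinet normal normal tftp"
modules=$(dedupe_words "$modules")
echo "$modules"
```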

Andrew Cloke (andrew-cloke) wrote :

From discussions with Ming, the cvm0 node on which he tested successfully may be running Cavium's UEFI rather than AMI's.

dann frazier (dannf) wrote : Re: [Bug 1508893] Re: TFTP timeout on ARM64 hw when booting from grub that was PXE loaded

On Fri, Oct 23, 2015 at 1:38 AM, Ming Lei <email address hidden> wrote:
> Today I applied this patch (efinet: open Simple Network Protocol exclusively) to grub on wily, and it looks like it
> does fix the issue on mustang/merlin.
>
> Dann, could you build an upstream grub and test it on thunder to see
> whether the synchronous exception occurs there too?

A build from upstream git shows this problem as well.

  -dann

Ming Lei (tom-leiming) wrote : Re: TFTP timeout on ARM64 hw when booting from grub that was PXE loaded

> A build from upstream git shows this problem as well.

It looks like this is regarded as an AMI firmware issue, so the patch should be merged into grub; otherwise
grub can't run on APM's UEFI firmware.

Thanks,

Newell Jensen (newell-jensen) wrote :

Ming,

Trying to use your bootnetaa64.efi file from http://kernel.ubuntu.com/~ming/bugs/1508893/bootnetaa64.efi

I am not able to test this because I cannot PXE boot:

TianoCore 2.0.0 UEFI 2.4.0 Sep 1 2015 12:48:07
CPU: APM ARM 64-bit Strega Rev A2 2400MHz PCP 2400MHz
     32 KB ICACHE, 32 KB DCACHE
     SOC 2000MHz IOBAXI 400MHz AXI 250MHz AHB 200MHz GFC 66MHz
Board: X-Gene Merlin Board
Slimpro FW:
 Ver: 3.4 (build 2015/07/22)
 PMD: 980 mV
 SOC: 950 mV
The default boot selection will start in 3 seconds
[1] ubuntu
[2] PXE on MAC :3C
[3] nfs boot via TFTP
[4] Shell
[5] Boot Manager
[6] Reboot
[7] Shutdown
Start: 2
..PXE-E23: Client received TFTP error from server.
[1] ubuntu
[2] PXE on MAC :3C
[3] nfs boot via TFTP
[4] Shell
[5] Boot Manager
[6] Reboot
[7] Shutdown
Start: 2
..PXE-E23: Client received TFTP error from server.
[1] ubuntu
[2] PXE on MAC :3C
[3] nfs boot via TFTP
[4] Shell
[5] Boot Manager
[6] Reboot
[7] Shutdown
Start:

Ming Lei (tom-leiming) wrote : Re: [Bug 1508893] Re: TFTP timeout on ARM64 hw when booting from grub that was PXE loaded

On Tue, Nov 3, 2015 at 11:48 PM, Newell Jensen
<email address hidden> wrote:
> Ming,
>
> Trying to use your bootnetaa64.efi file from
> http://kernel.ubuntu.com/~ming/bugs/1508893/bootnetaa64.efi
>
> I am not able to test this because I cannot PXE boot:

I can't reproduce the failure; it works fine for me today. See the following log:
TianoCore 2.0.0 UEFI 2.4.0 Sep 1 2015 12:48:07
CPU: APM ARM 64-bit Strega Rev A2 2400MHz PCP 2400MHz
     32 KB ICACHE, 32 KB DCACHE
     SOC 2000MHz IOBAXI 400MHz AXI 250MHz AHB 200MHz GFC 66MHz
Board: X-Gene Merlin Board
Slimpro FW:
    Ver: 3.4 (build 2015/07/22)
    PMD: 980 mV
    SOC: 950 mV
The default boot selection will start in 3 seconds
[1] ubuntu
[2] PXE on MAC :3C
[3] nfs boot via TFTP
[4] Shell
[5] Boot Manager
[6] Reboot
[7] Shutdown
Start: 2
..

The grub menu follows.

If you are online a bit earlier tomorrow, we can test it together.

Thanks,

> Bug description:
> This issue was discovered with Ubuntu Wily (15.10) on ARM64 hardware
> that is currently in development.
>
> When loading the kernel via TFTP in grub on this hardware, the system
> stops receiving new packets (times out) after receiving the first few
> tens of TFTP data packets, which then causes the kernel load to fail.
>
> Running tcpdump before loading the kernel from the grub menu shows
> timeouts occurring, as captured in the log below:
>
> http://kernel.ubuntu.com/~ming/grub/apm.tcpdump
>
> The detailed reproduction steps are:
>
> 1) set up a PXE boot entry in UEFI (no parameters passed to grub)
> 2) build grub from upstream (and/or Wily) and put it on the PXE/TFTP server
> 3) set up a grub config shaped like the one below:
> menuentry 'Install for arm64' {
> linux /ubuntu-installer/arm64/Image --- console=ttyS0,115200
> initrd /ubuntu-installer/arm64/initrd.gz
> }
> 4) start PXE booting in UEFI
> 5) the grub prompt appears
> 6) select the 'Install for arm64' menu item and press 'enter' to start
> loading the kernel
> 7) then hangs inside l...


Launchpad Janitor (janitor) wrote : Re: TFTP timeout on ARM64 hw when booting from grub that was PXE loaded

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in grub2 (Ubuntu):
status: New → Confirmed
dann frazier (dannf)
summary: - TFTP timeout on ARM64 hw when booting from grub that was PXE loaded
+ TFTP timeout when booting from grub that was PXE loaded
dann frazier (dannf)
description: updated
Changed in grub2 (Ubuntu Wily):
assignee: nobody → dann frazier (dannf)
Changed in grub2 (Ubuntu Vivid):
assignee: nobody → dann frazier (dannf)
Changed in grub2 (Ubuntu Trusty):
assignee: nobody → dann frazier (dannf)
Changed in grub2 (Ubuntu):
importance: Undecided → High
Changed in grub2 (Ubuntu Trusty):
importance: Undecided → High
Changed in grub2 (Ubuntu Vivid):
importance: Undecided → High
Changed in grub2 (Ubuntu Wily):
importance: Undecided → High
Changed in grub2 (Ubuntu Trusty):
status: New → In Progress
Changed in grub2 (Ubuntu Wily):
status: New → In Progress
Changed in grub2 (Ubuntu Vivid):
status: New → In Progress
Changed in grub2 (Ubuntu):
status: Confirmed → In Progress
assignee: nobody → dann frazier (dannf)
Ming Lei (tom-leiming) wrote :

On the HP ProLiant m400 server, when booting via UEFI, TFTP may still time out when loading the kernel via
netboot.

It looks like commit 49426e9fd2 (efinet: open Simple Network Protocol exclusively) alone isn't enough; the
following three commits are also required for grub to work well on the HP m400 ARM64 server:

      7b386b703154c0901c4616 (efidisk: move device path helpers in core for efinet)
      c52ae40570c3bfbcca22d21 (efinet: skip virtual IPv4 and IPv6 devices when enumerating cards)
      f348aee7b33dd85e7da62b (efinet: enable hardware filters when opening interface)
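
For anyone reproducing the backport in a local GRUB checkout, the commits named above could be applied with git cherry-pick. The sketch below only prints the commands; the order simply follows this comment and is an assumption, as is the choice of -x to record the upstream hash in each commit message.

```shell
# Print one "git cherry-pick -x" command per commit ID given as an argument.
# Nothing is executed; run the printed lines inside a GRUB source checkout.
print_cherry_picks() {
    for commit in "$@"; do
        printf 'git cherry-pick -x %s\n' "$commit"
    done
}

# Commit IDs as quoted in this bug report.
print_cherry_picks 49426e9fd2 7b386b703154c0901c4616 \
    c52ae40570c3bfbcca22d21 f348aee7b33dd85e7da62b
```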

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.02~beta2-32

---------------
grub2 (2.02~beta2-32) unstable; urgency=medium

  [ Mathieu Trudel-Lapierre ]
  * Cherry-pick patch to add SAS disks to the device list from the ofdisk
    module. (LP: #1517586)

  [ dann frazier ]
  * Cherry-pick patch to open Simple Network Protocol exclusively.
    (LP: #1508893)

  [ Linn Crosetto ]
  * Install arm64 signed images if UEFI Secure Boot is enabled (closes:
    #806178).

 -- Colin Watson <email address hidden> Wed, 25 Nov 2015 16:07:21 +0000

Changed in grub2 (Ubuntu):
status: In Progress → Fix Released
Steve Langasek (vorlon) wrote : Please test proposed package

Hello Raghuram, or anyone else affected,

Accepted grub2 into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2/2.02~beta2-9ubuntu1.5 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in grub2 (Ubuntu Trusty):
status: In Progress → Fix Committed
tags: added: verification-needed
Steve Langasek (vorlon) wrote :

Hello Raghuram, or anyone else affected,

Accepted grub2 into vivid-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2/2.02~beta2-22ubuntu1.3 in a few hours, and then in the -proposed repository.

Changed in grub2 (Ubuntu Vivid):
status: In Progress → Fix Committed
Steve Langasek (vorlon) wrote :

Hello Raghuram, or anyone else affected,

Accepted grub2 into wily-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2/2.02~beta2-29ubuntu0.1 in a few hours, and then in the -proposed repository.

Changed in grub2 (Ubuntu Wily):
status: In Progress → Fix Committed
Chris J Arges (arges) wrote :

Hello Raghuram, or anyone else affected,

Accepted grub2-signed into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2-signed/1.34.6 in a few hours, and then in the -proposed repository.

Chris J Arges (arges) wrote :

Hello Raghuram, or anyone else affected,

Accepted grub2-signed into vivid-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2-signed/1.46.3 in a few hours, and then in the -proposed repository.

Chris J Arges (arges) wrote :

Hello Raghuram, or anyone else affected,

Accepted grub2-signed into wily-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2-signed/1.55.1 in a few hours, and then in the -proposed repository.

dann frazier (dannf) wrote :

I've verified this by doing netboot installs w/ the updated GRUB on arm64/efi. It certainly "feels" faster (minutes to seconds), and shows no signs of regression.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.02~beta2-9ubuntu1.5

---------------
grub2 (2.02~beta2-9ubuntu1.5) trusty; urgency=medium

  * d/p/arm64-set-correct-length-of-device-path-end-entry.patch: Fixes
    booting arm64 kernels on certain UEFI implementations. (LP: #1476882)
  * progress: avoid NULL dereference for net files. (LP: #1459872)
  * arm64/setjmp: Add missing license macro. (LP: #1459871)
  * Cherry-pick patch to add SAS disks to the device list from the ofdisk
    module. (LP: #1517586)
  * Cherry-pick patch to open Simple Network Protocol exclusively.
    (LP: #1508893)

 -- dann frazier <email address hidden> Wed, 25 Nov 2015 13:13:35 -0700

Changed in grub2 (Ubuntu Trusty):
status: Fix Committed → Fix Released
Steve Langasek (vorlon) wrote : Update Released

The verification of the Stable Release Update for grub2 has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Chris J Arges (arges) wrote : Please test proposed package

Hello Raghuram, or anyone else affected,

Accepted grub2 into vivid-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2/2.02~beta2-22ubuntu1.5 in a few hours, and then in the -proposed repository.

tags: removed: verification-done
tags: added: verification-needed
Chris J Arges (arges) wrote :

Hello Raghuram, or anyone else affected,

Accepted grub2 into wily-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/grub2/2.02~beta2-29ubuntu0.3 in a few hours, and then in the -proposed repository.

Newell Jensen (newell-jensen) wrote :

I tested this on wily, vivid, and trusty. Marking as verification-done.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.02~beta2-22ubuntu1.5

---------------
grub2 (2.02~beta2-22ubuntu1.5) vivid; urgency=medium

  * Merge in changes from 2.02~beta2-22ubuntu1.3:
    - d/p/arm64-set-correct-length-of-device-path-end-entry.patch: Fixes
      booting arm64 kernels on certain UEFI implementations. (LP: #1476882)
    - progress: avoid NULL dereference for net files. (LP: #1459872)
    - arm64/setjmp: Add missing license macro. (LP: #1459871)
    - Cherry-pick patch to add SAS disks to the device list from the ofdisk
      module. (LP: #1517586)
    - Cherry-pick patch to open Simple Network Protocol exclusively.
      (LP: #1508893)
  * Cherry-picks to better handle TFTP timeouts on some arches: (LP: #1521612)
    - (7b386b7) efidisk: move device path helpers in core for efinet
    - (c52ae40) efinet: skip virtual IP devices when enumerating cards
    - (f348aee) efinet: enable hardware filters when opening interface
  * Update quick boot logic to handle abstractions for which there is no
    write support. (LP: #1274320)

 -- dann frazier <email address hidden> Wed, 16 Dec 2015 13:31:15 -0700

Changed in grub2 (Ubuntu Vivid):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package grub2 - 2.02~beta2-29ubuntu0.3

---------------
grub2 (2.02~beta2-29ubuntu0.3) wily; urgency=medium

  * Merge in changes from 2.02~beta2-29ubuntu0.1:
    - arm64/setjmp: Add missing license macro. (LP: #1459871)
    - Cherry-pick patch to add SAS disks to the device list from the ofdisk
      module. (LP: #1517586)
    - Cherry-pick patch to open Simple Network Protocol exclusively.
      (LP: #1508893)
  * Cherry-picks to better handle TFTP timeouts on some arches: (LP: #1521612)
    - (7b386b7) efidisk: move device path helpers in core for efinet
    - (c52ae40) efinet: skip virtual IP devices when enumerating cards
    - (f348aee) efinet: enable hardware filters when opening interface

 -- dann frazier <email address hidden> Wed, 16 Dec 2015 10:05:39 -0700

Changed in grub2 (Ubuntu Wily):
status: Fix Committed → Fix Released