Timeout downloading initrd
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Release Notes for Ubuntu | | Undecided | Unassigned |
grub2 (Ubuntu) | Status tracked in Hirsute | | |
Xenial | | Undecided | dann frazier |
Bionic | | Undecided | dann frazier |
Focal | | Undecided | dann frazier |
Groovy | | Undecided | dann frazier |
Hirsute | | Undecided | dann frazier |
Bug Description
[Impact]
GRUB times out when downloading large files w/ tftp. This notably breaks subiquity-based PXE installs, which feature a large initrd. (Observed on several arm64 platforms, though the symptom is not arch-specific.)
[Test Case]
Simple test case using an x86 UEFI VM:
Place a kernel/ramdisk on a tftp server. Inflate the initrd or kernel to 87M, e.g.:
dd if=/dev/zero of=initrd.img bs=1M count=87
dd if=initrd.img.orig of=initrd.img conv=notrunc
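For test setups where dd is not handy, the same padding can be sketched in Python. This is only an illustration of the two dd commands above: the function name `pad_to_size` is hypothetical, and it assumes a platform where extending a file with `truncate` zero-fills the new area (true on Linux).

```python
import os
import shutil


def pad_to_size(src, dst, size_bytes):
    """Copy src to dst, then zero-pad dst so it is at least size_bytes long.

    Mirrors the dd pair above: the original content stays at the start of
    the file and the remainder is filled with zeros.
    """
    shutil.copyfile(src, dst)
    if os.path.getsize(dst) < size_bytes:
        with open(dst, "r+b") as f:
            # Extending a file with truncate() zero-fills on Linux (assumption).
            f.truncate(size_bytes)


# Example call (file names taken from the dd commands above):
# pad_to_size("initrd.img.orig", "initrd.img", 87 * 1024 * 1024)
```

As with the dd version, a source file that is already larger than the target size is copied unchanged.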
Success looks like:
Shell> fs0:
FS0:\> \efi\grubnetx64.efi
grub> net_dhcp efinet0
grub> linux (tftp,192.
grub> initrd (tftp,192.
grub>
Failure looks like:
grub> net_dhcp efinet0
grub> linux (tftp,192.
grub> initrd (tftp,192.
!!!! X64 Exception Type - 06(#UD - Invalid Opcode) CPU Apic ID - 00000000 !!!!
RIP - 0000000000099080, CS - 0000000000000038, RFLAGS - 0000000000010286
RAX - 000000007DC2FF00, RCX - 000000004FF99013, RDX - 000000007BF4CCF4
RBX - 000000007BE43FC0, RSP - 000000007FF25AE8, RBP - 000000007BE3C2A0
RSI - 000000000000000B, RDI - 000000007BE3C340
R8 - 000000007DC21168, R9 - 000000007DC1D4AE, R10 - 0000000000000067
R11 - 0000000000000002, R12 - 000000007BE3CCA0, R13 - 000000007BE3C260
R14 - 0000000000020004, R15 - 000000007DC1A613
DS - 0000000000000030, ES - 0000000000000030, FS - 0000000000000030
GS - 0000000000000030, SS - 0000000000000030
CR0 - 0000000080010033, CR2 - 0000000000000000, CR3 - 000000007FC01000
CR4 - 0000000000000668, CR8 - 0000000000000000
DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 000000007F9EE698 0000000000000047, LDTR - 0000000000000000
IDTR - 000000007F4B2018 0000000000000FFF, TR - 0000000000000000
FXSAVE_STATE - 000000007FF25740
!!!! Can't find image information. !!!!
This was originally discovered on a Cavium ThunderX CRB system using subiquity from the groovy arm64 ISO. Failure there looks like:
+----------------------------------------+
|*Ubuntu Server                          |
|                                        |
+----------------------------------------+
Use the ↑ and ↓ keys to select which entry is highlighted.
Press enter to boot the selected OS, `e' to edit the commands
before booting or `c' for a command-line.
error: timeout reading `initrd'.
Press any key to continue...
[Fix]
https:/
[Where problems could occur]
The fix is to the tftp command, so problems would likely appear in the tftp stack, possibly due to inconsistencies between tftp server implementations.
Steve Langasek (vorlon) wrote : | #2 |
Just to confirm, are you using the initrd as extracted from the .iso?
affects: | ubuntu-cdimage → livecd-rootfs (Ubuntu) |
Steve Langasek (vorlon) wrote : | #3 |
And is there something intrinsic to this hardware that leads to the timeout? Or would this perhaps work if the machine had a faster network link to the tftp server?
dann frazier (dannf) wrote : Re: [Bug 1900773] Re: thunderx CRB systems tftp timeout downloading initrd | #4 |
On Tue, Oct 20, 2020 at 5:30 PM Steve Langasek
<email address hidden> wrote:
>
> Just to confirm, are you using the initrd as extracted from the .iso?
I am.
dann frazier (dannf) wrote : | #5 |
On Tue, Oct 20, 2020 at 5:30 PM Steve Langasek
<email address hidden> wrote:
>
> And is there something intrinsic to this hardware that leads to the
> timeout?
I wonder if there might be a firmware bug. Here's what I see on the wire:
52120 56.958016 10.229.50.135 10.229.50.84 TFTP 78 Read Request, File: initrd, Transfer type: octet, blksize=1024, tsize=0
52121 56.960144 10.229.50.84 10.229.50.135 TFTP 72 Option Acknowledgement, blksize=1024, tsize=90771321
<---snip--->
183218 65.602857 10.229.50.84 10.229.50.135 TFTP 1070 Data Packet, Block: 65535
183219 65.603129 10.229.50.135 10.229.50.84 TFTP 60 Acknowledgement, Block: 65535
<---snip--->
229458 68.561417 10.229.50.135 10.229.50.84 TFTP 60 Acknowledgement, Block: 88643
229459 68.561466 10.229.50.84 10.229.50.135 TFTP 935 Data Packet, Block: 88644 (last)
229460 68.561519 10.229.50.135 10.229.50.84 TFTP 60 Acknowledgement, Block: 88644
229462 68.962413 10.229.50.135 10.229.50.84 TFTP 60 Acknowledgement, Block: 65535
It looks like the entire initrd was successfully transferred. The
client ACK'd the last block but then, for some reason, it comes back
about half a second later and re-ACKs a block it had already ACK'd.
And that block being number 65535 is *interesting*.
> Or would this perhaps work if the machine had a faster network
> link to the tftp server?
Perhaps - but these systems are in the same physical location, slowest
link between is 1Gbps.
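The numbers in the capture can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not part of any tool used in this bug. It assumes plain Ethernet + IPv4 + UDP framing (14 + 20 + 8 bytes, no VLAN tag or IP options) plus the 4-byte TFTP data header, and it assumes the server wraps the 16-bit block number to 0 after 65535. Function names are hypothetical.

```python
TSIZE = 90771321    # tsize reported in the Option Acknowledgement
BLKSIZE = 1024      # blksize negotiated in the capture
HEADERS = 14 + 20 + 8 + 4   # Ethernet + IPv4 + UDP + TFTP data header (assumed framing)


def block_count(tsize, blksize):
    # A TFTP transfer ends with a short (possibly empty) block,
    # so the number of data packets is floor(tsize / blksize) + 1.
    return tsize // blksize + 1


def last_block_len(tsize, blksize):
    # Payload bytes carried by the final, short data packet.
    return tsize % blksize


def frame_len(payload_len):
    # Expected on-wire frame size for a data packet with this payload.
    return payload_len + HEADERS


def wire_block_number(n):
    # Value carried in the 16-bit block field for the n-th data packet,
    # assuming the server wraps to 0 after 65535.
    return n & 0xFFFF


print(block_count(TSIZE, BLKSIZE))                 # 88644, matches "Block: 88644 (last)"
print(frame_len(BLKSIZE))                          # 1070, matches the full data packets
print(frame_len(last_block_len(TSIZE, BLKSIZE)))   # 935, matches the last data packet
print(wire_block_number(88644))                    # 23108
```

So the transfer needs more blocks than a 16-bit counter can number, which is consistent with 65535 being the block that shows up again at the end.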
I hit this issue on Hisilicon d06 when PXE groovy subiquity. System boots without initrd and hang on no rootfs.
error: timeout reading `/casper/initrd'.
Press any key to continue...
dann frazier (dannf) wrote : Re: [Bug 1900773] Re: thunderx CRB systems tftp timeout downloading initrd | #7 |
On Wed, Oct 21, 2020, 03:20 Ike Panhc <email address hidden> wrote:
> I hit this issue on Hisilicon d06 when PXE groovy subiquity. System
> boots without initrd and hang on no rootfs.
>
>
> error: timeout reading `/casper/initrd'.
>
> Press any key to continue...
>
Interesting, does recompressing with lzma suffice as a workaround for d06
also?
>
90771321 initrd
44620048 initrd.lzma
Yes. recompress with `lzma -9` and I can finish the installation.
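A likely reason the lzma workaround helps is that the smaller file no longer needs more than 65535 blocks. A minimal sketch, assuming the same blksize=1024 seen in the ThunderX capture (the d06 block size is not shown in this report) and using a hypothetical helper name:

```python
MAX_BLOCK = 0xFFFF  # largest value of the 16-bit TFTP block number


def needs_rollover(size_bytes, blksize=1024):
    """Return True if the transfer needs more data packets than MAX_BLOCK."""
    blocks = size_bytes // blksize + 1  # final short block included
    return blocks > MAX_BLOCK


for name, size in (("initrd", 90771321), ("initrd.lzma", 44620048)):
    print(name, needs_rollover(size))
# initrd True
# initrd.lzma False
```

With a 1024-byte block size the limit sits at roughly 64 MiB, so any initrd above that would be expected to hit the same problem.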
dann frazier (dannf) wrote : | #10 |
Since technically GRUB is requesting the initrd file, I tried to rule it out as a possible cause by seeing if the initial payload that UEFI downloads directly would time out if it was the same size. I did this by padding my grubnetaa64.efi binary to be the same size as the installer initrd.
mv grubnetaa64.efi grubnetaa64.
cp casper/initrd grubnetaa64.efi
dd if=grubnetaa64.
This did *not* time out. So unfortunately we can't obviously rule out GRUB as a factor.
Ike Panhc (ikepanhc) wrote : | #11 |
More information. I cannot reproduce with grubnetaa64.efi from focal.
This is the focal grub I download, which can not reproduce
This is the groovy grub, which is able to reproduce
Taihsiang Ho (taihsiangho) wrote : | #12 |
My apologies. The comment#9 https:/
If d05 (d05-4) uses the groovy grub, it COULD reproduce this issue as well.
So I would say the testing result of d05 is the same as d06 by @Ike.
- grub from focal ---> not reproduce this issue
- grub from groovy --> able to reproduce this issue
My next steps will be:
- hide comment#9 to not confuse people
- also try the initrd.lzma workaround
Taihsiang Ho (taihsiangho) wrote : | #13 |
Repacking initrd as initrd.lz is a working workaround on d05. By using initrd.lz, I could not reproduce this issue on d05-4.
summary: |
- thunderx CRB systems tftp timeout downloading initrd
+ ARM servers timeout downloading initrd
Groovy rc 20201022 daily image could reproduce on d05 (d05-1) and d06 (kreiken). Besides, the lz workaround still works on both of the platforms.
dann frazier (dannf) wrote : | #15 |
Turns out this is a GRUB issue. It was fixed upstream in the following commit, which cleanly cherry-picks to groovy. I prepared a test build in ppa:dannf/test and confirmed it resolves the issue.
commit a6838bbc6726ad6
Author: Javier Martinez Canillas <email address hidden>
Date: Thu Sep 10 17:17:57 2020 +0200
tftp: Roll-over block counter to prevent data packets timeouts
Commit 781b3e5efc3 (tftp: Do not use priority queue) caused a regression
when fetching files over TFTP whose size is bigger than 65535 * block size.
grub> linux /images/
grub> echo $?
0
grub> initrd /images/
error: timeout reading '/images/
grub> echo $?
28
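The effect described in the commit message can be illustrated with a toy model. This is not GRUB's code; it is a simplified simulation under two assumptions: the server numbers data packets with a 16-bit field that wraps to 0 after 65535, and the client simply stops accepting packets once the block number it receives differs from the one it expects (a real client would then time out). The function name `receive` and its parameters are hypothetical.

```python
def receive(num_blocks, wrap_counter):
    """Simulate a client receiving num_blocks in-order data packets.

    Returns how many packets were accepted before the client saw a block
    number it did not expect (equal to num_blocks on success).
    """
    expected = 1
    accepted = 0
    for n in range(1, num_blocks + 1):
        wire_block = n & 0xFFFF  # 16-bit field on the wire (assumed wrap to 0)
        # With roll-over the client compares against its counter truncated
        # to 16 bits; without it, the counter keeps growing past 65535.
        want = expected & 0xFFFF if wrap_counter else expected
        if wire_block != want:
            break  # unexpected block; a real client would eventually time out
        accepted += 1
        expected += 1
    return accepted


print(receive(88644, wrap_counter=False))  # 65535: stalls at the 16-bit limit
print(receive(88644, wrap_counter=True))   # 88644: whole file received
print(receive(43575, wrap_counter=False))  # 43575: small files never hit the limit
```

The 88644 and 43575 block counts correspond to the 90771321-byte and 44620048-byte initrd files mentioned earlier in this report, at a 1024-byte block size.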
Changed in livecd-rootfs (Ubuntu): | |
status: | New → Invalid |
Changed in grub2 (Ubuntu): | |
status: | New → Confirmed |
dann frazier (dannf) wrote : | #16 |
I am able to reproduce w/ focal as well. The above patch is tagged as "Fixes: 781b3e5efc3 (tftp: Do not use priority queue)", which is in focal (d/p/0090-
[1] http://
[2] http://
Changed in grub2 (Ubuntu Focal): | |
status: | New → Confirmed |
Changed in livecd-rootfs (Ubuntu Focal): | |
status: | New → Invalid |
summary: |
- ARM servers timeout downloading initrd
+ Timeout downloading initrd
description: | updated |
Changed in grub2 (Ubuntu Hirsute): | |
assignee: | nobody → dann frazier (dannf) |
status: | Confirmed → In Progress |
description: | updated |
Hello dann, or anyone else affected,
Accepted grub2 into groovy-proposed. The package will build now and be available at https:/
Please help us by testing this new package. See https:/
If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-
Further information regarding the verification process can be found at https:/
N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.
Changed in grub2 (Ubuntu Groovy): | |
status: | Confirmed → Fix Committed |
tags: | added: verification-needed verification-needed-groovy |
Changed in grub2 (Ubuntu Focal): | |
status: | Confirmed → Fix Committed |
tags: | added: verification-needed-focal |
Łukasz Zemczak (sil2100) wrote : | #18 |
Hello dann, or anyone else affected,
Accepted grub2 into focal-proposed. The package will build now and be available at https:/
Launchpad Janitor (janitor) wrote : | #19 |
This bug was fixed in the package grub2 - 2.04-1ubuntu36
---------------
grub2 (2.04-1ubuntu36) hirsute; urgency=medium
* Avoid "EFI stub: FIRMWARE BUG" message when booting >= 5.7 kernels
on arm64 by setting the image base address before jumping to the
PE/COFF entry point LP: #1900774
* Fix tftp timeouts when fetch large files. LP: #1900773
-- dann frazier <email address hidden> Wed, 11 Nov 2020 07:17:49 -0700
Changed in grub2 (Ubuntu Hirsute): | |
status: | In Progress → Fix Released |
Łukasz Zemczak (sil2100) wrote : | #20 |
Hello dann, or anyone else affected,
Accepted grub2 into bionic-proposed. The package will build now and be available at https:/
Changed in grub2 (Ubuntu Bionic): | |
status: | New → Fix Committed |
tags: | added: verification-needed-bionic |
Łukasz Zemczak (sil2100) wrote : | #21 |
Hello dann, or anyone else affected,
Accepted grub2 into xenial-proposed. The package will build now and be available at https:/
Changed in grub2 (Ubuntu Xenial): | |
status: | New → Fix Committed |
tags: | added: verification-needed-xenial |
All autopkgtests for the newly accepted grub2 (2.04-1ubuntu26.7) for focal have finished running.
The following regressions have been reported in tests triggered by the package:
ubuntu-
Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUp
https:/
[1] https:/
Thank you!
All autopkgtests for the newly accepted grub2 (2.04-1ubuntu35.1) for groovy have finished running.
The following regressions have been reported in tests triggered by the package:
grubzfs-
ubuntu-
ubiquity/unknown (amd64)
zsys/unknown (amd64)
grml2usb/unknown (amd64)
Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUp
https:/
[1] https:/
Thank you!
Changed in grub2 (Ubuntu Groovy): | |
assignee: | nobody → dann frazier (dannf) |
Changed in grub2 (Ubuntu Focal): | |
assignee: | nobody → dann frazier (dannf) |
Changed in grub2 (Ubuntu Bionic): | |
assignee: | nobody → dann frazier (dannf) |
Changed in grub2 (Ubuntu Xenial): | |
assignee: | nobody → dann frazier (dannf) |
dann frazier (dannf) wrote : | #24 |
= groovy verification =
Minimal BASH-like line editing is supported. For the first word, TAB
lists possible command completions. Anywhere else TAB lists possible
device or file completions.
grub> net_dhcp efinet0
grub> linux (tftp,192.
grub> initrd (tftp,192.
grub>
= focal verification =
Minimal BASH-like line editing is supported. For the first word, TAB
lists possible command completions. Anywhere else TAB lists possible
device or file completions.
grub> net_dhcp efinet0
grub> linux (tftp,192.
grub> initrd (tftp,192.
grub>
= bionic verification =
Minimal BASH-like line editing is supported. For the first word, TAB
lists possible command completions. Anywhere else TAB lists possible
device or file completions.
grub> net_add_addr test efinet0 192.168.122.86
grub> linux (tftp,192.
grub> initrd (tftp,192.
grub>
= xenial verification =
Minimal BASH-like line editing is supported. For the first word, TAB
lists possible command completions. Anywhere else TAB lists possible
device or file completions.
grub> net_add_addr test efinet0 192.168.122.86
grub> linux (tftp,192.
grub> initrd (tftp,192.
grub>
tags: |
added: verification-done verification-done-focal verification-done-groovy verification-done-xenial
removed: verification-needed verification-needed-focal verification-needed-groovy verification-needed-xenial
no longer affects: | livecd-rootfs (Ubuntu) |
no longer affects: | livecd-rootfs (Ubuntu Focal) |
no longer affects: | livecd-rootfs (Ubuntu Groovy) |
no longer affects: | livecd-rootfs (Ubuntu Hirsute) |
Łukasz Zemczak (sil2100) wrote : | #25 |
The listed grub versions in the output look weird, but I trust that the right packages from -proposed have been used for validation.
Launchpad Janitor (janitor) wrote : | #27 |
This bug was fixed in the package grub2 - 2.04-1ubuntu35.1
---------------
grub2 (2.04-1ubuntu35.1) groovy; urgency=medium
* Avoid "EFI stub: FIRMWARE BUG" message when booting >= 5.7 kernels
on arm64 by setting the image base address before jumping to the
PE/COFF entry point LP: #1900774
* Fix tftp timeouts when fetching large files. LP: #1900773
-- dann frazier <email address hidden> Thu, 12 Nov 2020 16:08:57 -0700
Changed in grub2 (Ubuntu Groovy): | |
status: | Fix Committed → Fix Released |
The verification of the Stable Release Update for grub2 has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
Launchpad Janitor (janitor) wrote : | #28 |
This bug was fixed in the package grub2 - 2.04-1ubuntu26.7
---------------
grub2 (2.04-1ubuntu26.7) focal; urgency=medium
* Avoid "EFI stub: FIRMWARE BUG" message when booting >= 5.7 kernels
on arm64 by setting the image base address before jumping to the
PE/COFF entry point LP: #1900774
* Fix tftp timeouts when fetching large files. LP: #1900773
-- dann frazier <email address hidden> Thu, 12 Nov 2020 16:15:13 -0700
Changed in grub2 (Ubuntu Focal): | |
status: | Fix Committed → Fix Released |
tags: |
added: verification-done-bionic
removed: verification-needed-bionic
Launchpad Janitor (janitor) wrote : | #29 |
This bug was fixed in the package grub2 - 2.02-2ubuntu8.20
---------------
grub2 (2.02-2ubuntu8.20) bionic; urgency=medium
* Avoid "EFI stub: FIRMWARE BUG" message when booting >= 5.7 kernels
on arm64 by setting the image base address before jumping to the
PE/COFF entry point LP: #1900774
* Fix tftp timeouts when fetching large files. LP: #1900773
-- dann frazier <email address hidden> Fri, 13 Nov 2020 17:40:19 -0700
Changed in grub2 (Ubuntu Bionic): | |
status: | Fix Committed → Fix Released |
Launchpad Janitor (janitor) wrote : | #30 |
This bug was fixed in the package grub2 - 2.02~beta2-
---------------
grub2 (2.02~beta2-
* Avoid "EFI stub: FIRMWARE BUG" message when booting >= 5.7 kernels
on arm64 by setting the image base address before jumping to the
PE/COFF entry point LP: #1900774
* Fix tftp timeouts when fetching large files. LP: #1900773
-- dann frazier <email address hidden> Fri, 13 Nov 2020 18:03:44 -0700
Changed in grub2 (Ubuntu Xenial): | |
status: | Fix Committed → Fix Released |
This bug has been reported on the Ubuntu ISO testing tracker.
A list of all reports related to this bug can be found here: iso.qa.ubuntu.com/qatracker/reports/bugs/1900773
http://