Comment 2 for bug 2046182

OpenStack Infra (hudson-openstack) wrote : Fix merged to tools (master)

Reviewed: https://review.opendev.org/c/starlingx/tools/+/903374
Committed: https://opendev.org/starlingx/tools/commit/00a5ccd35bdf40389878cb9b132bc4d2dc2683f1
Submitter: "Zuul (22348)"
Branch: master

commit 00a5ccd35bdf40389878cb9b132bc4d2dc2683f1
Author: M. Vefa Bicakci <email address hidden>
Date: Mon Dec 4 16:52:53 2023 +0000

    GRUB configuration: Increase UEFI watchdog timeout

    This commit increases the UEFI watchdog timeout utilized by GRUB in
    StarlingX from 3 minutes to 20 minutes to prevent undesirable and
    arguably premature UEFI watchdog timeout-triggered reboots during the
    installation of StarlingX ISO images via BMC/iLO/iDRAC/platform-provided
    virtual media redirection features in conjunction with ISO images hosted
    on web servers.

    In more detail, a user reported that a StarlingX-based distribution's
    ISO image would not successfully install with platform-provided ISO
    image redirection when the ISO image in question was hosted on a web
    server, despite the bandwidth and latency between the platform network
    interface and the web server being acceptable. The same user reported
    that removing the "efi-watchdog enable ..." line from the GRUB
    configuration resolved the issue.

    The same issue was later reproduced locally with an HPE DL360g10 server,
    where the OAM network interface was able to download an ISO image from a
    local server on a different subnet at a rate of about 76 MiB/s. (While
    the OAM and the iLO network interfaces are likely not the same, we do
    not envision the network conditions to be vastly different when the two
    network paths are compared.) In our reproduction of the issue,
    downloading the kernel and the initramfs images took approximately
    nine minutes and ten seconds, after which the kernel printed the
    "Linux version" banner on the serial console. This duration was the
    same whether or not the "Enhanced Download Performance" setting was
    enabled in the iLO settings.

    Based on these experimental results, this commit changes the UEFI
    watchdog timeout from 3 minutes to a duration that is approximately two
    times the initial kernel/initramfs load time of 9 minutes and 10 seconds
    encountered in our experiments: 20 minutes.
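
    The arithmetic behind the new value can be sketched as follows. This is
    only an illustration of the reasoning above, not part of the change
    itself; the constant and function names are hypothetical, and the only
    inputs are the durations quoted in this message.

```python
from datetime import timedelta

# Durations quoted in the commit message; the names are illustrative only.
OLD_TIMEOUT = timedelta(minutes=3)
NEW_TIMEOUT = timedelta(minutes=20)
OBSERVED_LOAD = timedelta(minutes=9, seconds=10)


def margin(timeout: timedelta, load_time: timedelta) -> float:
    """Return how many times the observed load time fits into the timeout."""
    return timeout / load_time


old_margin = margin(OLD_TIMEOUT, OBSERVED_LOAD)
new_margin = margin(NEW_TIMEOUT, OBSERVED_LOAD)
doubled_load = 2 * OBSERVED_LOAD

# A margin below 1 means the watchdog fires before the kernel starts.
print(f"old margin: {old_margin:.2f}")
print(f"new margin: {new_margin:.2f}")
print(f"twice the observed load time: {doubled_load}")
```

    A margin below 1 corresponds to the reboot seen with the 3 minute
    timeout. Twice the observed load time is 18 minutes and 20 seconds,
    which is consistent with the "approximately two times" figure of
    20 minutes.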

    Note that this commit does not affect the GRUB configuration files that
    are used after installation. The timeout remains 3 minutes in
    "/boot/efi/EFI/BOOT/grub.cfg" on installed systems after this commit,
    which is appropriate as the GRUB configuration file in question is
    utilized for booting up from local storage (i.e., SSD or HDD).

    Verification:

    * The reported issue was confirmed by placing a StarlingX-based
      distribution's nightly build ISO image on a web server and
      configuring the iLO (out-of-band platform management firmware) of
      the HPE DL360g10 server under test to boot from that ISO image via
      virtual media redirection using an HTTP URL. The 3 minute UEFI
      watchdog timeout set by GRUB was observed to be insufficient: the
      server rebooted on its own partway through the loading of the
      kernel and/or initramfs images.

    * A custom ISO image was built with this commit.

    * The built ISO image was uploaded to the same web server and the iLO
      configuration was modified to boot from the custom-built ISO image
      instead, also via an HTTP URL. The server was observed to load the
      kernel/initramfs and transfer control to the Linux kernel in about
      9 minutes and 10 seconds, regardless of the "Enhanced Download
      Performance" setting in the iLO.

    * The installation was allowed to continue. Without the "Enhanced
      Download Performance" setting, the installation finished in ~36 hours,
      whereas with the setting in question enabled, the installation
      finished in ~2 hours. We also observed that this setting did not
      affect the initial loading of the kernel and initramfs images by GRUB.

    Closes-Bug: 2046182
    Change-Id: Iaadf304fcc1969350e399fcd89a06ce1102df223
    Signed-off-by: M. Vefa Bicakci <email address hidden>