Debian: Kickstart hangs when ipv6 dhclient fails to resolve

Bug #1993342 reported by Eric MacDonald
This bug affects 1 person
Affects    Status        Importance  Assigned to     Milestone
StarlingX  Fix Released  Low         Eric MacDonald

Bug Description

Brief Description
-----------------
Running the following command in an ipv6 configuration can hang the kickstart indefinitely if the DHCP request never gets a response.

dhclient -6 <interface>

This is a known issue in dhclient - https://bugzilla.redhat.com/show_bug.cgi?id=585047

A change is needed to prevent the hang. One option may be to add -1 to the command line to give the request a timeout.

Severity
--------
Major: The success path works fine.
However, the install fails/hangs when the network is misconfigured or the DHCP server is not running.

Note that the install will not succeed, or even start, if the DHCP server is not running. So this issue is really a corner case: the management interface is reconfigured incorrectly during the interface setup phase of an ipv6 install, for example by re-assigning it to a separate vlan that does not work.

Steps to Reproduce
------------------
Install a system node with management interface on a dysfunctional vlan

Expected Behavior
------------------
kickstart does not hang

Actual Behavior
----------------
kickstart hangs on 'dhclient -6 vlan###' request in interface setup phase

Reproducibility
---------------
100% reproducible

System Configuration
--------------------
Any DX System node install

Branch/Pull Time/Commit
-----------------------
Any Debian load prior to the date of this bug report.

Last Pass
---------
This is the first time this issue has been seen. The success path has no issue.

Timestamp/Logs
--------------
The last log in the kickstart is:

<date .....> dhclient -6 vlan330 || true
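
The trailing `|| true` in that log line only masks a non-zero exit status. It does nothing for a command that never returns, which is consistent with the hang being the last thing logged. A small sketch of that shell behavior, using a hypothetical `failing_request` function in place of a DHCP request that fails quickly:

```shell
#!/bin/sh
# Hypothetical stand-in for a request that fails fast with a non-zero status.
failing_request() {
    return 2
}

# '|| true' runs only after the left-hand command has exited, and
# replaces its failure status with success.
failing_request || true
echo "status seen by the script: $?"
```

A command that blocks forever never reaches the `|| true`, so the script simply stops at that line.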

Test Activity
-------------
PV lab configuration change to add a new worker node

Workaround
----------
Fix the vlan networking

Ghada Khalil (gkhalil)
tags: added: stx.metal stx.networking
Changed in starlingx:
importance: Undecided → Low
OpenStack Infra (hudson-openstack) wrote : Fix proposed to metal (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/metal/+/861920

Changed in starlingx:
status: New → In Progress
Changed in starlingx:
assignee: nobody → Eric MacDonald (rocksolidmtce)
OpenStack Infra (hudson-openstack) wrote : Fix merged to metal (master)

Reviewed: https://review.opendev.org/c/starlingx/metal/+/861920
Committed: https://opendev.org/starlingx/metal/commit/654e18e9db200441e0a44a30f6797aa48fcb7fd0
Submitter: "Zuul (22348)"
Branch: master

commit 654e18e9db200441e0a44a30f6797aa48fcb7fd0
Author: Eric MacDonald <email address hidden>
Date: Wed Oct 19 19:46:07 2022 +0000

    Prevent kickstart hang when ipv6 dhclient fails to resolve

    The interface setup post phase of the kickstart issues a
    dhclient (dhcp) request for IP address. Normally this executes
    fine and an ip address (lease) is acquired.

    However, in a failure mode case in ipv6 mode that dhclient
    request will hang there waiting until the dhcp server responds.
    So, if there is a network configuration error that precludes
    dhclient from getting a response the kickstart and therefore
    the entire installation process hangs.

    This is a known issue/behavior in dhclient that is typically
    worked around with the -1 option.
    - https://bugzilla.redhat.com/show_bug.cgi?id=585047
    - https://linux.die.net/man/8/dhclient

    Rather than using the -1 option which changes the behavior
    with fixed 30 second timeout, this update uses the linux 'timeout'
    command with a chosen 60 second upper bound on the vlan dhclient
    (dhcp) request. If the request does not complete in that time
    then it is terminated in error, allowing the kickstart to proceed.

    Test Plan: Change does not affect CentOS in any way.

    PASS: Verify Debian build and iso install for ipv6 and ipv4.
    PASS: Verify success path with and without vlan in ipv4 and ipv6
    PASS: verify failure path handling in ipv6 vlan case
    PASS: Verify logging

    Closes-Bug: 1993342
    Signed-off-by: Eric MacDonald <email address hidden>
    Change-Id: I2853f52b79e0f82c0a2e645fdeb9e7b7aa4f0a9e
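
The timeout-based approach described in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual kickstart change: `run_bounded` is a hypothetical helper name, `sleep` stands in for a `dhclient -6 <interface>` request that never completes, and a 1 second limit is used in place of the 60 second bound the commit mentions. It assumes the GNU coreutils `timeout` command, which exits with status 124 when it has to terminate the command.

```shell
#!/bin/sh
# Hypothetical helper: run a command with an upper bound (in seconds)
# on its runtime, and log when the bound is hit.
run_bounded() {
    limit="$1"
    shift
    timeout "$limit" "$@"
    rc=$?
    # GNU timeout reports 124 when the time limit expired.
    if [ "$rc" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$rc"
}

# Stand-in for a hung DHCP request: 'sleep 5' outlasts the 1 second limit.
# '|| true' then lets the script carry on despite the error status.
run_bounded 1 sleep 5 || true
echo "script continues"
```

In the kickstart, the equivalent would presumably be a `timeout 60` prefix on the existing `dhclient -6 vlan### || true` line; the exact form used is in the linked review.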

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
tags: added: stx.8.0