Networking: When the loopback is unconfigured, the AIO manifest fails in apply_network_config.sh

Bug #1973614 reported by Andre Kantek
Affects    Status        Importance  Assigned to   Milestone
StarlingX  Fix Released  Medium      Andre Kantek

Bug Description

Brief Description

Puppet does not create the /var/run/network-scripts.puppet/ifcfg-lo file during the unlock when the management (and cluster-host) networks have been removed from the lo interface before controller-0 is unlocked.

The problem was observed on both CentOS and Debian environments.

Severity

Major: Controller-0 fails to unlock because of a puppet error that leads to a configuration failure. This prevents Debian AIO DX integration (see Workaround).

Steps to Reproduce

source /etc/platform/openrc
OAM_IF=enp0s8
MGMT_IF=enp0s3
system host-if-modify controller-0 lo -c none
IFNET_UUIDS=$(system interface-network-list controller-0 | awk '{if ($6 =="lo") print $4;}')
for UUID in $IFNET_UUIDS; do
    system interface-network-remove ${UUID}
done

system host-if-modify controller-0 $OAM_IF -c platform
system interface-network-assign controller-0 $OAM_IF oam
system host-if-modify controller-0 $MGMT_IF -c platform
system interface-network-assign controller-0 $MGMT_IF mgmt
system interface-network-assign controller-0 $MGMT_IF cluster-host

sleep 4

system host-unlock controller-0
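
Before issuing the unlock, it can help to confirm that no networks remain attached to lo. The following is a minimal sketch that reuses the awk column positions from the steps above ($4 = uuid, $6 = ifname). The function name ifnet_uuids_for and the sample table are hypothetical and exist only so the snippet runs without a StarlingX system.

```shell
#!/bin/sh
# Print the interface-network UUIDs attached to the interface named in $1,
# reading the table printed by `system interface-network-list <host>` on stdin.
# Column positions follow the awk filter used in the reproduction steps.
ifnet_uuids_for() {
    awk -v ifname="$1" '$6 == ifname { print $4 }'
}

# Hypothetical sample table for illustration (not captured from a real system).
sample_table() {
    cat <<'EOF'
+--------------+------+--------+--------------+
| hostname     | uuid | ifname | network_name |
+--------------+------+--------+--------------+
| controller-0 | aaaa | lo     | mgmt         |
| controller-0 | bbbb | lo     | cluster-host |
| controller-0 | cccc | enp0s8 | oam          |
+--------------+------+--------+--------------+
EOF
}

remaining=$(sample_table | ifnet_uuids_for lo)
if [ -n "$remaining" ]; then
    # Unquoted expansion joins the UUIDs onto one line.
    echo "networks still on lo:" $remaining
else
    echo "no networks on lo"
fi
```

On a live controller, the sample_table call would be replaced by `system interface-network-list controller-0`.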

Expected Behavior

controller-0 unlocks without error

Actual Behavior

controller-0 fails with a configuration error

Reproducibility

100% reproducible

System Configuration

AIO DX

Load info (eg: 2022-03-10_20-00-07)

On a running system use cat /etc/build.info and grep BUILD_ID

Branch and the time when code was pulled or git commit or cengn load info

Last Pass

Timestamp/Logs

2022-05-04T19:49:20.344 Debug: 2022-05-04 19:49:20 +0000 Executing: 'apply_network_config.sh'

2022-05-04T19:49:20.466 Notice: 2022-05-04 19:49:20 +0000 /Stage[main]/Platform::Network::Apply/Exec[apply-network-config]/returns: /bin/rm: cannot remove '/var/run/network-scripts.puppet//auto': No such file or directory

2022-05-04T19:49:20.468 Notice: 2022-05-04 19:49:20 +0000 /Stage[main]/Platform::Network::Apply/Exec[apply-network-config]/returns: /bin/rm: cannot remove '/var/run/network-scripts.puppet//ifcfg-*': No such file or directory

2022-05-04T19:49:20.470 Error: 2022-05-04 19:49:20 +0000 'apply_network_config.sh' returned 1 instead of one of [0]
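
To check for the condition behind these log lines, one can test whether a loopback file was generated in the puppet network-scripts directory. This is a minimal diagnostic sketch, not part of stx-puppet: the helper name has_lo_cfg is hypothetical, and the demo runs against a throwaway directory so it works on any machine.

```shell
#!/bin/sh
# Succeed if the directory given in $1 contains an ifcfg-lo file.
# has_lo_cfg is a hypothetical helper used only for this illustration.
has_lo_cfg() {
    [ -f "$1/ifcfg-lo" ]
}

report() {
    if has_lo_cfg "$1"; then
        echo "ifcfg-lo present"
    else
        echo "ifcfg-lo missing"
    fi
}

# Demonstrate on a temporary directory: first empty, then with the file created.
demo_dir=$(mktemp -d)
report "$demo_dir"
: > "$demo_dir/ifcfg-lo"
report "$demo_dir"
rm -rf "$demo_dir"
```

On an affected controller, the directory to pass would be /var/run/network-scripts.puppet.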

Alarms

Configuration failure alarm

Test Activity

Debian AIO DX Development

Workaround

1) Do not execute the command:
system host-if-modify controller-0 lo -c none

Andre Kantek (akantek)
Changed in starlingx:
assignee: nobody → Andre Kantek (akantek)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/stx-puppet/+/842096

Changed in starlingx:
status: New → In Progress
Ghada Khalil (gkhalil)
tags: added: stx.7.0 stx.networking
Changed in starlingx:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (master)

Reviewed: https://review.opendev.org/c/starlingx/stx-puppet/+/842096
Committed: https://opendev.org/starlingx/stx-puppet/commit/e9dd002b98133fbe96837ce101042636f00623b2
Submitter: "Zuul (22348)"
Branch: master

commit e9dd002b98133fbe96837ce101042636f00623b2
Author: Andre Fernando Zanella Kantek <email address hidden>
Date: Mon May 16 10:44:04 2022 -0400

    Do not abort the network script if the loopback is unconfigured.

    This protection was added on change:
    https://review.opendev.org/c/starlingx/stx-puppet/+/752081

    But it is deemed unnecessary as it is possible to set the loopback to
    class none. On bootstrap, the loopback is set to the class platform
    and received by the mgmt cluster-host networks. This initial setup
    can then be modified, in duplex installations, by moving the networks
    to physical interfaces and leaving the loopback as class none (but
    the operator can leave it as class=platform).

    With the loopback class set to none, the puppet-network module will
    not generate the file /var/run/network-scripts.puppet/ifcfg-lo, which
    will result in the AIO manifest applying error.

    There are already several validations on unlock to prevent invalid
    configurations (e.g., one cannot unlock if the mgmt network is not
    anchored on a platform interface). To add this validation to this
    internal script is just preventing the execution without actually
    protecting the system.

    On the tests below the management and cluster-host network were moved
    to the physical interfaces.

    Test Plan:
    PASS Set interface loopback to class none and unlock

    No side effects were detected.

    Closes-bug: 1973614

    Signed-off-by: Andre Fernando Zanella Kantek <email address hidden>
    Change-Id: I17f4070667e2dabca2cedc059cb0609cee444092

Changed in starlingx:
status: In Progress → Fix Released