Bonding interface does not come up after host-unlock

Bug #1981765 reported by Steven Webster
Affects     Status         Importance   Assigned to      Milestone
StarlingX   Fix Released   Medium       Steven Webster

Bug Description

Brief Description
-----------------
Adding a bond interface on a management or cluster-host network without a VLAN on top can cause the bond interface to fail to come up after host-unlock.

Severity
--------
Major: System/Feature is usable but degraded

Steps to Reproduce
------------------
After bootstrap:

for i in $(system interface-network-list 1 | grep controller |awk '{print $4}');do echo $i;system interface-network-remove $i;done
system host-if-modify controller-0 lo -c none
system host-if-add controller-0 -a active_standby -c platform bond0 ae ens2f0 ens2f1
system host-if-add controller-0 -a active_standby -c platform bond1 ae ens3f0 ens3f1
system host-if-add -V 755 -c platform controller-0 mgmt0 vlan bond1
system host-if-add -V 754 -c platform controller-0 oam0 vlan bond1
system interface-network-assign controller-0 bond0 cluster-host
system interface-network-assign controller-0 bond1 pxeboot
system interface-network-assign controller-0 oam0 oam
system interface-network-assign controller-0 mgmt0 mgmt
system host-unlock controller-0

Expected Behavior
------------------
The bond0 interface should be up.

Actual Behavior
----------------
The bond0 interface is not up.
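
One way to confirm the state after unlock is to read the interface's operstate from sysfs. The following is a minimal sketch, not part of the original report: the function name iface_operstate and its optional sys_root argument are hypothetical, added so the check can be pointed at a test directory.

```shell
# Print the kernel-reported operational state of an interface.
# sys_root is a hypothetical parameter for testing; it defaults to /sys.
iface_operstate() {
    iface="$1"
    sys_root="${2:-/sys}"
    state_file="$sys_root/class/net/$iface/operstate"
    if [ -r "$state_file" ]; then
        # The kernel writes values such as "up" or "down" here.
        cat "$state_file"
    else
        # No such interface (or sysfs entry not readable).
        echo "missing"
    fi
}

# Example: iface_operstate bond0
```

On an affected host, this would be expected to report something other than "up" for bond0.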

Reproducibility
---------------
100%

System Configuration
--------------------
N/A

Branch/Pull Time/Commit
-----------------------
master

Last Pass
---------
N/A
This code has been in place for a couple of years, but most configurations put cluster-host and mgmt on a VLAN, which is why the problem has not been seen previously.

Test Activity
-------------
Evaluation

Workaround
----------
Remove the following command from /etc/sysconfig/ifcfg-<bond-interface>:

/sbin/modprobe bonding; echo +bond0 > /sys/class/net/bonding_masters

Then run ifup <bond-interface>.
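
The workaround can be scripted roughly as follows. This is a sketch under assumptions: it assumes the command sits on a line of its own in the file (so dropping every line that mentions bonding_masters is safe), and the function name remove_bonding_masters_line is hypothetical. It keeps a .bak copy so the change can be reverted.

```shell
# Remove any line that references bonding_masters from an ifcfg file.
# A backup of the original is kept next to it as <file>.bak.
remove_bonding_masters_line() {
    cfg="$1"
    cp -p "$cfg" "$cfg.bak" || return 1
    # grep -v exits non-zero when it prints nothing; that is not an error here.
    grep -v 'bonding_masters' "$cfg.bak" > "$cfg" || true
}

# Example (path as given in the workaround; substitute the real bond name):
#   remove_bonding_masters_line /etc/sysconfig/ifcfg-<bond-interface>
#   ifup <bond-interface>
```

If the command is embedded in a longer line in a particular configuration, edit the file by hand instead.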

OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/config/+/849905

Changed in starlingx:
status: New → In Progress

Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.8.0
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/849905
Committed: https://opendev.org/starlingx/config/commit/462a1cd967f2ae5f55eee080c45c660e7b3f9af7
Submitter: "Zuul (22348)"
Branch: master

commit 462a1cd967f2ae5f55eee080c45c660e7b3f9af7
Author: Steven Webster <email address hidden>
Date: Thu Jul 14 21:04:16 2022 -0400

    Fix bonding interface sysconfig pre-up params

    An issue was noted when attempting to use a bonded interface on
    a management or cluster-host network without an upper VLAN
    interface. The problem turned out to be the following pre-up
    command in the sysconfig file associated with the bond:

    /sbin/modprobe bonding; echo +%s > /sys/class/net/bonding_masters

    The code which programs this command was added in 2019 to fix
    bug 1836969:

    https://opendev.org/starlingx/config/commit/d0ad539f831d9aef7a7d7d653ff0537f47264852

    However, it is noted that today, this command will fail as the bonded
    interface is already created. Trying to add it to the
    bonding_masters list will fail, leaving the interface in a 'down'
    state.

    The reason this code was added was to be able to disable DAD in
    a duplex-direct system, where the duplicate address detection
    would not complete until both hosts were powered on and
    initialized.

    This commit ensures that:

    1. The interface is only added to the bonding_masters
       in a duplex-direct system (in order to be able to
       disable DAD before the interface comes up)
    2. In the case of a duplex-direct system, if the
       interface is already added to the bonding_masters,
       it won't be added again.

    Note:

    The underlying ifup upstream code already accounts for
    the situation that an interface has been added to the
    bonding_masters list, so it is safe for us to explicitly
    add it in a pre-up directive in the case that DAD must
    be disabled.

    Testing:

    1. Ensure the bonding interface (without VLAN) comes up
    2. Ensure in a duplex-direct system that the accept_dad is
       able to be set (regression test bug 1836969)

    Change-Id: I4f712bbbbfa75adfcccbb737df60109db2fef1ee
    Closes-Bug: 1981765
    Signed-off-by: Steven Webster <email address hidden>
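
The "don't add it again" behaviour described in point 2 of the commit message can be illustrated with a small guard. This is a sketch only, not the code from the merged change: the function name add_bond_if_missing and its optional masters_file argument are hypothetical, and the duplex-direct check from point 1 is left out.

```shell
# Add a bond to bonding_masters only if it is not already listed.
# masters_file is a hypothetical parameter for testing; it defaults to the
# real sysfs file, which holds a space-separated list of bond names.
add_bond_if_missing() {
    bond="$1"
    masters_file="${2:-/sys/class/net/bonding_masters}"
    # Split the list into one name per line and look for an exact match.
    if tr ' ' '\n' < "$masters_file" | grep -qxF -- "$bond"; then
        echo "already present"
    else
        # Writing "+name" to bonding_masters asks the kernel to create the bond.
        echo "+$bond" > "$masters_file"
        echo "added"
    fi
}

# Example: add_bond_if_missing bond0
```

Checking first avoids the failing write that left the interface down in the original pre-up command.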

Changed in starlingx:
status: In Progress → Fix Released