Bonding interface does not come up on networking.service stop/start/restart

Bug #2010119 reported by Steven Webster
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Fabiano Correa Mercer
Milestone: (none listed)

Bug Description

Brief Description
-----------------
Reported to the stx mailing list: https://lists.starlingx.io/pipermail/starlingx-discuss/2023-March/013877.html

It was seen that after 24 hours of operation, a worker node lost connectivity over the management network. The management network was assigned to a bonded interface.

systemctl status networking.service showed a failure to bring up the bonded interface:

● networking.service - Raise network interfaces
     Loaded: loaded (/lib/systemd/system/networking.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Fri 2023-03-10 15:28:30 UTC; 12s ago
       Docs: man:interfaces(5)
    Process: 395870 ExecStart=/sbin/ifup -a --read-environment (code=exited, status=1/FAILURE)
   Main PID: 395870 (code=exited, status=1/FAILURE)
        CPU: 311ms

This appears to be a known bug in the ifenslave 2.12 Debian package.

Severity
--------
Major: System/Feature is usable but degraded

Steps to Reproduce
------------------
Create a bonded platform interface, then stop and start networking.service:

system host-if-add controller-0 -c platform -a balanced -x layer2 ae0 ae enp0s9 enp0s10

systemctl stop networking.service
systemctl start networking.service
systemctl status networking.service
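
The stop/start/status steps above could be wrapped in a small check. This is only a sketch and was not part of the original report: the function names and the SYSFS_ROOT override are hypothetical, and it assumes the bond's state can be read from the kernel's operstate file in sysfs.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel's operational state of an interface.
# SYSFS_ROOT is an assumption added so the function can be pointed at a fake
# directory tree; on a real host it defaults to /sys.
iface_operstate() {
    cat "${SYSFS_ROOT:-/sys}/class/net/$1/operstate" 2>/dev/null || echo "missing"
}

# Hypothetical helper: succeed only if the interface reports "up".
iface_is_up() {
    [ "$(iface_operstate "$1")" = "up" ]
}

# Stop and start networking.service as in the steps above, then report
# whether both the service and the bond came back.
check_bond_after_restart() {
    bond="$1"
    systemctl stop networking.service
    systemctl start networking.service
    if systemctl is-active --quiet networking.service && iface_is_up "$bond"; then
        echo "PASS: networking.service active, $bond is up"
    else
        echo "FAIL: networking.service or $bond did not come up"
        return 1
    fi
}
```

With the functions sourced, `check_bond_after_restart ae0` would exercise the bond created above. Stopping networking.service can drop an SSH session that runs over the affected network, so a console session is the safer place to try it.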

Expected Behavior
------------------
The bonded interface should come up

Actual Behavior
----------------
The bonded interface does not come up

Reproducibility
---------------
100%

System Configuration
--------------------
N/A

Branch/Pull Time/Commit
-----------------------
stx 8.0

Last Pass
---------
Unknown but probably worked in stx 7.0

Timestamp/Logs
--------------
N/A

Test Activity
-------------
Unknown

Workaround
----------
This appears to have been fixed in ifenslave 2.13:

https://packages.debian.org/bullseye/ifenslave
https://metadata.ftp-master.debian.org/changelogs//main/i/ifenslave/ifenslave_2.13~deb11u1_changelog
https://salsa.debian.org/debian/ifenslave/-/commit/c82a70f38898526870634169c7bb933c5e0f9987

Not tested, but a host lock/unlock would probably also resolve the issue.
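
To see whether a given host still carries the older package, the installed version can be compared against the fixed one. This is a minimal sketch under one assumption: that 2.13~deb11u1 (the bullseye build linked above) is the earliest version containing the fix. The function names are hypothetical; `dpkg-query` and `dpkg --compare-versions` are the standard Debian tools.

```shell
#!/bin/sh
# Version assumed to be the first one carrying the fix (from the links above).
FIXED_VERSION="2.13~deb11u1"

# Hypothetical helper: succeed if the given package version predates the fix.
# dpkg --compare-versions applies Debian version ordering, including "~".
ifenslave_is_affected() {
    dpkg --compare-versions "$1" lt "$FIXED_VERSION"
}

# Hypothetical helper: look up the installed version and report on it.
report_ifenslave() {
    installed="$(dpkg-query -W -f='${Version}' ifenslave 2>/dev/null)"
    if [ -z "$installed" ]; then
        echo "ifenslave is not installed"
    elif ifenslave_is_affected "$installed"; then
        echo "ifenslave $installed predates $FIXED_VERSION"
    else
        echo "ifenslave $installed is at or above $FIXED_VERSION"
    fi
}
```

Running `report_ifenslave` on a worker node prints one of the three messages; it only reads package metadata and changes nothing.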

Changed in starlingx:
assignee: nobody → Fabiano Correa Mercer (fcorream)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tools (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/tools/+/881769

Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tools (master)

Reviewed: https://review.opendev.org/c/starlingx/tools/+/881769
Committed: https://opendev.org/starlingx/tools/commit/33104366d8e82400c96da6996573c01bbf46cb29
Submitter: "Zuul (22348)"
Branch: master

commit 33104366d8e82400c96da6996573c01bbf46cb29
Author: Fabiano Mercer <email address hidden>
Date: Thu Apr 27 21:32:56 2023 -0300

    Use ifenslave 2.13~deb11u1 stable release

    ifenslave 2.13~deb11u1 is the current stable release for bullseye:
    https://packages.debian.org/bullseye/net/ifenslave
    The differences between 2.12 and 2.13~deb11u1 are listed below:

    ifenslave (2.13~deb11u1) bullseye; urgency=medium
      * Rebuild for bullseye
      * Revert "Bump Standards-Version to 4.6.0 (no changed needed)"

    ifenslave (2.13) unstable; urgency=medium
      * QA upload.
      * Fix MAC address setting messed up by udev for bond interfaces.
        (Closes: #949062)
      * Use ifquery instead of example contrib script ifstate.
       (Closes: #991930)
      * Fix ifquery redirections.
      * Bump Standards-Version to 4.6.0 (no changed needed).
      * Remove long supported Linux version requirements from Description.
      * Use correct argument in setup_slave_device(). (Closes: #968368)
      * Handle slave definitions of interfaces with no bond settings.
        (Closes: #990428)
      * Delete bond interfaces on ifdown -a. (Closes: #992102)

    The fix for #991930 replaced the use of ifstate with ifquery.

    The latest release added the file 98-net-bonding.link in
    the lib/systemd/network/ directory (fix for #949062).

    The content of this file is as follows:
       [Match]
       Driver=bonding

       [Link]
       MACAddressPolicy=none

    The config above applies only to interfaces that use the
    bonding driver, and it forces the bond interface to keep the MAC
    address assigned by the kernel.

    According to the tests, these additional fixes have no
    negative impact on StarlingX.

    Test plan
    PASS AIO-SX fresh install
    PASS AIO-DX fresh install
    PASS Created platform bond interfaces
         of types: 802.3ad, balanced-xor and active-standby
         check that bond-master param is present in the bond slave
         ifaces config in /etc/network/interfaces.d/
         systemctl stop networking.service
         systemctl start networking.service
         systemctl status networking.service
         networking.service must be active

    Closes-Bug: #2010119

    Change-Id: I66a4a62770776ec7e5ce98ca1aa259c74f218ac9
    Signed-off-by: Fabiano Mercer <email address hidden>
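
The .link file described in the commit message can be checked on an installed host. The sketch below is illustrative only: the function name is hypothetical, and the default path assumes the file lands at /lib/systemd/network/98-net-bonding.link, as the commit message's "lib/systemd/network/" suggests.

```shell
#!/bin/sh
# Default location assumed from the commit message ("lib/systemd/network/").
LINK_FILE_DEFAULT="/lib/systemd/network/98-net-bonding.link"

# Hypothetical helper: succeed if the .link file matches the bonding driver
# and sets MACAddressPolicy=none. Leading and trailing whitespace is tolerated.
bonding_link_ok() {
    file="${1:-$LINK_FILE_DEFAULT}"
    [ -r "$file" ] || return 1
    grep -Eq '^[[:space:]]*Driver=bonding[[:space:]]*$' "$file" &&
        grep -Eq '^[[:space:]]*MACAddressPolicy=none[[:space:]]*$' "$file"
}
```

For example, `bonding_link_ok && echo "link file present"` would confirm that the two settings quoted in the commit message are in place. It does not verify that udev actually applied the file to a given bond.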

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
Ghada Khalil (gkhalil) wrote:

Fix merged in the stx active branch which is built nightly: https://mirror.starlingx.cengn.ca/mirror/starlingx/master/debian/monolithic/
Fix will be automatically included in the next stx release: stx.9.0

Email sent to the reporter via the stx mailing list to ask whether the fix needs to be cherry-picked to r/stx.8.0: https://lists.starlingx.io/pipermail/starlingx-discuss/2023-May/014033.html

05/03 Update: No response from the reporter, but we decided to proceed with the cherry-pick because the change is small and low risk.

description: updated
tags: added: stx.9.0
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tools (r/stx.8.0)

Fix proposed to branch: r/stx.8.0
Review: https://review.opendev.org/c/starlingx/tools/+/882061

OpenStack Infra (hudson-openstack) wrote : Fix merged to tools (r/stx.8.0)

Reviewed: https://review.opendev.org/c/starlingx/tools/+/882061
Committed: https://opendev.org/starlingx/tools/commit/272ec7c0e30c3f2186f31e04caee23ccba7892ec
Submitter: "Zuul (22348)"
Branch: r/stx.8.0

commit 272ec7c0e30c3f2186f31e04caee23ccba7892ec
Author: Fabiano Mercer <email address hidden>
Date: Thu Apr 27 21:32:56 2023 -0300

    Use ifenslave 2.13~deb11u1 stable release

    Closes-Bug: #2010119

    Change-Id: I66a4a62770776ec7e5ce98ca1aa259c74f218ac9
    Signed-off-by: Fabiano Mercer <email address hidden>
    (cherry picked from commit 33104366d8e82400c96da6996573c01bbf46cb29)
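
The test plan in the commit message checks that the bond-master parameter is present in the slave interface configs under /etc/network/interfaces.d/. A rough way to script that check is sketched below; the function names are hypothetical and the file layout inside that directory is not assumed beyond "one or more plain-text config files".

```shell
#!/bin/sh
# Hypothetical helper: list the config files in a directory that contain a
# bond-master line. The directory defaults to the path named in the test plan.
files_with_bond_master() {
    dir="${1:-/etc/network/interfaces.d}"
    grep -lE '^[[:space:]]*bond-master[[:space:]]+[^[:space:]]+' "$dir"/* 2>/dev/null
}

# Hypothetical helper: succeed if the given slave config names the given bond.
slave_has_bond_master() {
    slave_cfg="$1"
    bond="$2"
    grep -Eq "^[[:space:]]*bond-master[[:space:]]+${bond}[[:space:]]*\$" "$slave_cfg"
}
```

Calling `files_with_bond_master` with no arguments lists the matching files on the host; `slave_has_bond_master <file> ae0` checks one file against the bond name used in the report.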
