Nodes don't become "online" after install due to network interface renaming

Bug #1981831 reported by Steven Webster
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Steven Webster

Bug Description

Brief Description
-----------------
The installer (anaconda) appears to use interfaces eno1 and eno2 after PXE boot, and it creates the network ifcfg scripts accordingly.

After the first reboot, the interfaces eno1/eno2 no longer exist; they have been renamed to enp25s0f0/enp25s0f1.

This second rename is not visible in the installer.

This affects only Mellanox devices with OFED version 5.4 or later, because the kernel module installs a second udev rule (82-net-setup-link.rules) that overrides the one created at install time (70-persistent-net.rules).

This also affects only systems with 'onboard' Mellanox devices. Most systems use a separate MLX PCI card and would not experience this issue, because the renaming in the two udev rules files would be consistent.

Severity
--------

Critical: System/Feature is not usable due to the defect

Steps to Reproduce
------------------
Must have a system with onboard Mellanox devices using the mlx5_core driver.

PXE boot (install).
After the first reboot, the /etc/sysconfig/ifcfg-<name> file will not correspond to the actual MLX device name.
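
One way to confirm the mismatch after the first reboot is to compare the ifcfg-<name> files against the devices the kernel actually exposes. The following is a minimal sketch, not part of the original report: the function name stale_ifcfg_names is hypothetical, the configuration directory is taken as written above (adjust it if the files live elsewhere on the system), and VLAN or alias file names are not treated specially.

```python
import os
from typing import Iterable, List


def stale_ifcfg_names(ifcfg_files: Iterable[str],
                      live_interfaces: Iterable[str]) -> List[str]:
    """Return names that have an ifcfg-<name> file but no live device."""
    live = set(live_interfaces)
    stale = []
    for filename in ifcfg_files:
        base = os.path.basename(filename)
        if not base.startswith("ifcfg-"):
            continue  # not an interface configuration file
        name = base[len("ifcfg-"):]
        if name == "lo":
            continue  # loopback is never renamed
        if name not in live:
            stale.append(name)
    return sorted(stale)


if __name__ == "__main__":
    cfg_dir = "/etc/sysconfig"   # path as written in this report (assumption)
    net_dir = "/sys/class/net"   # one entry per network device on Linux
    if os.path.isdir(cfg_dir) and os.path.isdir(net_dir):
        for name in stale_ifcfg_names(os.listdir(cfg_dir), os.listdir(net_dir)):
            print(f"ifcfg-{name} has no matching network device")
```

On an affected system, the names reported would be expected to be the install-time names (eno1/eno2 in this report).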

Expected Behavior
-----------------
The renaming should be consistent between install time and runtime.

Actual Behavior
---------------
The renaming is not consistent between install time and runtime.

Reproducibility
---------------
100%

System Configuration
--------------------
N/A, just requires onboard MLX devices

Branch/Pull Time/Commit
-----------------------
master

Last Pass
---------
N/A; I have never seen a system with onboard MLX devices before.

Timestamp/Logs
--------------
N/A not needed

Test Activity
-------------
Evaluation

Workaround
----------
Change PHYSDEV in the relevant /etc/sysconfig/ifcfg-<name> file to match the device name seen at runtime.
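
The workaround could be scripted roughly as follows. This is only a sketch under assumptions: the helper name set_physdev and the sample file contents are hypothetical, and it rewrites text in memory rather than touching any file, so the result can be reviewed before it is written back.

```python
import re

# Matches a whole "PHYSDEV=<value>" line in shell-style key=value text.
_PHYSDEV_LINE = re.compile(r"^PHYSDEV=.*$", re.MULTILINE)


def set_physdev(ifcfg_text: str, new_device: str) -> str:
    """Return ifcfg_text with every PHYSDEV= line pointing at new_device.

    Text without a PHYSDEV= line is returned unchanged.
    """
    # A function replacement avoids backslash interpretation in new_device.
    return _PHYSDEV_LINE.sub(lambda _match: "PHYSDEV=" + new_device, ifcfg_text)


if __name__ == "__main__":
    # Hypothetical ifcfg content, for illustration only.
    sample = "DEVICE=vlan100\nPHYSDEV=eno1\nONBOOT=yes\n"
    print(set_physdev(sample, "enp25s0f0"), end="")
```

The new device name would be the one seen at runtime (enp25s0f0 in this report).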

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kernel (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/kernel/+/850023

Changed in starlingx:
status: New → In Progress
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.8.0
OpenStack Infra (hudson-openstack) wrote : Fix merged to kernel (master)

Reviewed: https://review.opendev.org/c/starlingx/kernel/+/850023
Committed: https://opendev.org/starlingx/kernel/commit/b4f21b2de119ed1d2469714e97fb03875a561923
Submitter: "Zuul (22348)"
Branch: master

commit b4f21b2de119ed1d2469714e97fb03875a561923
Author: Steven Webster <email address hidden>
Date: Fri Jul 15 11:11:21 2022 -0400

    Enable mlx5 onboard device udev naming

    This commit introduces a patch which prioritizes the udev
    (re)naming rules for mlx5 controlled devices to take the
    onboard device name (if it exists) over the slot/path name.

    This is consistent with the naming order used by the StarlingX
    installer to create the 70-persistent-net.rules file.

    It is also consistent with the naming order in the 99-default.link
    file.

    Without this patch, there could be an inconsistency with the
    70-persistent-net.rules first re-naming the device to its
    slot/path name, and then being overridden by the Mellanox specific
    82-net-setup-link.rules.

    Testing:

    - Ensure if a system has mlx onboard devices, that the interfaces
      are named according to the onboard name. (CentOS)
    - Note: Since I have depended on a third party with access to a
      (CentOS) system with onboard MLX devices, the Debian portion of
      this patch has been confirmed to build successfully, but not
      functionally tested.

    Closes-Bug: 1981831

    Signed-off-by: Steven Webster <email address hidden>
    Change-Id: I79f308b39debd8e5ffd131bc90262a7ab6345e41
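
The naming order described in the commit message (onboard name first, then slot, then path) can be illustrated with a short sketch. This is not the patch itself, whose contents are not shown in this report; the function name pick_interface_name is hypothetical, and the property keys are assumed to be the ID_NET_NAME_* values that systemd's udev net_id builtin exports for a device.

```python
from typing import Mapping, Optional

# Order assumed from the commit message: onboard, then slot, then path.
NAME_PRIORITY = (
    "ID_NET_NAME_ONBOARD",
    "ID_NET_NAME_SLOT",
    "ID_NET_NAME_PATH",
)


def pick_interface_name(props: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty candidate name in priority order."""
    for key in NAME_PRIORITY:
        value = props.get(key)
        if value:
            return value
    return None  # no predictable name available; the kernel name stays


if __name__ == "__main__":
    # Property values modelled on the device names in this report.
    onboard_nic = {"ID_NET_NAME_ONBOARD": "eno1", "ID_NET_NAME_PATH": "enp25s0f0"}
    print(pick_interface_name(onboard_nic))
```

With this ordering, a device that has an onboard name keeps it at runtime, which matches the name the installer used when it wrote the ifcfg files.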

Changed in starlingx:
status: In Progress → Fix Released