vfio-pci module not loaded if vswitch_type=none

Bug #1829565 reported by Steven Webster
This bug affects 1 person
Affects:     StarlingX
Status:      Fix Released
Importance:  High
Assigned to: Steven Webster

Bug Description

Brief Description
-----------------
If the vswitch_type of the system is set to "none", the vfio-pci module is not loaded on worker nodes. This makes sense for an OpenStack-enabled system. However, with the introduction of the multus/sriov CNI plugins, it causes a problem on a non-OpenStack-enabled node if the user wants to use a DPDK-enabled NetworkAttachmentDefinition.

The sriov-cni will try to bind a device to the vfio-pci driver. Because the module is not loaded, the bind fails, the sriov-cni reports an error, and the pod is unable to launch.

The workaround is to simply load the module manually, but we want to have this occur without user intervention.
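The manual workaround could look roughly like the sketch below. It is not taken from the fix; the function names module_loaded and ensure_vfio_pci are hypothetical, and the optional second argument exists only so the check can be pointed at a file other than /proc/modules. Note that the kernel lists the module as vfio_pci (underscore) even though modprobe accepts vfio-pci.

```shell
#!/bin/sh
# Return 0 if the named module is listed in the modules file.
# $1: module name (dashes are normalized to underscores)
# $2: modules file, defaults to /proc/modules (assumption: standard Linux procfs)
module_loaded() {
    name=$(printf '%s' "$1" | tr '-' '_')
    modules_file=${2:-/proc/modules}
    grep -q "^${name} " "$modules_file"
}

# Load vfio-pci only if it is not already present.
# Must be run as root on the worker node.
ensure_vfio_pci() {
    if ! module_loaded vfio-pci; then
        modprobe vfio-pci
    fi
}
```

Running ensure_vfio_pci as root on the affected worker before launching the pod should be equivalent to loading the module by hand. It does not persist across reboots.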

Severity
--------
Major: Unable to use a DPDK-enabled NetworkAttachmentDefinition in a pod without first loading the vfio-pci module manually.

Steps to Reproduce
------------------
- Configure system with vswitch_type="none"
- Enable the SRIOV device plugin on the worker: system host-label-assign <hostname> sriovdp=enabled
- Create a DPDK enabled NetworkAttachmentDefinition
- Launch a pod referencing the NetworkAttachmentDefinition

Expected Behavior
------------------
We should automatically load the vfio-pci module if the label sriovdp=enabled is set. This can probably be achieved with an sriov-cni init container.

Actual Behavior
----------------
The user must load the module manually.

Reproducibility
---------------
Reproducible

System Configuration
--------------------
Any system with non-openstack enabled workers and sriovdp=enabled label.

Branch/Pull Time/Commit
-----------------------
Master

Changed in starlingx:
assignee: nobody → Steven Webster (swebster-wr)
Ghada Khalil (gkhalil) wrote :

Marking as release gating; related to new multus/sriov container support

tags: added: stx.2.0 stx.networking
Changed in starlingx:
importance: Undecided → High
status: New → Triaged
Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/661757
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=13b142ff8ba25b6035b461218ef86d9fb14db2ad
Submitter: Zuul
Branch: master

commit 13b142ff8ba25b6035b461218ef86d9fb14db2ad
Author: Steven Webster <email address hidden>
Date: Mon May 27 14:25:28 2019 -0500

    Integration with latest SR-IOV CNI images

    As part of the ongoing development of the sriov-cni and
    sriov-device-plugin, the DPDK NetworkAttachmentDefinition
    configuration options have been deprecated.

    Previously, we used this functionality to have the sriov-cni
    plugin perform the device bind from netdevice (kernel) to
    vfio (userspace), and simply set sriov-device-plugin
    deviceType configuration parameter to 'netdevice'.

    Going forward, we must add a mechanism for a user to define
    the deviceType at the interface configuration level. This
    means an SR-IOV enabled device can no longer have a mix of
    netdevice and vfio chosen by the NetworkAttachmentDefinition.
    That is, it must be determined by the user beforehand which
    type of virtual function driver (kernel or DPDK) a device's
    VFs should have.

    This commit includes the cgtsclient, API, DB and puppet
    related changes required for a user to set the VF driver type.

    In terms of the cgts-client, the following parameter has been
    added: --vf-driver. Example usage for a device intended to
    be used with a DPDK application is as follows:

    system host-if-modify -m 1500 -n sriov0 -d ${DATANET} \
      -c pci-sriov -N ${NUM_VFS} --vf-driver=vfio ${WORKER_NAME} \
      ${INTERFACE_UUID}

    If the user does not specify a vf-driver, the default device
    type will remain as it is today as 'netdevice'. The user can
    also choose to explicitly set the --vf-driver to 'netdevice'
    for the same effect. In this case, a check is made to ensure
    the VF driver has been detected and reported by the sysinv
    agent.

    Story: 2005208
    Task: 33485
    Closes-Bug: 1829565
    Change-Id: I8f6f27b79c7fafa03873e71473f7694991142e50
    Signed-off-by: Steven Webster <email address hidden>
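
After setting --vf-driver as described in the commit message, one way to confirm which driver each VF ended up bound to is to read the sysfs symlinks directly. The sketch below is an illustration only and not part of the commit. The function name list_vf_drivers is hypothetical, the PF interface name is whatever port backs the sriov interface on the worker, and the expectation that --vf-driver=vfio shows up as vfio-pci is an assumption. The second argument exists only so the function can be run against a directory other than /sys.

```shell
#!/bin/sh
# Print "<virtfnN> <PCI address> <driver>" for each VF of a physical function.
# $1: PF network interface name (hypothetical example: ens1f0)
# $2: sysfs root, defaults to /sys
list_vf_drivers() {
    pf=$1
    sysfs=${2:-/sys}
    for vf in "$sysfs/class/net/$pf/device"/virtfn*; do
        # The glob stays literal when the PF has no VFs; skip that case.
        [ -e "$vf" ] || continue
        # Each virtfnN entry is a symlink to the VF's PCI device directory.
        addr=$(basename "$(readlink -f "$vf")")
        # The driver symlink is absent when the VF is not bound to any driver.
        if [ -e "$vf/driver" ]; then
            drv=$(basename "$(readlink -f "$vf/driver")")
        else
            drv=none
        fi
        printf '%s %s %s\n' "${vf##*/}" "$addr" "$drv"
    done
}
```

For example, list_vf_drivers ens1f0 prints one line per VF. A VF configured for a DPDK application would be expected to show vfio-pci in the last column, and a kernel VF would show its netdevice driver.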

Changed in starlingx:
status: In Progress → Fix Released