The sriov device plugin pod may start before its config manifest is written
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
StarlingX | Triaged | Low | Steven Webster | |
Bug Description
Brief Description
-----------------
If an SR-IOV interface VF driver is changed from vfio to netdevice, it's possible that the SR-IOV device plugin pod can start before the manifest is applied to change the /etc/pcidp/
As a result, the SR-IOV device plugin looks for vfio-bound devices and does not find any. It is then not possible to launch a pod that uses the SR-IOV interfaces until the SR-IOV device plugin is restarted (or the host is locked and unlocked again).
Severity
--------
Major: System/Feature is usable but degraded
Steps to Reproduce
------------------
system host-lock
system host-if-modify <worker> -n sriov0 -c pci-sriov -N <num_vfs> --vf-driver=vfio <interface_uuid>
system host-unlock ... wait for system to come up
system host-lock
system host-if-modify <worker> -n sriov0 -c pci-sriov -N <num_vfs> --vf-driver=
system host-unlock
Expected Behavior
------------------
The /etc/pcidp/
Actual Behavior
----------------
It appears the pod has started before the file is written. This was confirmed by looking at the logs of the device plugin.
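One way to confirm this ordering without reading the plugin logs is to compare the pod's start time with the modification time of the config file. The following is a minimal sketch under stated assumptions: `started_before_config` and `parse_k8s_time` are hypothetical helper names, the config path is passed in by the caller, and the start time is assumed to be the pod's `status.startTime` in the usual `YYYY-MM-DDTHH:MM:SSZ` form.

```python
import os
import tempfile
from datetime import datetime, timezone


def parse_k8s_time(value: str) -> datetime:
    """Parse a UTC timestamp such as '2019-10-01T12:00:00Z' (assumed format)."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def started_before_config(config_path: str, pod_start: datetime) -> bool:
    """Return True if the pod started before config_path was last written.

    A True result suggests the pod read a stale (or missing) config.
    """
    mtime = datetime.fromtimestamp(os.path.getmtime(config_path), tz=timezone.utc)
    return pod_start < mtime


if __name__ == "__main__":
    # Demo against a temporary file standing in for the real config file.
    with tempfile.NamedTemporaryFile() as cfg:
        pod_start = parse_k8s_time("2019-10-01T12:00:00Z")
        print("pod started before config was written:",
              started_before_config(cfg.name, pod_start))
```

On a real host, the first argument would be the plugin's config file and the second the start time reported for the device plugin pod.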
Reproducibility
---------------
Seen once (so far)
System Configuration
-------
One node system
Branch/Pull Time/Commit
-------
master
BUILD_DATE=
Last Pass
---------
I believe this is the first time this has been seen
Workaround
---------
Delete the sriov device plugin pod
Or
lock/unlock the host
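The first workaround can be sketched as a `kubectl delete pod` call; the owning daemonset then recreates the pod, which re-reads the config. The namespace `kube-system` and the label selector `app=sriovdp` below are assumptions and are not taken from this report; check them against the running system before use. The helper name `delete_plugin_pod_cmd` is hypothetical.

```python
import shlex

# Assumptions (not stated in the bug report): adjust to match the deployment.
NAMESPACE = "kube-system"
SELECTOR = "app=sriovdp"


def delete_plugin_pod_cmd(namespace: str = NAMESPACE, selector: str = SELECTOR) -> list:
    """Build the kubectl argv that deletes the pods matching the selector."""
    return ["kubectl", "--namespace", namespace,
            "delete", "pod", "--selector", selector]


if __name__ == "__main__":
    # Only print the command; run it by hand (or pass the list to
    # subprocess.run) once the namespace and selector are confirmed.
    print(shlex.join(delete_plugin_pod_cmd()))
```

Printing rather than executing keeps the sketch safe to run on any machine, including one without `kubectl` installed.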
Test Activity
-------------
Developer testing
Changed in starlingx:
status: Triaged → In Progress
Changed in starlingx:
status: In Progress → Triaged
Marking as stx.3.0 / medium priority - would be nice to fix to avoid an extra lock/unlock for the host