Kata container runtime does not include support for SR-IOV devices
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Triaged | Low | Unassigned |
Bug Description
Brief Description
-----------------
The kata runtime shipped with StarlingX does not fully support SR-IOV network devices assigned to a pod/container. The devices can be assigned, but the kata kernel currently has no driver support for them.
Severity
--------
Major performance feature not available with kata runtime
Steps to Reproduce
------------------
- Ensure that pci-sriov classed interface(s) are assigned to a worker node
- (system host-if-modify <worker> <interface> -c pci-sriov -n sriov1 -N <number of VFs>)
- Ensure the sriovdp label is applied
- (system host-label-assign <worker> sriovdp=enabled)
- Ensure the SR-IOV interface is assigned to a data network
- (system interface-
- Launch a pod with SR-IOV devices and observe that the devices can be seen with lspci
- Observe that the device cannot be bound to any driver and is not usable
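The last two steps can be checked from a shell inside the pod by looking at the device's sysfs entry. The following is a minimal sketch, not part of the original report: the function name pci_driver_of and the example PCI address are hypothetical, and it assumes the standard Linux sysfs layout in which a bound device has a "driver" symlink.

```shell
# Print the kernel driver bound to a PCI device, or "none" if it is unbound.
# $1: PCI address as shown by lspci -D, e.g. 0000:00:05.0 (hypothetical address)
# $2: sysfs root (defaults to /sys; can be overridden for testing)
pci_driver_of() {
    pci_addr="$1"
    sysfs_root="${2:-/sys}"
    driver_link="$sysfs_root/bus/pci/devices/$pci_addr/driver"
    if [ -L "$driver_link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/vfio-pci
        basename "$(readlink "$driver_link")"
    else
        echo "none"
    fi
}
```

On an affected system, running pci_driver_of with the address of the SR-IOV VF inside the Kata pod would be expected to print "none", matching the behavior described above.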
Sample network attachment definition spec:
apiVersion: "k8s.cni.
kind: NetworkAttachme
metadata:
name: sriov1
annotations:
k8s.
spec:
config: '{
"cniVersion": "0.3.0",
"type": "sriov"
}'
Sample pod spec:
apiVersion: v1
kind: Pod
metadata:
name: testpod1
annotations:
k8s.
spec:
runtimeClassName: kata
containers:
- name: appcntr1
image: centos/tools
imagePullPo
command: [ "/bin/bash", "-c", "--" ]
args: [ "while true; do sleep 300000; done;" ]
resources:
requests:
cpu: 2
memory: "1Gi"
intel.
limits:
cpu: 2
memory: "1Gi"
intel.
Expected Behavior
------------------
See above
Actual Behavior
----------------
See above
Reproducibility
---------------
100%
System Configuration
-------
All
Branch/Pull Time/Commit
-------
BUILD_DATE=
Last Pass
---------
Likely never
Timestamp/Logs
--------------
See above
Test Activity
-------------
Developer Testing
Workaround
----------
A custom kata kernel and rootfs must be built to include the appropriate driver support
tags: added: stx.networking
Changed in starlingx:
assignee: Lin Shuicheng (shuicheng) → nobody
I have done some investigative work to determine what would need to be done to improve our support for
SR-IOV and Kata containers:
1. It should be documented that for Kata containers, the SR-IOV device must be bound to the VFIO driver
in the host. Currently Kata can only pass through an SR-IOV device using a vfio driver.
For example:
system host-if-modify <worker> <sriov_interface> --vf-driver=vfio
2. We would need a method to bind the driver appropriately in the Kata VM itself so that it shows up in the
container as a kernel network device (netdevice), or is bound again to vfio in the VM.
I think one method of allowing this would be to include the standard network drivers and vfio as kernel modules,
and have the user be able to decide which driver is used based on the kata kernel_modules pod annotation:
io.katacontainers.config.agent.kernel_modules: "vfio; vfio-pci"
To support this, the following modifications would be needed:
2.1 Specify the kata containers pod annotation prefix in the containerd config.toml file:
/etc/containerd/config.toml
In the 'plugins.cri.containerd.runtimes.kata' section:
pod_annotations = ["io.katacontainers.*"]
2.2 Specify appropriate kernel_params in the kata-containers configuration.toml:
/usr/share/defaults/kata-containers/configuration.toml
The kernel params need to be set with iommu and the vendor:device id of the supported network devices.
For example:
kernel_params = "iommu=pt intel_iommu=on vfio-pci.ids=8086:154c"
Setting the vfio-pci.ids in this way means if "vfio; vfio-pci" is in the kernel_modules annotation,
the devices will be bound to vfio automatically in the Kata VM.
Note: This means it would be tricky, if not impossible, to have a mix of vfio and netdevice devices in the same container.
Alternatively, it might be better to just document that the user can specify the kernel_params as
a pod annotation, similar to kernel_modules.
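As an illustration of step 2.2, the kernel_params value could be generated from a list of supported vendor:device IDs rather than written by hand. This is only a sketch: the helper name build_kernel_params is hypothetical, and it relies on the vfio-pci "ids" module parameter accepting a comma-separated list of vendor:device pairs.

```shell
# Build a kernel_params string that pre-binds the given devices to vfio-pci.
# Arguments: one or more vendor:device IDs, e.g. 8086:154c
build_kernel_params() {
    vfio_ids=""
    for vfio_id in "$@"; do
        # Join IDs with commas, without a leading comma on the first one
        vfio_ids="${vfio_ids:+$vfio_ids,}$vfio_id"
    done
    printf 'iommu=pt intel_iommu=on vfio-pci.ids=%s\n' "$vfio_ids"
}
```

Calling build_kernel_params 8086:154c reproduces the example value above; the resulting string would still need to be placed in configuration.toml or passed through whichever annotation mechanism is chosen.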
3. A custom Kata kernel will need to be built for StarlingX which includes the supported network/vfio
drivers as kernel modules.
Ref:
https://github.com/kata-containers/osbuilder
https://github.com/kata-containers/packaging/tree/master/kernel
https://github.com/kata-containers/documentation/blob/master/use-cases/using-SRIOV-and-kata.md
For example:
CONFIG_IGB=m
CONFIG_IGBVF=m
CONFIG_IXGB=m
CONFIG_IXGBE=m
CONFIG_IXGBEVF=m
CONFIG_I40E=m
CONFIG_I40EVF=m
+<Mellanox Drivers>
CONFIG_VFIO_IOMMU_TYPE1=m
CONFIG_VFIO_VIRQFD=m
CONFIG_VFIO=m
CONFIG_VFIO_NOIOMMU=y
CONFIG_VFIO_PCI=m
CONFIG_VFIO_PCI_MMAP=y
CONFIG_VFIO_PCI_INTX=y
CONFIG_VFIO_PCI_IGD=y
CONFIG_VFIO_MDEV=m
CONFIG_VFIO_MDEV_DEVICE=m
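Before building, it may help to confirm that the kernel config actually enables the options listed above. The sketch below is an assumption-laden helper rather than part of the Kata tooling: the name missing_config_opts is hypothetical, and it simply greps a kernel config file for each option set to =m or =y.

```shell
# Print each required option that is not set to =m or =y in a kernel config.
# $1: path to the kernel config file; remaining arguments: option names
missing_config_opts() {
    kconfig_file="$1"
    shift
    for kconfig_opt in "$@"; do
        # Anchor on "=" so CONFIG_VFIO does not match CONFIG_VFIO_PCI
        if ! grep -Eq "^${kconfig_opt}=(m|y)$" "$kconfig_file"; then
            echo "$kconfig_opt"
        fi
    done
}
```

For example, missing_config_opts .config CONFIG_VFIO CONFIG_VFIO_PCI CONFIG_I40EVF prints nothing when all three options are enabled, and otherwise prints the names that still need to be added.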
4. A custom Kata rootfs will need to be built for StarlingX which contains the kernel modules
Ref:
https:/ /github. com/kata- containers/ osbuilder
My rootfs build example looks something like this:
In the kernel build, I had to modify the build-kernel.sh script to add the following to the build_kernel()
function:
make -j $(nproc) ARCH="${arch_target}"
+make ARCH="${arch_target}" mod...