kube-sriovdp container in crash loop: can't find /etc/pcidp/config.json

Bug #1891889 reported by Pratik M.
This bug affects 1 person

Affects: StarlingX
Status: Won't Fix
Importance: Low
Assigned to: Steven Webster

Bug Description

Brief Description
-----------------
R4.0 AIO-DX installed on bare metal, with vswitch set to ovs-dpdk. The kube-sriovdp container is in a crash loop, and I get a bunch of DPDK EAL error logs.

[sysadmin@controller-0 ~(keystone_admin)]$ kubectl get pods -A | grep -i -e "ovs\|sriov"
kube-system kube-sriov-cni-ds-amd64-rdglr 1/1 Running 4 2d20h
kube-system kube-sriov-device-plugin-amd64-ncsh5 0/1 CrashLoopBackOff 296 24h

NIC is Intel 82599. Server is HP BL465c.

daemon.log:
2020-08-17T16:58:48.000 controller-0 ovs-vswitchd[122136]: err ovs|47933|dpdk|ERR|EAL: 0000:87:00.1 failed to select IOMMU type
2020-08-17T16:58:48.000 controller-0 ovs-vswitchd[122136]: err ovs|47934|dpdk|ERR|EAL: Driver cannot attach the device (0000:87:00.1)
2020-08-17T16:58:48.000 controller-0 ovs-vswitchd[122136]: err ovs|47935|dpdk|ERR|EAL: Failed to attach device on primary process
2020-08-17T16:58:48.000 controller-0 ovs-vswitchd[122136]: err ovs|47938|dpdk|ERR|Invalid port_id=32

dmesg has this:
[20019.001877] vfio-pci 0000:87:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement. Contact your platform vendor.
[20019.166854] vfio-pci 0000:87:00.1: Device is ineligible for IOMMU domain attach due to platform RMRR requirement. Contact your platform vendor.

[sysadmin@controller-0 ~(keystone_admin)]$ lspci | grep 87:00
87:00.0 Ethernet controller: Intel Corporation 82599 10 Gigabit Dual Port Backplane Connection (rev 01)
87:00.1 Ethernet controller: Intel Corporation 82599 10 Gigabit Dual Port Backplane Connection (rev 01)

[sysadmin@controller-0 ~(keystone_admin)]$ cat /proc/cpuinfo | grep -e "name\|flags" | sort | uniq

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts

model name : Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz

[sysadmin@controller-0 log(keystone_admin)]$ cat /proc/cmdline

BOOT_IMAGE=/vmlinuz-4.18.0-147.3.1.el8_1.7.tis.x86_64 root=UUID=59fe6053-2c2e-4e15-9520-b4b47f1fef43 ro security_profile=standard module_blacklist=integrity,ima audit=0 tboot=false crashkernel=auto biosdevname=0 console=tty0 iommu=pt usbcore.autosuspend=-1 selinux=0 enforcing=0 nmi_watchdog=panic,1 softlockup_panic=1 intel_iommu=on user_namespace.enable=1 hugepagesz=1G hugepages=22 kvm-intel.eptad=0 hugepagesz=2M hugepages=0 default_hugepagesz=1G irqaffinity=0-1,36-37 rcu_nocbs=2-35,38-71 isolcpus=2,38 kthread_cpus=0-1,36-37 nopti nospectre_v2 nospectre_v1

[sysadmin@controller-0 log(keystone_admin)]$ system host-label-list 1

+--------------+-------------------------+-------------+
| hostname | label key | label value |
+--------------+-------------------------+-------------+
| controller-0 | openstack-compute-node | enabled |
| controller-0 | openstack-control-plane | enabled |
| controller-0 | openvswitch | enabled |
| controller-0 | sriov | enabled |
| controller-0 | sriovdp | enabled |
+--------------+-------------------------+-------------+

[sysadmin@controller-0 log(keystone_admin)]$ system host-cpu-list 1 | grep -v Application

+--------------------------------------+-------+-----------+-------+--------+-------------------------------------------+-------------------+
| uuid | log_c | processor | phy_c | thread | processor_model | assigned_function |
| | ore | | ore | | | |
+--------------------------------------+-------+-----------+-------+--------+-------------------------------------------+-------------------+
| 49d2be67-409e-496d-885a-6a5aaacd6728 | 0 | 0 | 0 | 0 | Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz | Platform |
| decf7d5d-6940-438a-8a57-7f592de5fc47 | 1 | 0 | 1 | 0 | Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz | Platform |
| 9e58b2e5-3760-44a6-847a-0f42b22803f5 | 2 | 0 | 2 | 0 | Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz | vSwitch |
| ec9d85bb-28e0-4734-8e36-c8bfa19aa456 | 36 | 0 | 0 | 1 | Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz | Platform |
| dc7942b3-1273-42d6-9c77-1ea99ee27fb8 | 37 | 0 | 1 | 1 | Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz | Platform |
| 242e85fa-f87b-4293-abe1-a848af04b884 | 38 | 0 | 2 | 1 | Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz | vSwitch |
+--------------------------------------+-------+-----------+-------+--------+-------------------------------------------+-------------------+

Reproducibility
---------------
Reproducible

Steps followed from documentation:
system host-label-assign ${NODE} sriovdp=enabled

system host-memory-modify ${NODE} 0 -1G 32
system host-memory-modify ${NODE} 1 -1G 32

system host-label-assign ${NODE} openstack-control-plane=enabled
system host-label-assign ${NODE} openstack-compute-node=enabled
system host-label-assign ${NODE} openvswitch=enabled
system host-label-assign ${NODE} sriov=enabled

system modify --vswitch_type ovs-dpdk

system host-cpu-modify -f vswitch -p0 1 controller-0
system host-memory-modify -f vswitch -1G 1 controller-0 0
system host-memory-modify -f vswitch -1G 1 controller-0 1
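
For comparison, the SR-IOV interface configuration that comes up later in this report looks roughly like the following (a sketch with the guide's placeholders; these are not commands I ran as part of the steps above):

system host-if-modify ${NODE} ${DATA0IFUUID} -c pci-sriov -N <num_vfs>
system interface-datanetwork-assign ${NODE} ${DATA0IFUUID} ${PHYSNET0}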

Note: I have run CentOS 7 + DPDK applications on the same server, so is this something to do with StarlingX module settings?

Revision history for this message
Pratik M. (pvmpublic) wrote :

I did BIOS changes to disable "Shared memory", per:
https://community.hpe.com/t5/proliant-servers-ml-dl-sl/device-is-ineligible-for-iommu-domain-attach-due-to-platform/td-p/6751904

The dmesg and DPDK EAL error logs disappeared, but the container is still in a crash loop.

[sysadmin@controller-0 ~(keystone_admin)]$ dmesg | grep -e DMAR -e IOMMU | more
[ 0.000000] ACPI: DMAR 0x000000007B7E7000 0002AC (v01 HP ProLiant 00000001 HP 00000001)
[ 0.000000] DMAR: IOMMU enabled

Revision history for this message
Ghada Khalil (gkhalil) wrote :

kube-sriov is for use with Kubernetes; ovs-dpdk is for use with OpenStack. What is the configuration you are trying to set up?

tags: added: stx.networking
Changed in starlingx:
status: New → Incomplete
Revision history for this message
Pratik M. (pvmpublic) wrote :

Thank you for taking a look.

Primarily I am trying to create a test setup to exercise both virtualised and containerised workloads. They do not need to talk to each other, so I am aware of the absence of kuryr; this is a lab and I am just interested in validating the platform's suitability to run both VNFs and CNFs. I thought this was a supported/primary model.

I followed:
https://docs.starlingx.io/deploy_install_guides/r4_release/bare_metal/aio_duplex_install_kubernetes.html

which said that for both OpenStack and k8s I need to do
system host-label-assign controller-0 sriovdp=enabled

Revision history for this message
Pratik M. (pvmpublic) wrote :

OK, I re-installed R4.0 and followed all the commands for OpenStack-only, which just means that I left out the two "system host-memory-modify ${NODE} 0|1 -1G 32" commands above.

I have not done the unlock yet, but I already see the container stuck in ContainerCreating.

It seems to be looking for /etc/pcidp/config.json on the host. Maybe a race condition in the setup (the unlock will create it)?

Or is it a documentation issue?

The installation guide said:
# system interface-datanetwork-assign ${NODE} ${DATA0IFUUID} ${PHYSNET0}

But do I instead need to follow this:
https://wiki.openstack.org/wiki/StarlingX/Networking
# system host-if-modify -m 1500 -n sriov -c pci-sriov -N 5 ${COMPUTE} ${DATA0IFUUID}

Or
# system host-if-modify -m 1500 -n sriov1 -d datanet1 -c pci-sriov -N 4 --vf-driver=vfio controller-0 ens2f0

Name: kube-sriov-device-plugin-amd64-vpc49
Namespace: kube-system
Priority: 0
Node: controller-0/192.168.206.2
Start Time: Tue, 18 Aug 2020 18:09:40 +0530
Labels: app=sriovdp
              controller-revision-hash=6cfb4bff7b
              pod-template-generation=1
              tier=node
Annotations: <none>
Status: Pending
IP: 192.168.206.2
IPs:
  IP: 192.168.206.2
Controlled By: DaemonSet/kube-sriov-device-plugin-amd64
Containers:
  kube-sriovdp:
    Container ID:
    Image: registry.local:9001/docker.io/starlingx/k8s-plugins-sriov-network-device:stx.4.0-v3.2-16-g4e0302ae
    Image ID:
    Port: <none>
    Host Port: <none>
    Args:
      --log-dir=sriovdp
      --log-level=10
    State: Waiting
      Reason: ContainerCreating
    Ready: False
    Restart Count: 0
    Environment: <none>
    Mounts:
      /etc/pcidp/config.json from config (ro)
      /var/lib/kubelet/ from devicesock (rw)
      /var/log from log (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from sriov-device-plugin-token-nblkw (ro)
Conditions:
  Type Status
  Initialized True
  Ready False
  ContainersReady False
  PodScheduled True
Volumes:
  devicesock:
    Type: HostPath (bare host directory volume)
    Path: /var/lib/kubelet/
    HostPathType:
  log:
    Type: HostPath (bare host directory volume)
    Path: /var/log
    HostPathType:
  config:
    Type: HostPath (bare host directory volume)
    Path: /etc/pcidp/config.json
    HostPathType: File
  sriov-device-plugin-token-nblkw:
    Type: Secret (a volume populated by a Secret)
    SecretName: sriov-device-plugin-token-nblkw
    Optional: false
QoS Class: BestEffort
Node-Selectors: beta.kubernetes.io/arch=amd64
                 sriovdp=enabled
Tolerations: :NoSchedule
                 node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/network-unavailable:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/pid-pressure:NoSchedule
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedu...


Revision history for this message
Steven Webster (swebster-wr) wrote :

The sriovdp pod sitting in CrashLoopBackOff is a known 'issue' in the sense that if that file is missing or empty, there is nothing for the device plugin to query.

The /etc/pcidp/config.json file will get written to if:

1. The host has a configured SR-IOV interface:

 system host-if-modify <worker> <interface> -c pci-sriov -n sriov0 -N <num vfs>

2. The interface has been assigned to a data network

 system interface-datanetwork-assign <worker> <interface> <datanetwork>

3. The host is unlocked

I'm wondering where you found the docs for "# system host-if-modify -m 1500 -n sriov1 -d datanet1 -c pci-sriov -N 4 --vf-driver=vfio controller-0 ens2f0"? Assigning a data network to an interface via the host-if-modify command (-d datanet1) is legacy behaviour. If those docs are out there, they should be changed.
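
For reference, once those three conditions are met the file ends up with roughly this shape. This is only a sketch using the field names of the upstream sriov-network-device-plugin; the resource name, device ID (10ed would be an 82599 VF), and interface name here are illustrative, not read from this system:

cat /etc/pcidp/config.json
{
  "resourceList": [
    {
      "resourceName": "pci_sriov_net_physnet0",
      "selectors": {
        "vendors": ["8086"],
        "devices": ["10ed"],
        "drivers": ["ixgbevf", "vfio-pci"],
        "pfNames": ["ens2f0"]
      }
    }
  ]
}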

Revision history for this message
Steven Webster (swebster-wr) wrote :

^ For completeness: this also requires the sriovdp label, which I see you have.

system host-label-assign <worker> sriovdp=enabled
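
Once the interface is configured, assigned to a data network, and the host is unlocked, a quick sanity check would be something like the following (a sketch; the app=sriovdp label is the one visible in the pod description above):

ls -l /etc/pcidp/config.json
kubectl -n kube-system get pods -l app=sriovdp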

Revision history for this message
Pratik M. (pvmpublic) wrote :

Thank you for your inputs. I will try and update here.

@Steven, I found the third invocation in the starlingx-discuss archives, not in the documentation.

summary: - kube-sriovdp container in crash loop
+ kube-sriovdp container in crash loop: can't find /etc/pcidp/config.json
Changed in starlingx:
assignee: nobody → Steven Webster (swebster-wr)
Revision history for this message
Ghada Khalil (gkhalil) wrote :

Adding the stx.4.0 release tag for now, as this is the release the issue was reported in.

tags: added: stx.4.0
Revision history for this message
Steven Webster (swebster-wr) wrote :

Hi Pratik, are you still seeing this issue after confirming the SR-IOV interfaces are configured properly and the sriovdp label is applied?

Revision history for this message
Ghada Khalil (gkhalil) wrote :

screening: Closing as stx.4.0 is EOL

Changed in starlingx:
importance: Undecided → Low
status: Incomplete → Won't Fix