Brief Description
-----------------
A Kubernetes make test has been failing since the recent feature "Identify platform pods based on pod/namespace labels", https://review.opendev.org/c/starlingx/integ/+/907637 .
make test WHAT=./pkg/kubelet/cm/cpumanager GOFLAGS="-v"
FAIL: TestStaticPolicyAddWithResvList
There is also a lot of extra log noise; it is unclear whether this is related or due to an independent issue.
Perhaps this is a symptom:
E0313 11:16:55.361101 3593482 policy_static.go:690] Failed to build client config from /etc/kubernetes/kubelet.conf: stat /etc/kubernetes/kubelet.conf: no such file or directory
...
I0313 11:16:55.362138 3593482 policy_static.go:716] "Checking pod " =" for label 'app.starlingx.io/component=platform'."
I0313 11:16:55.362170 3593482 policy_static.go:724] "Pod " =" does not have 'app.starlingx.io/component=platform' label. Checking its namespace information..."
...
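The log lines above suggest a sequence: check the pod for the label, then fall back to namespace information, which needs a client config from /etc/kubernetes/kubelet.conf that does not exist in a unit-test environment. Below is a minimal Go sketch of one way such a check could take the namespace lookup as a parameter, so that tests can supply an in-memory fake. This is not the code from the patch; the names isPlatformPod, namespaceLabelGetter and fakeNamespaces are hypothetical, and the namespaces in main are example data only.

```go
package main

import (
	"errors"
	"fmt"
)

// Label taken from the log output in this report.
const (
	platformLabelKey   = "app.starlingx.io/component"
	platformLabelValue = "platform"
)

// namespaceLabelGetter is a hypothetical hook. Production code could back it
// with an API client; unit tests could back it with an in-memory map.
type namespaceLabelGetter func(namespace string) (map[string]string, error)

// hasPlatformLabel reports whether a label set marks a platform component.
// Reading from a nil map is safe in Go and yields the empty string.
func hasPlatformLabel(labels map[string]string) bool {
	return labels[platformLabelKey] == platformLabelValue
}

// isPlatformPod checks the pod labels first, then the namespace labels.
// In this sketch a lookup error is treated as "not a platform pod".
func isPlatformPod(podLabels map[string]string, namespace string, getNS namespaceLabelGetter) bool {
	if hasPlatformLabel(podLabels) {
		return true
	}
	nsLabels, err := getNS(namespace)
	if err != nil {
		return false
	}
	return hasPlatformLabel(nsLabels)
}

// fakeNamespaces builds an in-memory getter for use in tests.
func fakeNamespaces(labelsByNS map[string]map[string]string) namespaceLabelGetter {
	return func(namespace string) (map[string]string, error) {
		labels, ok := labelsByNS[namespace]
		if !ok {
			return nil, errors.New("namespace not found: " + namespace)
		}
		return labels, nil
	}
}

func main() {
	// Example data only: one labelled namespace and one unlabelled namespace.
	getNS := fakeNamespaces(map[string]map[string]string{
		"kube-system": {platformLabelKey: platformLabelValue},
		"default":     {},
	})
	fmt.Println(isPlatformPod(nil, "kube-system", getNS))
	fmt.Println(isPlatformPod(nil, "default", getNS))
}
```

With a hook like this, a test case such as the InfraPod ones in TestStaticPolicyAddWithResvList could mark its pod or namespace as platform without touching the filesystem. Whether that fits the actual patch structure is an open question.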
There is also another issue with the new patch for integ/+/907637 . As it stands, we have patch kubelet-cpumanager-infra-pods-use-system-reserved-CP.patch which adds the hardcoded list of namespaces, and kubelet-cpumanager-introduce-concept-of-isolated-CPU.patch which originally just did what was described by the name but now *also* removes the hardcoded list of platform namespaces. So now the contents of the patch no longer match the patch name, which isn't great. (Fundamentally the behaviour of platform pods is orthogonal to isolated CPUs.)
Details:
=== RUN TestStaticPolicyAddWithResvList
I0313 11:16:55.360914 3593482 fake_topology_manager.go:33] "NewFakeManager"
I0313 11:16:55.360953 3593482 policy_static.go:144] "Static policy created with configuration" options={FullPhysicalCPUsOnly:false DistributeCPUsAcrossNUMA:false AlignBySocket:false}
I0313 11:16:55.360996 3593482 policy_static.go:174] "Reserved CPUs not available for exclusive assignment" reservedSize=1 reserved="0"
I0313 11:16:55.361028 3593482 policy_static.go:716] "Checking pod " =" for label 'app.starlingx.io/component=platform'."
I0313 11:16:55.361061 3593482 policy_static.go:724] "Pod " =" does not have 'app.starlingx.io/component=platform' label. Checking its namespace information..."
E0313 11:16:55.361101 3593482 policy_static.go:690] Failed to build client config from /etc/kubernetes/kubelet.conf: stat /etc/kubernetes/kubelet.conf: no such file or directory
I0313 11:16:55.361156 3593482 policy_static.go:716] "Checking pod " =" for label 'app.starlingx.io/component=platform'."
I0313 11:16:55.361194 3593482 policy_static.go:724] "Pod " =" does not have 'app.starlingx.io/component=platform' label. Checking its namespace information..."
E0313 11:16:55.361219 3593482 policy_static.go:690] Failed to build client config from /etc/kubernetes/kubelet.conf: stat /etc/kubernetes/kubelet.conf: no such file or directory
I0313 11:16:55.361252 3593482 policy_static.go:323] "Static policy: Allocate" pod="" containerName="fakeContainer2"
I0313 11:16:55.361278 3593482 fake_topology_manager.go:55] "GetAffinity" podUID="fakePod" containerName="fakeContainer2"
I0313 11:16:55.361314 3593482 policy_static.go:349] "Topology Affinity" pod="" containerName="fakeContainer2" affinity={NUMANodeAffinity:<nil> Preferred:false}
I0313 11:16:55.361340 3593482 policy_static.go:420] "AllocateCPUs" numCPUs=8 socket=<nil>
E0313 11:16:55.361397 3593482 policy_static.go:354] "Unable to allocate CPUs" err="not enough cpus available to satisfy request" pod="" containerName="fakeContainer2" numCPUs=8
I0313 11:16:55.361430 3593482 fake_topology_manager.go:33] "NewFakeManager"
I0313 11:16:55.361461 3593482 policy_static.go:144] "Static policy created with configuration" options={FullPhysicalCPUsOnly:false DistributeCPUsAcrossNUMA:false AlignBySocket:false}
I0313 11:16:55.361504 3593482 policy_static.go:174] "Reserved CPUs not available for exclusive assignment" reservedSize=2 reserved="0-1"
I0313 11:16:55.361532 3593482 policy_static.go:716] "Checking pod " =" for label 'app.starlingx.io/component=platform'."
I0313 11:16:55.361560 3593482 policy_static.go:724] "Pod " =" does not have 'app.starlingx.io/component=platform' label. Checking its namespace information..."
E0313 11:16:55.361587 3593482 policy_static.go:690] Failed to build client config from /etc/kubernetes/kubelet.conf: stat /etc/kubernetes/kubelet.conf: no such file or directory
I0313 11:16:55.361615 3593482 policy_static.go:716] "Checking pod " =" for label 'app.starlingx.io/component=platform'."
I0313 11:16:55.361636 3593482 policy_static.go:724] "Pod " =" does not have 'app.starlingx.io/component=platform' label. Checking its namespace information..."
E0313 11:16:55.361656 3593482 policy_static.go:690] Failed to build client config from /etc/kubernetes/kubelet.conf: stat /etc/kubernetes/kubelet.conf: no such file or directory
I0313 11:16:55.361682 3593482 policy_static.go:323] "Static policy: Allocate" pod="" containerName="fakeContainer2"
I0313 11:16:55.361705 3593482 fake_topology_manager.go:55] "GetAffinity" podUID="fakePod" containerName="fakeContainer2"
I0313 11:16:55.361741 3593482 policy_static.go:349] "Topology Affinity" pod="" containerName="fakeContainer2" affinity={NUMANodeAffinity:<nil> Preferred:false}
I0313 11:16:55.361767 3593482 policy_static.go:420] "AllocateCPUs" numCPUs=1 socket=<nil>
I0313 11:16:55.361868 3593482 policy_static.go:452] "AllocateCPUs" result="4"
I0313 11:16:55.361895 3593482 policy_static.go:359] [cpumanager] guaranteed: AddContainer (namespace: , pod UID: fakePod, pod: , container: fakeContainer2); numCPUS=1, cpuset=4
I0313 11:16:55.361937 3593482 fake_topology_manager.go:33] "NewFakeManager"
I0313 11:16:55.361969 3593482 policy_static.go:144] "Static policy created with configuration" options={FullPhysicalCPUsOnly:false DistributeCPUsAcrossNUMA:false AlignBySocket:false}
I0313 11:16:55.362012 3593482 policy_static.go:174] "Reserved CPUs not available for exclusive assignment" reservedSize=2 reserved="0-1"
I0313 11:16:55.362040 3593482 policy_static.go:716] "Checking pod " =" for label 'app.starlingx.io/component=platform'."
I0313 11:16:55.362074 3593482 policy_static.go:724] "Pod " =" does not have 'app.starlingx.io/component=platform' label. Checking its namespace information..."
E0313 11:16:55.362102 3593482 policy_static.go:690] Failed to build client config from /etc/kubernetes/kubelet.conf: stat /etc/kubernetes/kubelet.conf: no such file or directory
I0313 11:16:55.362138 3593482 policy_static.go:716] "Checking pod " =" for label 'app.starlingx.io/component=platform'."
I0313 11:16:55.362170 3593482 policy_static.go:724] "Pod " =" does not have 'app.starlingx.io/component=platform' label. Checking its namespace information..."
E0313 11:16:55.362193 3593482 policy_static.go:690] Failed to build client config from /etc/kubernetes/kubelet.conf: stat /etc/kubernetes/kubelet.conf: no such file or directory
I0313 11:16:55.362218 3593482 policy_static.go:323] "Static policy: Allocate" pod="" containerName="fakeContainer3"
I0313 11:16:55.362242 3593482 fake_topology_manager.go:55] "GetAffinity" podUID="fakePod" containerName="fakeContainer3"
I0313 11:16:55.362275 3593482 policy_static.go:349] "Topology Affinity" pod="" containerName="fakeContainer3" affinity={NUMANodeAffinity:<nil> Preferred:false}
I0313 11:16:55.362301 3593482 policy_static.go:420] "AllocateCPUs" numCPUs=2 socket=<nil>
I0313 11:16:55.362389 3593482 policy_static.go:452] "AllocateCPUs" result="4-5"
I0313 11:16:55.362410 3593482 policy_static.go:359] [cpumanager] guaranteed: AddContainer (namespace: , pod UID: fakePod, pod: , container: fakeContainer3); numCPUS=2, cpuset=4-5
I0313 11:16:55.362456 3593482 fake_topology_manager.go:33] "NewFakeManager"
I0313 11:16:55.362484 3593482 policy_static.go:144] "Static policy created with configuration" options={FullPhysicalCPUsOnly:false DistributeCPUsAcrossNUMA:false AlignBySocket:false}
I0313 11:16:55.362520 3593482 policy_static.go:174] "Reserved CPUs not available for exclusive assignment" reservedSize=2 reserved="0-1"
I0313 11:16:55.362547 3593482 policy_static.go:716] "Checking pod " =" for label 'app.starlingx.io/component=platform'."
I0313 11:16:55.362579 3593482 policy_static.go:724] "Pod " =" does not have 'app.starlingx.io/component=platform' label. Checking its namespace information..."
E0313 11:16:55.362602 3593482 policy_static.go:690] Failed to build client config from /etc/kubernetes/kubelet.conf: stat /etc/kubernetes/kubelet.conf: no such file or directory
policy_static_test.go:1058: StaticPolicy Allocate() error (InfraPod, SingleSocketHT, ExpectAllocReserved). expected container fakeContainer2 to be present in assignments map[fakePod:map[fakeContainer100:2-3,6-7]]
policy_static_test.go:1063: StaticPolicy Allocate() error (InfraPod, SingleSocketHT, ExpectAllocReserved). expected cpuset 0-1 but got
I0313 11:16:55.362733 3593482 fake_topology_manager.go:33] "NewFakeManager"
I0313 11:16:55.362763 3593482 policy_static.go:144] "Static policy created with configuration" options={FullPhysicalCPUsOnly:false DistributeCPUsAcrossNUMA:false AlignBySocket:false}
I0313 11:16:55.362806 3593482 policy_static.go:174] "Reserved CPUs not available for exclusive assignment" reservedSize=2 reserved="0-1"
I0313 11:16:55.362836 3593482 policy_static.go:716] "Checking pod " =" for label 'app.starlingx.io/component=platform'."
I0313 11:16:55.362870 3593482 policy_static.go:724] "Pod " =" does not have 'app.starlingx.io/component=platform' label. Checking its namespace information..."
E0313 11:16:55.362896 3593482 policy_static.go:690] Failed to build client config from /etc/kubernetes/kubelet.conf: stat /etc/kubernetes/kubelet.conf: no such file or directory
policy_static_test.go:1058: StaticPolicy Allocate() error (InfraPod, SingleSocketHT, Isolcpus, ExpectAllocReserved). expected container fakeContainer2 to be present in assignments map[fakePod:map[fakeContainer100:2-3,6-7]]
policy_static_test.go:1063: StaticPolicy Allocate() error (InfraPod, SingleSocketHT, Isolcpus, ExpectAllocReserved). expected cpuset 0 but got
--- FAIL: TestStaticPolicyAddWithResvList (0.00s)
Severity
--------
Major: We don't accept future patches built on top of broken patches,
although the kubelet binary itself does not appear to be broken.
In theory we should also have a Zuul job that enforces that these tests pass.
Future maintainability: we should refactor integ/+/907637 so that
the following two patches are orthogonal:
kubelet-cpumanager-infra-pods-use-system-reserved-CP.patch, and
kubelet-cpumanager-introduce-concept-of-isolated-CPU.patch .
Steps to Reproduce
------------------
Follow the guide to build Kubernetes with patches.
E.g., set up the correct Go version, make a fresh Kubernetes clone, apply the incremental patches, then:
// Build kubelet and kubeadm (since we modified the source)
make WHAT=cmd/kubelet
make WHAT=cmd/kubeadm
// Run all tests for cpumanager, since that is the area of kubelet we have changed
make test WHAT=./pkg/kubelet/cm GOFLAGS="-v"
make test WHAT=./pkg/kubelet/cm/cpuset GOFLAGS="-v"
make test WHAT=./pkg/kubelet/cm/cpumanager GOFLAGS="-v"
make test WHAT=./pkg/kubelet/cm/cpumanager/state GOFLAGS="-v"
make test WHAT=./pkg/kubelet/cm/cpumanager/topology GOFLAGS="-v"
make test WHAT=./pkg/kubelet/cm/topologymanager GOFLAGS="-v"
make test WHAT=./pkg/kubelet/cm/devicemanager GOFLAGS="-v"
make test WHAT=./pkg/kubelet/kuberuntime GOFLAGS="-v"
// Run subset of cpumanager tests (eg, tests starting with Test)
make test WHAT=./pkg/kubelet/cm/cpumanager GOFLAGS="-v" KUBE_TEST_ARGS="-run Test"
// Run subset of cpumanager tests (eg, tests ending in PolicyName)
make test WHAT=./pkg/kubelet/cm/cpumanager GOFLAGS="-v" KUBE_TEST_ARGS="-run PolicyName$"
We have to run the tests in all of the above directories, since our incremental patches touch those areas.
Expected Behavior
------------------
All make tests PASS.
Actual Behavior
----------------
FAIL: TestStaticPolicyAddWithResvList
Reproducibility
---------------
100%.
System Configuration
--------------------
Not applicable; the issue occurs during build.
Branch/Pull Time/Commit
-----------------------
Recent.
Last Pass
---------
Before submission of integ/+/907637 .
Timestamp/Logs
--------------
Not applicable. Sufficient info above.
Test Activity
-------------
Feature development.
Workaround
----------
None. Reverting the commit would be a possibility.
Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/integ/+/914810