Platform kubernetes Cgroup (k8s-infra) reported value for cpuset.cpus is incorrect
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX | Fix Released | High | Jim Gauld | | |
Bug Description
Brief Description
-----------------
Platform kubernetes cgroup reported value for cpuset.cpus (in the puppet log and in /sys/fs/
Severity
--------
standard
Steps to Reproduce
------------------
1. Install and unlock a worker node
2. Confirm the value for the platform cpu is e.g. 0,36 as follows:
platform:
<compute-1># sudo grep cpu /opt/platform/
/opt/platform/
/opt/platform/
/opt/platform/
/opt/platform/
/opt/platform/
/opt/platform/
/opt/platform/
/opt/platform/
/opt/platform/
/opt/platform/
/opt/platform/
3. Confirm the puppet hieradata also reports the k8s_cpuset as 0,36 for this worker node
<compute-1>:~# grep -rs k8s /opt/platform/
/opt/platform/
/opt/platform/
4. Confirm the platform-
compute-
::::::::::::::
/etc/systemd/
::::::::::::::
[Manager]
CPUAffinity="0,36"
5. Confirm the output from the puppet.log for the k8s-infra cpuset
compute-
2019-04-
6. Confirm the settings in the following files cpuset.mems and cpuset.cpus (in the path compute-
compute-
0
compute-
30
Expected Behavior
------------------
In step 5, expected puppet.log to report cpuset: 0,36 (not 30)
In step 6, expected the cpuset.cpus file to contain 0,36 (not 30)
Actual Behavior
----------------
See actual output in steps 5 and 6
Reproducibility
---------------
yes
System Configuration
-------
2+3
(Optional Hyperthreaded, low-latency lab yow-cgcs-
Branch/Pull Time/Commit
-------
BUILD_ID=
Timestamp/Logs
--------------
see puppet.log
2019-04-
2019-04-
Changed in starlingx: | |
assignee: | nobody → Jim Gauld (jgauld) |
Changed in starlingx: | |
status: | New → In Progress |
tags: | added: stx.config |
This is a new issue related to my recent code change https://review.openstack.org/#/c/648511/ .
The cpulist values are correct in hieradata, but are mangled when used by puppet.
The solution is to wrap the values in quotes, like we do for various other parameters.
I made the simple fix manually on this lab to /usr/lib64/python2.7/site-packages/sysinv/puppet/kubernetes.py, restarted sysinv-conductor, and locked/unlocked the compute, and the issue was resolved.
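The report does not say why 0,36 turns into 30. One plausible explanation, stated here purely as an assumption, is that the YAML loader on the puppet side ignores commas as digit separators and treats a leading zero as octal, so 0,36 is read as octal 036, which is 30. The sketch below models that behaviour and shows how wrapping the value in literal double quotes avoids it. Both function names are hypothetical and are not taken from sysinv or puppet.

```python
def parse_unquoted_scalar(text):
    """Hypothetical model of a lenient YAML 1.1-style integer resolver.

    Assumption (not confirmed by the report): commas are ignored as digit
    separators and a leading zero selects octal. Anything that does not
    look like an octal integer is returned unchanged as a string.
    """
    digits = text.replace(",", "")
    looks_octal = (
        len(digits) > 1
        and digits.startswith("0")
        and all(c in "01234567" for c in digits)
    )
    if looks_octal:
        return int(digits, 8)
    return text


def quote_value(value):
    """Wrap a value in literal double quotes so it is kept as a string."""
    return '"%s"' % value


# Unquoted cpulist is read as an octal integer under the assumed rules.
print(parse_unquoted_scalar("0,36"))
# Quoted cpulist no longer looks like an integer and passes through intact.
print(parse_unquoted_scalar(quote_value("0,36")))
```

Under these assumed rules the unquoted value resolves to the integer 30, matching the wrong value seen in steps 5 and 6, while the quoted value is left alone.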
BEFORE, looking at compute-1:
controller-0:~# grep -rs k8s_ /opt/platform/puppet/19.01/hieradata/
/opt/platform/puppet/19.01/hieradata/192.168.204.96.yaml:platform::kubernetes::params::k8s_cpuset: 0,36
/opt/platform/puppet/19.01/hieradata/192.168.204.96.yaml:platform::kubernetes::params::k8s_nodeset: '0'
AFTER manual fix, we correctly get the 0,36:
controller-0:~# grep -rs k8s_ /opt/platform/puppet/19.01/hieradata/
/opt/platform/puppet/19.01/hieradata/192.168.204.96.yaml:platform::kubernetes::params::k8s_cpuset: '"0,36"'
/opt/platform/puppet/19.01/hieradata/192.168.204.96.yaml:platform::kubernetes::params::k8s_nodeset: '"0"'
compute-1:~# grep -rs "Set k8s" /var/log/puppet/latest/puppet.log
2019-04-12T17:25:01.064 Notice: 2019-04-12 17:24:59 +0000 Scope(Class[Platform::Kubernetes::Cgroup]): Set k8s-infra nodeset: "0", cpuset: "0,36"
compute-1:/sys/fs/cgroup/cpuset/k8s-infra# cat cpuset.cpus
0,36
compute-1:/sys/fs/cgroup/cpuset/k8s-infra# cat cpuset.mems
0
compute-1:# cat /etc/systemd/system.conf.d/platform-cpuaffinity.conf
[Manager]
CPUAffinity="0,36"
The affinity of platform tasks and kubernetes tasks is on cpus 0,36 as desired (calico/bird shown):
compute-1:~$ ps-sched.sh | grep -e COMM -e bird|cut -c1-120
PID TID PPID S PO NICE RTPRIO PR AFFINITY P COMM COMMAND
46516 46516 46269 S TS 0 - 20 0x1000000001 0 runsv runsv bird
46517 46517 46269 S TS 0 - 20 0x1000000001 36 runsv runsv bird6
46684 46684 46517 S TS 0 - 20 0x1000000001 36 bird6 bird6 -R -s /var/run/
46686 46686 46516 S TS 0 - 20 0x1000000001 36 bird bird -R -s /var/run/
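To connect the AFFINITY column above to the configured cpulist, the following minimal sketch converts a cpulist string into a bitmask, with bit N set for CPU N. The function name is hypothetical, and the range handling (e.g. 2-4) is an assumption about the usual cpulist syntax rather than something shown in this report.

```python
def cpulist_to_mask(cpulist):
    """Convert a cpulist such as "0,36" or "0-3,36" into an affinity bitmask."""
    mask = 0
    for part in cpulist.split(","):
        part = part.strip()
        if "-" in part:
            # Inclusive range, e.g. "2-4" covers CPUs 2, 3 and 4.
            first, last = part.split("-")
            cpus = range(int(first), int(last) + 1)
        else:
            cpus = [int(part)]
        for cpu in cpus:
            mask |= 1 << cpu
    return mask


print(hex(cpulist_to_mask("0,36")))
```

For the cpulist 0,36 the mask has bits 0 and 36 set, which is the 0x1000000001 value reported for the bird processes.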