kubelet does not start due to garbled cpulist

Bug #1885316 reported by Jim Gauld
This bug affects 1 person
Affects     Status         Importance   Assigned to   Milestone
StarlingX   Fix Released   Medium       Jim Gauld

Bug Description

Brief Description:
The kubelet --reserved-cpus configuration contains a cpulist string. If the value contains a comma (e.g., 0,20), it is interpreted as an octal number with the commas dropped and is garbled into the integer 16. When the node comes up, the wrong CPUs are used. For more complex values (e.g., 0,4,40,44), the garbled integer is 18468; in that case kubelet fails to start and the node does not come up.
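
As a sanity check of the two numbers above, here is a minimal sketch that assumes the garbling amounts to dropping the commas and reading the remaining digits as base 8 (which is how the fix commit describes it). The function name garbled_value is hypothetical and is not part of sysinv or puppet.

```python
def garbled_value(cpulist):
    """Mimic the assumed mis-parse: drop commas, read the digits as octal."""
    digits = cpulist.replace(",", "")
    return int(digits, 8)


print(garbled_value("0,20"))       # 020 in octal
print(garbled_value("0,4,40,44"))  # 044044 in octal
```

Under that assumption the two calls yield 16 and 18468, matching the values observed in this report.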

This is most likely to occur when hyperthreading is enabled, depending on the CPU enumeration pattern of the BIOS. We tend to see it more on AIO Dell hardware, but it is a generic issue. It can occur on first provisioning, and subsequently if the number of platform cores is reconfigured and the result is a comma-separated list instead of a single range.
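
To illustrate why non-consecutive CPU ids matter, the following sketch collapses a set of CPU ids into a cpulist string. It is not the sysinv implementation; format_cpulist is a hypothetical helper, and the CPU numbering in the examples is only an assumed enumeration pattern.

```python
def format_cpulist(cpus):
    """Collapse CPU ids into a Linux-style cpulist such as '0-1,20-21'."""
    ordered = sorted(set(cpus))
    parts = []
    i = 0
    while i < len(ordered):
        start = end = ordered[i]
        # Extend the current run while the next id is consecutive.
        while i + 1 < len(ordered) and ordered[i + 1] == end + 1:
            i += 1
            end = ordered[i]
        parts.append(str(start) if start == end else "%d-%d" % (start, end))
        i += 1
    return ",".join(parts)


print(format_cpulist([0, 1]))           # consecutive ids: a single range
print(format_cpulist([0, 20]))          # assumed sibling threads: comma list
print(format_cpulist([0, 4, 40, 44]))   # comma list
```

Only results that contain a comma trigger the problem described here; a single range such as 0-1 is not affected.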

The root cause is that the underlying puppet hiera data requires special quoting of these values. The sysinv puppet code does this in most cases, but it was missing for the two new variables: platform::kubernetes::params::k8s_all_reserved_cpuset and platform::kubernetes::params::k8s_platform_cpuset.

The simple solution is to apply the same special quoting to these two variables, just like the other usages.

Severity:
Critical: System node will not come up.

Steps to Reproduce:
- Lock worker.
- Add platform cores.
- Unlock

Expected behaviour:
kubelet --reserved-cpus option contains proper cpulist. kubelet starts. node comes up.

Actual behaviour:
kubelet --reserved-cpus option is invalid. kubelet does not start. node does not come up.

Reproducibility:
100 percent, if result is a comma-separated cpulist or list of ranges

System configuration:
AIO-DX, hyperthreading enabled.
This can occur with all system types.

Test activity:
Evaluation.

Workaround:
This requires a source code change since these values get regenerated on unlock. Modify the source code that generates this puppet hieradata to surround cpusets with quotation marks, restart sysinv-conductor, lock/unlock affected host.

E.g., on both controllers, modify the two lines of code in /usr/lib64/python2.7/site-packages/sysinv/puppet/kubernetes.py for 'platform::kubernetes::params::k8s_all_reserved_cpuset' and 'platform::kubernetes::params::k8s_platform_cpuset' to surround the values with quotation marks, like the following:

config.update({
    'platform::kubernetes::params::k8s_cpuset':
        "\"%s\"" % k8s_cpuset,
    'platform::kubernetes::params::k8s_nodeset':
        "\"%s\"" % k8s_nodeset,
    'platform::kubernetes::params::k8s_platform_cpuset':
        "\"%s\"" % k8s_platform_cpuset,  # modify this line
    'platform::kubernetes::params::k8s_all_reserved_cpuset':
        "\"%s\"" % k8s_all_reserved_cpuset,  # modify this line
    'platform::kubernetes::params::k8s_reserved_mem': k8s_reserved_mem,
})
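
For reference, a standalone sketch of what the quoting expression produces. The helper name quote_cpuset and the example cpulist values are assumptions for illustration; the real code uses the inline format expression shown above.

```python
def quote_cpuset(value):
    """Wrap a cpulist in literal double quotes, as the workaround does."""
    return "\"%s\"" % value


# Example values are illustrative only, not taken from a real host.
config = {
    'platform::kubernetes::params::k8s_platform_cpuset':
        quote_cpuset("0,20"),
    'platform::kubernetes::params::k8s_all_reserved_cpuset':
        quote_cpuset("0,4,40,44"),
}

for key, value in sorted(config.items()):
    print("%s: %s" % (key, value))
```

The double quotes become part of the stored string, so the value is carried through as text rather than as a bare token that could be read as a number.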

On the active controller, restart sysinv-conductor process to pick up the code change:
sudo /usr/local/sbin/patch-restart-processes sysinv-conductor

Lock/unlock the affected host.

To verify the change, on the controller (note that 192.168.204.3 has the change and 192.168.204.2 does not):
sudo grep -rs cpuset /opt/platform
/opt/platform/puppet/20.06/hieradata/192.168.204.3.yaml:platform::kubernetes::params::k8s_all_reserved_cpuset: '"0-1"'
/opt/platform/puppet/20.06/hieradata/192.168.204.3.yaml:platform::kubernetes::params::k8s_cpuset: '"0-1"'
/opt/platform/puppet/20.06/hieradata/192.168.204.3.yaml:platform::kubernetes::params::k8s_platform_cpuset: '"0-1"'
/opt/platform/puppet/20.06/hieradata/192.168.204.2.yaml:platform::kubernetes::params::k8s_all_reserved_cpuset: 0-1
/opt/platform/puppet/20.06/hieradata/192.168.204.2.yaml:platform::kubernetes::params::k8s_cpuset: '"0-1"'
/opt/platform/puppet/20.06/hieradata/192.168.204.2.yaml:platform::kubernetes::params::k8s_platform_cpuset: 0-1
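
The same check can be automated. The sketch below scans grep-style output lines (path:key: value) and reports cpuset keys whose value is not wrapped in the '"..."' form. The function name unquoted_cpusets is hypothetical, and the parsing assumes the exact line layout shown in the output above.

```python
def unquoted_cpusets(lines):
    """Return (path, key, value) for cpuset entries lacking the '"..."' wrapping."""
    problems = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The file path ends at the first colon; the key itself contains '::'.
        path, _, rest = line.partition(":")
        key, sep, value = rest.rpartition(": ")
        if not sep or not key.endswith("cpuset"):
            continue
        if not (value.startswith("'\"") and value.endswith("\"'")):
            problems.append((path, key, value))
    return problems


prefix = "/opt/platform/puppet/20.06/hieradata/"
params = "platform::kubernetes::params::"
sample = [
    prefix + "192.168.204.3.yaml:" + params + "k8s_all_reserved_cpuset: '\"0-1\"'",
    prefix + "192.168.204.2.yaml:" + params + "k8s_all_reserved_cpuset: 0-1",
    prefix + "192.168.204.2.yaml:" + params + "k8s_cpuset: '\"0-1\"'",
    prefix + "192.168.204.2.yaml:" + params + "k8s_platform_cpuset: 0-1",
]

for path, key, value in unquoted_cpusets(sample):
    print("%s: %s is unquoted (%s)" % (path, key, value))
```

With the sample lines, only the two unquoted entries for 192.168.204.2 are reported.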

Ghada Khalil (gkhalil) wrote :

stx.5.0 / medium priority - issue was reported on one h/w instance. Can consider back-porting in the future if this becomes a more widespread issue for the community.

Changed in starlingx:
assignee: nobody → Jim Gauld (jgauld)
importance: Undecided → Medium
status: New → Triaged
tags: added: stx.5.0 stx.containers

Jim Gauld (jgauld) wrote :

This issue was introduced by recent commit 92828038 by Bob Church (Enable --reserved-cpus option in k8s v1.18.1).

OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/738268

Changed in starlingx:
status: Triaged → In Progress

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/738268
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=d6f0688a94b667c24f8a59a4229dfa41b7991046
Submitter: Zuul
Branch: master

commit d6f0688a94b667c24f8a59a4229dfa41b7991046
Author: Jim Gauld <email address hidden>
Date: Fri Jun 26 15:24:21 2020 -0400

    kubelet does not start due to garbled cpulist

    This adds consistent quote wrapping around sysinv puppet hieradata
    values containing complex cpulist or list of ranges.

    Two new variables were introduced by commit
    92828038b4cfa720c6dfc74fbdcb2e463ac5996d Enable --reserved-cpus option
    in k8s v1.18.1.

    platform::kubernetes::params::k8s_all_reserved_cpuset, and
    platform::kubernetes::params::k8s_platform_cpuset.

    Without the quotes, any value containing a comma gets interpreted
    as octal without comma, this results in a strange invalid integer.
    This leads to kubelet fail to start. and node will not come up.

    This issue was more likely when multiple platform cores are reserved,
    and hyperthreading is enabled, since that causes non-consecutive
    integers specified in the reserved list.

    Change-Id: I029d89cd8a6f1ca12078d9b86ae05e6660fa2af6
    Closes-Bug: 1885316
    Signed-off-by: Jim Gauld <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released

Ghada Khalil (gkhalil) wrote :

Changing the release tag to stx.4.0 since the fix made it in for that release.

tags: added: stx.4.0
removed: stx.5.0