Kubernetes hugepage feature: 1G pages not updated based on system memory configuration

Bug #1830927 reported by Steven Webster
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Don Penney

Bug Description

Brief Description
-----------------
On a non-OpenStack system with vswitch_type set to "none", the available Kubernetes hugepages may not be updated after increasing the number of application hugepages.

I have seen this occur for 1G pages; I'm not sure if the same applies to 2M pages.

I have seen this occur on an AIO-DX system. I did not see it on a Standard system, but in that case I may have had ovs_dpdk as the vswitch.

Severity
--------
Major: System/Feature is usable but degraded

Steps to Reproduce
------------------
- After initial install, ensure vswitch_type is set to "none"

system modify -v "none"

- Verify that no openstack labels are present on the system, so that the Kubernetes hugepage feature is enabled.

- Configure some 1G (app) hugepages on controller-1

  system host-memory-modify controller-1 -1G 2 0
  system host-memory-modify controller-1 -1G 2 1

- Unlock the host

system host-unlock controller-1

After the node comes up, look at the Kubernetes hugepage allocation

kubectl get node controller-1 -o json | grep huge

            "hugepages-1Gi": "0",
            "hugepages-2Mi": "0",
            "hugepages-1Gi": "0",
            "hugepages-2Mi": "0",

Expected Behavior
------------------

The kubernetes available hugepages should be updated to reflect what's configured on the system.

Actual Behavior
----------------

In some cases the hugepages are never updated.

Reproducibility
---------------

There may be a race between when the hugepages are configured via puppet/sysfs and when kubelet starts and audits hugepages.

See a similar issue here: https://github.com/kubernetes/kubernetes/issues/64309

The reason this never occurs when vswitch_type is something other than "none" is that, in that case, 1G (app and vswitch) hugepages are configured via grub, so they are present from boot time. Note that updating the grub parameters causes an extra reboot after unlock, which may be why app pages were originally configured via sysfs.
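As a sketch of the runtime (sysfs) allocation path that races with kubelet startup, the writes puppet performs look roughly like this — shown against a scratch directory standing in for /sys, since writing the real nodes needs root and enough contiguous free memory:

```shell
# Scratch directory standing in for the real /sys tree.
SYSFS=$(mktemp -d)
HP="$SYSFS/devices/system/node/node0/hugepages/hugepages-1048576kB"
mkdir -p "$HP"
echo 0 > "$HP/nr_hugepages"

# Runtime allocation of two 1G pages on NUMA node 0, as the puppet manifest
# does. Pages allocated this way appear only after kubelet has already
# started, which is the race described above.
echo 2 > "$HP/nr_hugepages"
cat "$HP/nr_hugepages"
```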

Workaround
--------------------

In compute.pp:

- if $::is_gb_page_supported and $::platform::params::vswitch_type != 'none' {
+ if $::is_gb_page_supported {

system host-memory-modify controller-1 -f vswitch -1G 1 0
system host-memory-modify controller-1 -f vswitch -1G 1 1

Suggestion
--------------------

It's probably necessary to always configure 1G pages via grub when 1G app pages are requested; grub is the generally recommended mechanism for 1G pages anyway, rather than sysfs. However, we'd want to update the grub parameters only when memory (app/vswitch) has actually changed, to avoid the extra reboot after unlock.
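For reference, boot-time allocation of 1G pages is driven by kernel command-line parameters, so the pages exist before kubelet starts; the relevant grub settings look along these lines (page count illustrative, not StarlingX's generated values):

```shell
# /etc/default/grub -- illustrative page count only
GRUB_CMDLINE_LINUX="... default_hugepagesz=1G hugepagesz=1G hugepages=4"
```

With this approach the pages are carved out at boot, which is why the vswitch path never hits the race; the tradeoff is the extra reboot whenever the parameters change.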

System Configuration
--------------------
AIO-DX

Branch/Pull Time/Commit
-----------------------
master

Test Activity
-------------
Developer Testing

Revision history for this message
Brent Rowsell (brent-rowsell) wrote :

Restarting the kubelet will allow k8s to discover the hugepages.
This seems to be an init order issue.

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Marking as release gating; issue with hugepage allocations on all-in-one systems due to init order issues.

tags: added: stx.2.0 stx.containers
Changed in starlingx:
importance: Undecided → High
status: New → Triaged
assignee: nobody → Don Penney (dpenney)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/662373

Changed in starlingx:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/662373
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=0b427ca4a772d7edad8df8956473d427f7bcf39f
Submitter: Zuul
Branch: master

commit 0b427ca4a772d7edad8df8956473d427f7bcf39f
Author: Don Penney <email address hidden>
Date: Thu May 30 22:24:28 2019 -0400

    Restart kubelet after hugepage allocation

    On AIO systems, kubelet is launched prior to the application
    of the worker puppet manifest. However, in some cases, the
    hugepage allocation is occurring as part of this manifest
    application, and kubelet does not have these hugepages in
    its availability.

    To address this, the kubernetes manifest is updated to define
    a refresh dependency against the hugepage allocation on AIO,
    in order to restart kubelet if needed.

    Change-Id: I3289b83fb5bf93553b786694a153be84a97ebe8d
    Closes-Bug: 1830927
    Signed-off-by: Don Penney <email address hidden>
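In puppet terms, the refresh dependency the commit describes can be sketched roughly as follows (resource names here are illustrative, not the actual StarlingX manifest):

```puppet
# Illustrative sketch only; resource names are hypothetical.
exec { 'allocate-1g-hugepages':
  # Runtime allocation of two 1G pages on NUMA node 0 via sysfs
  command => '/bin/sh -c "echo 2 > /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages"',
}

# Restart kubelet whenever the allocation runs, so the new pages show up
# in the node's capacity/allocatable.
service { 'kubelet':
  ensure    => running,
  subscribe => Exec['allocate-1g-hugepages'],
}
```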

Changed in starlingx:
status: In Progress → Fix Released