Kubernetes hugepage feature: 1G pages not updated based on system memory configuration
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | High | Don Penney |
Bug Description
Brief Description
-----------------
On a system with no openstack labels and vswitch_type set to "none", the hugepages that Kubernetes reports as available may not be updated after the number of application hugepages is increased.
I have seen this occur for 1G pages; I'm not sure if the same applies to 2M pages.
I have seen this occur on an AIO-DX system. I did not see it on a Standard system, but in that case I may have had ovs_dpdk configured as the vswitch.
Severity
--------
Major: System/Feature is usable but degraded
Steps to Reproduce
------------------
- After initial install, ensure vswitch_type is set to "none"
system modify -v "none"
- Make sure no openstack labels are present on the system, so that the kubernetes hugepage feature is enabled.
- Configure some 1G (app) hugepages on controller-1
system host-memory-modify controller-1 -1G 2 0
system host-memory-modify controller-1 -1G 2 1
- Unlock the host
system host-unlock controller-1
- After the node comes up, look at the Kubernetes hugepage allocation
kubectl get node controller-1 -o json | grep huge
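The grep above matches any line containing "huge". As a rough sketch (not part of the original report), the same check can be narrowed to the `status.capacity` and `status.allocatable` sections of the node object, where the kubelet publishes `hugepages-1Gi` and `hugepages-2Mi`. The function name `hugepage_resources` is made up for illustration.

```python
import json


def hugepage_resources(node_json: str) -> dict:
    """Return only the hugepages-* entries from a node's capacity and allocatable.

    node_json is expected to be the text printed by
    `kubectl get node <name> -o json`.
    """
    status = json.loads(node_json).get("status", {})
    result = {}
    for section in ("capacity", "allocatable"):
        resources = status.get(section, {})
        # Keep resource names such as "hugepages-1Gi" and "hugepages-2Mi".
        result[section] = {
            name: value
            for name, value in resources.items()
            if name.startswith("hugepages-")
        }
    return result
```

Passing in the saved output of `kubectl get node controller-1 -o json` returns two small dictionaries that can be compared directly against the number of 1G pages configured on the host.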
Expected Behavior
------------------
The hugepages that kubernetes reports as available should be updated to reflect what's configured on the system.
Actual Behavior
----------------
In some cases the hugepages are never updated.
Reproducibility
---------------
There may be a race between when the hugepages are configured via puppet/sysfs and when the kubelet starts and audits hugepages.
See a similar issue here: https:/
This never occurs when vswitch_type is anything other than "none", because in that case the 1G hugepages (both app and vswitch) are configured via grub and are therefore present from boot time. Note that updating the grub params causes an extra reboot after unlock, which may be why app pages were originally configured via sysfs.
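One way to check the suspected race is to compare what sysfs holds with what the kubelet reported. The sketch below is an illustration only: it reads the per-NUMA-node `nr_hugepages` files under `/sys/devices/system/node` and compares their sum with the node's `hugepages-1Gi` capacity. The helper names are hypothetical, and the quantity parser assumes the capacity is reported either as `0` or as a whole number of `Gi`.

```python
from pathlib import Path


def sysfs_hugepages(page_kb: int = 1048576, sysfs_root: str = "/sys") -> dict:
    """Map NUMA node name (e.g. "node0") to its nr_hugepages for one page size."""
    counts = {}
    node_dir = Path(sysfs_root) / "devices" / "system" / "node"
    pattern = f"node[0-9]*/hugepages/hugepages-{page_kb}kB/nr_hugepages"
    for path in sorted(node_dir.glob(pattern)):
        numa_node = path.parts[-4]  # .../nodeN/hugepages/hugepages-XkB/nr_hugepages
        counts[numa_node] = int(path.read_text().strip())
    return counts


def kubelet_capacity_gib(quantity: str) -> int:
    """Parse a hugepages-1Gi quantity; only "0" and "<n>Gi" are handled (assumption)."""
    if quantity.endswith("Gi"):
        return int(quantity[:-2])
    if quantity == "0":
        return 0
    raise ValueError(f"unhandled quantity format: {quantity!r}")


def kubelet_is_stale(sysfs_counts: dict, kubelet_quantity: str) -> bool:
    """True when the 1G pages present in sysfs differ from what the kubelet reported."""
    return sum(sysfs_counts.values()) != kubelet_capacity_gib(kubelet_quantity)
```

If the pages were written to sysfs only after the kubelet had started, `kubelet_is_stale` would be expected to return True until the kubelet is restarted.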
Workaround
-------
In compute.pp:
- if $::is_gb_
+ if $::is_gb_
system host-memory-modify controller-1 -f vswitch -1G 1 0
system host-memory-modify controller-1 -f vswitch -1G 1 1
Suggestion
-------
It's probably necessary to always configure 1G pages via grub if 1G app pages are requested. This seems to be the recommended way to allocate 1G pages in general anyway, rather than sysfs. However, we'd want to ensure we only update the grub parameters if memory (app/vswitch) has actually changed, to prevent the extra reboot after unlock.
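The "only update grub if memory has actually changed" condition could look roughly like the sketch below. This is not the actual fix: the function names are hypothetical, and it assumes the kernel command line carries the 1G request as `hugepagesz=1G` followed by a plain integer `hugepages=<n>` (the standard kernel parameters), with `<n>` being the total across NUMA nodes.

```python
def hugepages_1g_on_cmdline(cmdline: str) -> int:
    """Return the hugepages= count that follows hugepagesz=1G, or 0 if absent."""
    size = None
    for token in cmdline.split():
        if token.startswith("hugepagesz="):
            # Remember the page size that the next hugepages= token applies to.
            size = token.split("=", 1)[1]
        elif token.startswith("hugepages=") and size == "1G":
            return int(token.split("=", 1)[1])
    return 0


def grub_update_needed(cmdline: str, requested_total: int) -> bool:
    """True only when the requested 1G total differs from what was booted with."""
    return hugepages_1g_on_cmdline(cmdline) != requested_total
```

In practice `cmdline` would be the contents of `/proc/cmdline`, and `requested_total` the sum of the app and vswitch 1G pages requested for the host. Skipping the grub rewrite when the two match avoids the extra reboot described above.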
System Configuration
-------
AIO-DX
Branch/Pull Time/Commit
-------
master
Test Activity
-------------
Developer Testing
Restarting the kubelet will allow k8s to discover the hugepages.
This seems to be an init order issue.