Kubernetes node events warns about FailedNodeAllocatableEnforcement

Bug #1763405 reported by Bharat Kunwar
This bug affects 1 person
Affects: Magnum
Status: In Progress
Importance: Undecided
Assigned to: Bharat Kunwar

Bug Description

Kubernetes node events warns about FailedNodeAllocatableEnforcement

Steps to reproduce:

- Deploy Kubernetes from the latest stable/queens branch. The resulting cluster reports the following Kubernetes version:

Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"archive", BuildDate:"2018-02-13T11:42:06Z", GoVersion:"go1.10rc2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"archive", BuildDate:"2018-02-13T11:42:06Z", GoVersion:"go1.10rc2", Compiler:"gc", Platform:"linux/amd64"}

- Run `kubectl describe nodes` on the master node.

Expected result:

No warning

Actual result:

Warning FailedNodeAllocatableEnforcement 25m (x876 over 15h) kubelet, k8s-fa27-mqf5nkvarpmq-minion-0 Failed to update Node Allocatable Limits "": failed to set supported cgroup subsystems for cgroup : Failed to set config for supported subsystems : failed to write 135089913856 to memory.limit_in_bytes: write /rootfs/var/lib/containers/atomic/kubelet.0/rootfs/sys/fs/cgroup/memory/memory.limit_in_bytes: invalid argument
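
To check several nodes for this warning, the output of `kubectl describe nodes` can be filtered for the event reason. The following is a minimal sketch, assuming the output has been captured as text and that the event lines look like the one above; the function name `find_enforcement_warnings` is hypothetical and not part of any Kubernetes tooling.

```python
import re

# Event reason and message fragment taken from the warning shown above.
WARNING_REASON = "FailedNodeAllocatableEnforcement"
LIMIT_PATTERN = re.compile(r"failed to write (\d+) to memory\.limit_in_bytes")


def find_enforcement_warnings(describe_output):
    """Return (line, limit_bytes) pairs for matching event lines.

    limit_bytes is the value the kubelet tried to write, or None if the
    line does not contain one.
    """
    results = []
    for line in describe_output.splitlines():
        if WARNING_REASON not in line:
            continue
        found = LIMIT_PATTERN.search(line)
        limit = int(found.group(1)) if found else None
        results.append((line.strip(), limit))
    return results
```

An empty result for a node suggests the warning is no longer being emitted there.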

Discussion:

This is essentially a Kubernetes bug[1]. The warning does not appear to change the overall behaviour of the cluster. According to the docs[2], `cgroups-per-qos` is supposed to be enabled by default; however, that default does not appear to have been propagated through in the latest version of Kubernetes.

As a workaround, I have attached a patch for Magnum. If your cluster is already deployed, log in to each worker node and add or change `--cgroups-per-qos=true --enforce-node-allocatable=pods` on the `KUBELET_ARGS` line in `/etc/kubernetes/kubelet` so that it looks something like this:

```
KUBELET_ARGS="$(/etc/kubernetes/get_require_kubeconfig.sh) --pod-manifest-path=/etc/kubernetes/manifests --cadvisor-port=0 --kubeconfig /etc/kubernetes/kubelet-config.yaml --hostname-override=k8s-fa27-mqf5nkvarpmq-minion-0 --address=10.0.0.5 --port=10250 --read-only-port=0 --anonymous-auth=false --authorization-mode=Webhook --authentication-token-webhook=true --cluster_dns=10.254.0.10 --cluster_domain=cluster.local --pod-infra-container-image=gcr.io/google_containers/pause:3.0 --client-ca-file=/etc/kubernetes/certs/ca.crt --tls-cert-file=/etc/kubernetes/certs/kubelet.crt --tls-private-key-file=/etc/kubernetes/certs/kubelet.key --cgroup-driver=systemd --cgroups-per-qos=true --enforce-node-allocatable=pods"
```

Then run `sudo systemctl restart kubelet.service`.
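
If many worker nodes need the same change, the edit to the `KUBELET_ARGS` line could be scripted. The following is a rough sketch rather than the attached Magnum patch: the function name `ensure_kubelet_flags` is hypothetical, and it assumes the line has the `KUBELET_ARGS="..."` form shown above with the two flags written as `--flag=value`.

```python
import re

# Flags named in the workaround above.
WORKAROUND_FLAGS = {
    "--cgroups-per-qos": "true",
    "--enforce-node-allocatable": "pods",
}


def ensure_kubelet_flags(line, flags=WORKAROUND_FLAGS):
    """Return a KUBELET_ARGS line with each flag set to the given value.

    An existing `--flag=value` token is rewritten in place; a missing flag
    is appended just before the closing double quote. Assumption: flags
    use the `--flag=value` form, not `--flag value`.
    """
    match = re.fullmatch(r'(KUBELET_ARGS=")(.*)(")\s*', line)
    if match is None:
        raise ValueError('expected a line of the form KUBELET_ARGS="..."')
    prefix, args, suffix = match.groups()
    tokens = args.split()
    for name, value in flags.items():
        wanted = f"{name}={value}"
        for i, token in enumerate(tokens):
            if token.startswith(name + "="):
                tokens[i] = wanted  # change the existing value
                break
        else:
            tokens.append(wanted)  # flag was absent, so add it
    return prefix + " ".join(tokens) + suffix
```

The function is idempotent, so running it on a line that already carries both flags returns the line unchanged. The kubelet still has to be restarted after the file is rewritten.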

[1] https://github.com/kubernetes/kubernetes/issues/55867
[2] https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#enabling-qos-and-pod-level-cgroups

Bharat Kunwar (brtkwr)
description: updated
Bharat Kunwar (brtkwr)
summary: - FailedNodeAllocatableEnforcement warning under `kubectl describe nodes`
+ FailedNodeAllocatableEnforcement warning under kubernetes node events
summary: - FailedNodeAllocatableEnforcement warning under kubernetes node events
+ kubernetes node events warns about FailedNodeAllocatableEnforcement
description: updated
summary: - kubernetes node events warns about FailedNodeAllocatableEnforcement
+ Kubernetes node events warns about FailedNodeAllocatableEnforcement
OpenStack Infra (hudson-openstack) wrote : Fix proposed to magnum (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/560952

Bharat Kunwar (brtkwr)
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix proposed to magnum (master)

Fix proposed to branch: master
Review: https://review.openstack.org/561605

Changed in magnum:
assignee: nobody → Bharat Kunwar (brtknr)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Change abandoned on magnum (master)

Change abandoned by Bharat Kunwar (<email address hidden>) on branch: master
Review: https://review.openstack.org/561605

OpenStack Infra (hudson-openstack) wrote : Change abandoned on magnum (stable/queens)

Change abandoned by Bharat Kunwar (<email address hidden>) on branch: stable/queens
Review: https://review.openstack.org/560952

Spyros Trigazis (strigazi) wrote :

Using the cgroupfs driver instead of the systemd driver solved this issue. Would you like to add a label to be able to select the cgroup driver?

Bharat Kunwar (brtkwr) wrote :

Isn't Docker running inside Fedora Atomic containers currently configured to use the systemd driver by default?
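
One way to answer this on a given node is to compare the cgroup driver Docker reports with the kubelet's `--cgroup-driver` flag. The following is a small sketch under two assumptions: that `docker info` prints a `Cgroup Driver:` line, and that the kubelet flag is written as `--cgroup-driver=<name>` as in the `KUBELET_ARGS` line above. Both function names are hypothetical.

```python
import re


def docker_cgroup_driver(docker_info_output):
    """Extract the driver from a 'Cgroup Driver: <name>' line, or None."""
    found = re.search(
        r"^\s*Cgroup Driver:\s*(\S+)", docker_info_output, re.MULTILINE
    )
    return found.group(1) if found else None


def kubelet_cgroup_driver(kubelet_args):
    """Extract the value of --cgroup-driver=<name> from kubelet args, or None."""
    found = re.search(r'--cgroup-driver=([^\s"]+)', kubelet_args)
    return found.group(1) if found else None
```

Feeding the first function the text of `docker info` and the second the contents of `/etc/kubernetes/kubelet` shows whether the two sides agree on systemd or cgroupfs.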
