Comment 4 for bug 1884469

Yang Liu (yliu12) wrote :

This issue is seen again on DC-4 with load.
The platform-integ-apps and cert-manager apps went to apply-failed because controller-1 was still tainted after a fresh install.

Both apps were applied successfully after removing the taint manually.

[sysadmin@controller-0 ~(keystone_admin)]$ kubectl get pods --all-namespaces -o wide | grep rbd
kube-system   rbd-provisioner-77bfb6dbb-5l9j9   1/1   Running   1   14m   dead:beef::8e22:765f:6121:eb5d   controller-0   <none>   <none>
kube-system   rbd-provisioner-77bfb6dbb-rjszw   0/1   Pending   0   14m   <none>                           <none>         <none>   <none>

[sysadmin@controller-0 ~(keystone_admin)]$ kubectl get pods --all-namespaces -o wide | grep cert
cert-manager   cm-cert-manager-856678cfb7-vfndk             1/1   Running   1   20h   dead:beef::8e22:765f:6121:eb49   controller-0   <none>   <none>
cert-manager   cm-cert-manager-856678cfb7-xdfnp             0/1   Pending   0   16h   <none>                           <none>         <none>   <none>
cert-manager   cm-cert-manager-cainjector-85849bd97-n7dfc   0/1   Pending   0   16h   <none>                           <none>         <none>   <none>
cert-manager   cm-cert-manager-cainjector-85849bd97-v64lr   1/1   Running   2   20h   dead:beef::8e22:765f:6121:eb48   controller-0   <none>   <none>
cert-manager   cm-cert-manager-webhook-5745478cbc-nfts9     1/1   Running   1   20h   dead:beef::8e22:765f:6121:eb47   controller-0   <none>   <none>
cert-manager   cm-cert-manager-webhook-5745478cbc-zr52l     0/1   Pending   0   16h   <none>                           <none>         <none>   <none>

Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  18s (x16 over 15m)  default-scheduler  0/2 nodes are available: 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't satisfy existing pods anti-affinity rules, 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.

[sysadmin@controller-0 ~(keystone_admin)]$ kubectl describe nodes controller-1 | grep -i taint
Taints: node-role.kubernetes.io/master:NoSchedule
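
For reference, the manual workaround mentioned above can be sketched roughly as follows. This is only an illustration, not necessarily the exact commands that were run on DC-4. The helper name taint_removal_cmd is hypothetical, and it only prints the command instead of running it, so it is safe to try anywhere. It relies on the standard kubectl syntax where a trailing "-" after key:effect removes that taint.

```shell
#!/bin/sh
# Hypothetical helper (not part of StarlingX): build the kubectl command
# that removes a taint from a node. A trailing "-" after key:effect tells
# kubectl to remove the taint instead of adding it.
taint_removal_cmd() {
    node="$1"    # node name, e.g. controller-1
    taint="$2"   # key:effect, e.g. node-role.kubernetes.io/master:NoSchedule
    printf 'kubectl taint nodes %s %s-\n' "$node" "$taint"
}

# Print (do not execute) the command for the node and taint seen in this report.
taint_removal_cmd controller-1 node-role.kubernetes.io/master:NoSchedule
```

After the taint is gone, the failed apps would presumably be re-applied with "system application-apply platform-integ-apps" and "system application-apply cert-manager", assuming the usual StarlingX application workflow.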

New logs uploaded to:
https://files.starlingx.kube.cengn.ca/launchpad/1884469