Comment 1 for bug 2006957

George Kraft (cynerva) wrote:

I'm still looking into this. It's a bizarre one. There are 23 ovn-central pods and 29 kube-ovn-monitor pods. There should only be 3 of each. The extra pods are failing with:

Status: Failed
Reason: NodeAffinity
Message: Pod was rejected: Predicate NodeAffinity failed
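
(For reference, the rejected extras can be enumerated with a field selector on the pod phase; treat this as a sketch, since the namespace and labels depend on how kube-ovn is deployed here:)

$ kubectl get pods --all-namespaces --field-selector=status.phase=Failed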

These pods have anti-affinity policies that prevent more than 1 of each from being placed on any given node, so the failures themselves make sense. What does not make sense is why there are so many pods in the first place; the charm correctly requested 3 replicas.
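
To double-check the policy, the anti-affinity can be read straight off one of the pods. The namespace and label selector below are assumptions about how kube-ovn names things, so adjust as needed:

$ kubectl get pods -n kube-system -l app=ovn-central -o yaml | grep -A 12 podAntiAffinity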

From the kube-controller-manager logs, it looks like the ReplicaSet controller spam-created pods during a 10-second window in which one of the control-plane nodes was NotReady. The extra pods were never deleted.
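
For anyone retracing this: I was reading the controller logs off a control-plane unit. On Charmed Kubernetes that is roughly the following (unit and service names may differ on other setups):

$ juju ssh kubernetes-control-plane/0 -- journalctl -u snap.kube-controller-manager.daemon | grep -i replicaset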

I think I should be able to reproduce this by forcing a kubernetes-control-plane node into a NotReady state during kube-ovn deployment. Let me give that a try.
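
Roughly what I have in mind, assuming Charmed Kubernetes snap service names; stopping kubelet briefly should flip the node to NotReady once the node monitor grace period elapses:

$ juju ssh kubernetes-control-plane/0 -- sudo systemctl stop snap.kubelet.daemon
  # wait for the node to show NotReady while the kube-ovn pods are being created, then restore it
$ juju ssh kubernetes-control-plane/0 -- sudo systemctl start snap.kubelet.daemon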