Comment 3 for bug 1905058

Kevin W Monroe (kwmonroe) wrote:

From a chat with Chris: he's seeing workers that do not get updated kubeconfigs after the switch to k8s secrets. One thing that's different in this bug vs my repro attempts is that the original lead unit was k8s-master/2; for me, it's always k8s-master/0.

The reason this may be significant is that the relation data view [1] we parse may overwrite data with key/value pairs from the lowest unit name. Perhaps k8s-master/1 became the new leader in step 5 and got new credentials in step 10, but those were being clobbered by old creds from k8s-master/0 (because k-m/0 is a lower unit name than k-m/1).
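
To make the suspected clobbering concrete, here's a toy sketch (illustrative only, not the actual endpoints.py code from [1]) of how a combined relation-data view that lets the lowest unit name win per key would hide the new leader's creds:

# Toy illustration of the suspected clobbering, not the real charms.reactive code.
def combined_view(units):
    """Merge per-unit relation data; the first (lowest-named) unit wins per key."""
    merged = {}
    for unit in sorted(units, key=lambda u: u["name"]):
        for key, value in unit["data"].items():
            merged.setdefault(key, value)
    return merged

units = [
    # k8s-master/0: stale, pre-secrets credentials
    {"name": "kubernetes-master/0", "data": {"creds": "old-token"}},
    # k8s-master/1: new leader with refreshed credentials
    {"name": "kubernetes-master/1", "data": {"creds": "new-token"}},
]

print(combined_view(units)["creds"])  # -> "old-token"; the leader's new value is lost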

Then it's possible that the k8s-worker doesn't see a creds change [2], so it never sets the restart flag and therefore never generates the new kubeconfig.
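 
The worker-side check is roughly of this shape (a minimal sketch with illustrative names, not the exact code at [2]):

# If the combined view keeps returning the old creds, data_changed() is False,
# so the restart flag is never set and a fresh kubeconfig is never written.
from charms.reactive import set_flag
from charms.reactive.helpers import data_changed

def maybe_request_restart(creds):
    if data_changed('kube-control.creds', creds):
        set_flag('kubernetes-worker.restart-needed')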

The next step to test this theory is to deploy with a bunch of masters to increase the chance that neither the original leader nor the subsequent leader is /0. If that pans out, we'll need to adjust how k8s-worker detects changed creds.

Until then, a workaround that worked for Chris is to manually change the token in /root/cdk/kubeconfig (admin token) and /home/ubuntu/.kube/config (kubelet token) on the broken workers. You can see a list of tokens that Charmed K8s expects with the following:

juju run -u kubernetes-master/leader 'for i in `kubectl --kubeconfig /root/.kube/config get -n kube-system secrets --field-selector type=juju.is/token-auth | grep -o .*-token-auth`; do echo user: $i; kubectl --kubeconfig /root/.kube/config get -n kube-system secrets/$i --template=dG9rZW46IA=={{.data.password}} | base64 -d; echo; echo; done'
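
If editing the files by hand is tedious, here's a minimal sketch of a helper (hypothetical, not shipped with the charm) that swaps the token in a kubeconfig, assuming the standard users[].user.token layout:

#!/usr/bin/env python3
# Hypothetical helper for the workaround above: replace the bearer token in a
# kubeconfig on a broken worker, e.g.
#   python3 swap_token.py /root/cdk/kubeconfig <token printed by the juju run above>
import sys

import yaml  # PyYAML

def swap_token(kubeconfig_path, new_token):
    with open(kubeconfig_path) as f:
        config = yaml.safe_load(f)
    # Standard kubeconfigs keep the bearer token under users[].user.token.
    for user in config.get("users", []):
        user["user"]["token"] = new_token
    with open(kubeconfig_path, "w") as f:
        yaml.safe_dump(config, f, default_flow_style=False)

if __name__ == "__main__":
    swap_token(sys.argv[1], sys.argv[2])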

1: https://github.com/juju-solutions/charms.reactive/blob/master/charms/reactive/endpoints.py#L710-L713
2: https://github.com/charmed-kubernetes/charm-kubernetes-worker/blob/master/reactive/kubernetes_worker.py#L1230