This is a race condition between build_kubeconfig, start_control_plane, and configure_apiserver.
In build_kubeconfig, a new client kubeconfig was written[1] with the new CA. Later in the same function, the charm tried to fetch kube-scheduler's token from a secret[2], and that fetch failed:
2023-03-04 02:53:50 INFO unit.kubernetes-control-plane/0.juju-log server.go:316 certificates:55: Executing ['kubectl', '--kubeconfig=/root/.kube/config', 'get', 'secrets', '-n', 'kube-system', '--field-selector', 'type=juju.is/token-auth', '-o', 'json']
2023-03-04 02:53:50 WARNING unit.kubernetes-control-plane/0.certificates-relation-changed logger.go:60 E0304 02:53:50.359454 135532 memcache.go:238] couldn't get current server API group list: Get "https://127.0.0.1:6443/api?timeout=32s": x509: certificate signed by unknown authority
2023-03-04 02:53:50 WARNING unit.kubernetes-control-plane/0.certificates-relation-changed logger.go:60 E0304 02:53:50.365873 135532 memcache.go:238] couldn't get current server API group list: Get "https://127.0.0.1:6443/api?timeout=32s": x509: certificate signed by unknown authority
2023-03-04 02:53:50 WARNING unit.kubernetes-control-plane/0.certificates-relation-changed logger.go:60 E0304 02:53:50.369305 135532 memcache.go:238] couldn't get current server API group list: Get "https://127.0.0.1:6443/api?timeout=32s": x509: certificate signed by unknown authority
2023-03-04 02:53:50 WARNING unit.kubernetes-control-plane/0.certificates-relation-changed logger.go:60 Unable to connect to the server: x509: certificate signed by unknown authority
The fetch failed because the client kubeconfig already had the new CA, but kube-apiserver had not been restarted yet, so it was still serving with a server certificate from the old CA. Since build_kubeconfig could not obtain the secret, it skipped writing a new kubeconfig for kube-scheduler.
During start_control_plane, the charm restarted kube-scheduler to pick up the new CA. However, since no new kubeconfig had been written for kube-scheduler, it started with the old kubeconfig instead, still using the old CA.
Later, configure_apiserver ran, which restarted kube-apiserver with the new server certificate. This fixed the charm's ability to get secrets, but the damage had already been done. Kube-scheduler was never restarted again.
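The ordering problem described above can be reproduced with a toy model. This is only a sketch for illustration, not the charm's actual code: the Cluster class, its fields, and the run() helper are hypothetical, and each handler is reduced to the single state change mentioned in this report.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical toy state; not the charm's real data model."""
    apiserver_cert_ca: str = "old"        # CA that signed the cert kube-apiserver serves
    client_kubeconfig_ca: str = "old"     # CA trusted by the charm's client kubeconfig
    scheduler_kubeconfig_ca: str = "old"  # CA in kube-scheduler's kubeconfig on disk
    scheduler_running_ca: str = "old"     # CA kube-scheduler loaded at its last restart
    log: list = field(default_factory=list)


def get_secret(c: Cluster) -> bool:
    # The request only succeeds if the client trusts the CA behind the server cert.
    return c.client_kubeconfig_ca == c.apiserver_cert_ca


def build_kubeconfig(c: Cluster) -> None:
    c.client_kubeconfig_ca = "new"        # client kubeconfig is rewritten first
    if get_secret(c):
        c.scheduler_kubeconfig_ca = "new"
    else:
        # Secret fetch failed, so the scheduler kubeconfig is left untouched.
        c.log.append("x509: certificate signed by unknown authority")


def start_control_plane(c: Cluster) -> None:
    # Restart kube-scheduler: it loads whatever kubeconfig is on disk right now.
    c.scheduler_running_ca = c.scheduler_kubeconfig_ca


def configure_apiserver(c: Cluster) -> None:
    # Restart kube-apiserver with the server certificate from the new CA.
    c.apiserver_cert_ca = "new"


HANDLERS = {
    "build_kubeconfig": build_kubeconfig,
    "start_control_plane": start_control_plane,
    "configure_apiserver": configure_apiserver,
}


def run(order):
    """Apply the handlers in the given order and return the final state."""
    c = Cluster()
    for name in order:
        HANDLERS[name](c)
    return c


if __name__ == "__main__":
    observed = run(["build_kubeconfig", "start_control_plane", "configure_apiserver"])
    print(observed.scheduler_running_ca, observed.log)
```

With the order seen in the logs, the model ends with kube-apiserver on the new CA and kube-scheduler still running with the old one. In this toy model only, running configure_apiserver before build_kubeconfig lets the secret fetch succeed; that says nothing about how the real charm should be fixed.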
[1]: https://github.com/charmed-kubernetes/charm-kubernetes-control-plane/blob/d9f276f1e54c22f3f5d739c82f1a3b5894d140c7/reactive/kubernetes_control_plane.py#L2151-L2157
[2]: https://github.com/charmed-kubernetes/charm-kubernetes-control-plane/blob/d9f276f1e54c22f3f5d739c82f1a3b5894d140c7/reactive/kubernetes_control_plane.py#L2198-L2206