Comment 4 for bug 1922578

George Kraft (cynerva) wrote :

In /var/log/juju/unit-kubernetes-master-2.log we see that the charm tries to restart kube-apiserver:

2021-04-03 09:05:30 INFO juju-log certificates:24: Invoking reactive handler: reactive/kubernetes_master.py:2192:configure_apiserver
2021-04-03 09:05:34 INFO juju-log certificates:24: status-set: maintenance: Restarting snap.kube-apiserver.daemon service

In /var/log/syslog we see the stop attempt:

Apr 3 09:05:34 juju-8dd1b7-4-lxd-1 systemd[1]: Stopping Cluster Controlled snap.kube-apiserver.daemon...
Apr 3 09:05:35 juju-8dd1b7-4-lxd-1 pacemaker-controld[82061]: notice: Initiating stop operation res_kube_apiserver_snap.kube_apiserver.daemon_stop_0 locally on juju-8dd1b7-4-lxd-1

kube-apiserver starts shutting things down, so it has clearly received SIGTERM:

Apr 3 09:05:34 juju-8dd1b7-4-lxd-1 kube-apiserver.daemon[115586]: I0403 09:05:34.434449 115586 dynamic_cafile_content.go:182] Shutting down client-ca-bundle::/root/cdk/ca.crt
Apr 3 09:05:34 juju-8dd1b7-4-lxd-1 kube-apiserver.daemon[115586]: I0403 09:05:34.434560 115586 dynamic_serving_content.go:145] Shutting down aggregator-proxy-cert::/root/cdk/client.crt::/root/cdk/client.key
Apr 3 09:05:34 juju-8dd1b7-4-lxd-1 kube-apiserver.daemon[115586]: I0403 09:05:34.434576 115586 controller.go:123] Shutting down OpenAPI controller

Weirdly, it stops listening on port 6443 but then logs an error trying to talk to itself on that same port:

Apr 3 09:05:34 juju-8dd1b7-4-lxd-1 kube-apiserver.daemon[115586]: I0403 09:05:34.434762 115586 secure_serving.go:241] Stopped listening on [::]:6443
Apr 3 09:05:34 juju-8dd1b7-4-lxd-1 kube-apiserver.daemon[115586]: E0403 09:05:34.457316 115586 controller.go:184] Get "https://[::1]:6443/api/v1/namespaces/default/endpoints/kubernetes": dial tcp [::1]:6443: connect: connection refused

After 20 seconds, pacemaker gives up waiting for the stop operation to finish:

Apr 3 09:05:55 juju-8dd1b7-4-lxd-1 pacemaker-execd[82058]: notice: Giving up on res_kube_apiserver_snap.kube_apiserver.daemon stop (rc=196): timeout (elapsed=19980ms, remaining=20ms)
Apr 3 09:05:55 juju-8dd1b7-4-lxd-1 pacemaker-controld[82061]: error: Result of stop operation for res_kube_apiserver_snap.kube_apiserver.daemon on juju-8dd1b7-4-lxd-1: Timed Out

About 10 seconds later (30 seconds after the stop began), systemd falls back to SIGKILL:

Apr 3 09:06:04 juju-8dd1b7-4-lxd-1 systemd[1]: snap.kube-apiserver.daemon.service: State 'stop-sigterm' timed out. Killing.
Apr 3 09:06:04 juju-8dd1b7-4-lxd-1 systemd[1]: snap.kube-apiserver.daemon.service: Killing process 115586 (kube-apiserver) with signal SIGKILL.
Apr 3 09:06:04 juju-8dd1b7-4-lxd-1 systemd[1]: snap.kube-apiserver.daemon.service: Killing process 115611 (kube-apiserver) with signal SIGKILL.
Apr 3 09:06:04 juju-8dd1b7-4-lxd-1 systemd[1]: snap.kube-apiserver.daemon.service: Killing process 115612 (n/a) with signal SIGKILL.
Apr 3 09:06:04 juju-8dd1b7-4-lxd-1 systemd[1]: snap.kube-apiserver.daemon.service: Killing process 115620 (kube-apiserver) with signal SIGKILL.
Apr 3 09:06:04 juju-8dd1b7-4-lxd-1 systemd[1]: snap.kube-apiserver.daemon.service: Killing process 115625 (n/a) with signal SIGKILL.
Apr 3 09:06:04 juju-8dd1b7-4-lxd-1 systemd[1]: snap.kube-apiserver.daemon.service: Killing process 115627 (n/a) with signal SIGKILL.
Apr 3 09:06:04 juju-8dd1b7-4-lxd-1 systemd[1]: snap.kube-apiserver.daemon.service: Killing process 115632 (n/a) with signal SIGKILL.
Apr 3 09:06:04 juju-8dd1b7-4-lxd-1 systemd[1]: snap.kube-apiserver.daemon.service: Killing process 115746 (n/a) with signal SIGKILL.
Apr 3 09:06:04 juju-8dd1b7-4-lxd-1 systemd[1]: snap.kube-apiserver.daemon.service: Main process exited, code=killed, status=9/KILL
Apr 3 09:06:04 juju-8dd1b7-4-lxd-1 systemd[1]: snap.kube-apiserver.daemon.service: Failed with result 'timeout'.
Apr 3 09:06:04 juju-8dd1b7-4-lxd-1 systemd[1]: Stopped Service for snap application kube-apiserver.daemon.
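
To make the timing easier to follow, here is a minimal Python sketch that computes the offsets between the key events. The timestamps are copied from the syslog lines above. The year is an assumption taken from the juju log (syslog omits it), and the event labels and the helper names parse and elapsed_seconds are only illustrative, not part of the charm or any tool.

```python
from datetime import datetime

# Timestamps copied from the syslog excerpts above.
# Labels are informal descriptions, not actual log text.
EVENTS = [
    ("systemd starts stopping snap.kube-apiserver.daemon", "Apr 3 09:05:34"),
    ("pacemaker initiates its stop operation", "Apr 3 09:05:35"),
    ("pacemaker gives up on the stop operation", "Apr 3 09:05:55"),
    ("systemd sends SIGKILL", "Apr 3 09:06:04"),
]


def parse(stamp, year=2021):
    # syslog timestamps carry no year; 2021 is assumed from the juju log.
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def elapsed_seconds(events):
    # Offset of each event, in whole seconds, from the first event.
    start = parse(events[0][1])
    return [
        (label, int((parse(stamp) - start).total_seconds()))
        for label, stamp in events
    ]


for label, seconds in elapsed_seconds(EVENTS):
    print(f"+{seconds:2d}s  {label}")
```

Read this way, pacemaker's stop operation times out about 20 seconds after it starts, while systemd waits roughly 30 seconds from the stop request before killing the process, so pacemaker reports a failure before systemd has finished stopping the service.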