[1.19] services on baremetal sometimes fail to start with hacluster on focal

Bug #1896639 reported by Alexander Balderson
Affects                         Status  Importance  Assigned to  Milestone
Kubernetes Control Plane Charm  New     Undecided   Unassigned
OpenStack HA Cluster Charm      New     Undecided   Unassigned

Bug Description

In our current baremetal deploys, the kubernetes-master leader units regularly fail to have their services start; sometimes the services come up and then go down later.

Some notes about our baremetal deployments:
we deploy with 3 k8s master units using hacluster
we limit traffic for the kube-api-endpoint, kube-control, and loadbalancer endpoints to their own network space, which we call the internal-space (see the sketch after this list)
we have all of the network spaces in use set in juju-no-proxy
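
For illustration, binding those endpoints to a dedicated space and extending juju-no-proxy would look roughly like this (a sketch only; the application and endpoint names are the charm's, but the unit count placement, CIDR, and exact juju-no-proxy value here are placeholders):

$ juju deploy kubernetes-master -n 3 \
      --bind "kube-api-endpoint=internal-space kube-control=internal-space loadbalancer=internal-space"
$ juju model-config juju-no-proxy=192.168.33.0/24    # placeholder covering the spaces in use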

The only consistent thing in the logs is that the k8s master regularly fails to connect to itself over the internal space, usually with a connection refused error.
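
The failure is easy to confirm by hand from an affected unit, e.g. by hitting the apiserver port on the unit's own internal-space address (illustrative session; the address is the one from the logs below, and the output is what a refused connection typically looks like with curl):

$ curl -k https://192.168.33.165:6443/healthz
curl: (7) Failed to connect to 192.168.33.165 port 6443: Connection refused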

A run where kube-controller-manager failed:

kube-controller-manager.daemon[133660]: unable to load configmap based request-header-client-ca-file: Get "https://192.168.33.165:6443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication": dial tcp 192.168.33.165:6443: connect: connection refused

A run where kube-apiserver failed:

2020-09-22 11:43:59 INFO juju-log Executing ['kubectl', '--kubeconfig=/root/.kube/config', 'get', 'secrets', '-n', 'kube-system', '--field-selector', 'type=juju.is/token-auth', '-o', 'json']
2020-09-22 11:43:59 DEBUG update-status The connection to the server 192.168.33.31:6443 was refused - did you specify the right host or port?

Revision history for this message
Alexander Balderson (asbalderson) wrote :

Runs affected by this bug can be found at:
https://solutions.qa.canonical.com/bugs/bugs/bug/1896639

Revision history for this message
George Kraft (cynerva) wrote :

I'm able to reproduce this. The issue only occurs when using hacluster, and only on focal.

The hacluster charm disables systemd's normal management of the Kubernetes services and configures Pacemaker to monitor and restart the services instead. Normally this is fine, but in the failure case Pacemaker sometimes does not detect that a service has stopped:

$ systemctl status snap.kube-scheduler.daemon
● snap.kube-scheduler.daemon.service - Service for snap application kube-scheduler.daemon
     Loaded: loaded (/etc/systemd/system/snap.kube-scheduler.daemon.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/snap.kube-scheduler.daemon.service.d
             └─always-restart.conf
     Active: inactive (dead) since Wed 2020-09-23 20:46:42 UTC; 8min ago
    Process: 91811 ExecStart=/usr/bin/snap run kube-scheduler.daemon (code=killed, signal=TERM)
   Main PID: 91811 (code=killed, signal=TERM)

$ sudo crm resource status
 res_kubernetes-master_e5dc7f9_vip (ocf::heartbeat:IPaddr2): Started
 Clone Set: cl_res_kube_apiserver_snap.kube_apiserver.daemon [res_kube_apiserver_snap.kube_apiserver.daemon]
     Started: [ witty-pup ]
     Stopped: [ node1 ]
 Clone Set: cl_res_kube_controller_manager_snap.kube_controller_manager.daemon [res_kube_controller_manager_snap.kube_controller_manager.daemon]
     Started: [ witty-pup ]
     Stopped: [ node1 ]
 Clone Set: cl_res_kube_proxy_snap.kube_proxy.daemon [res_kube_proxy_snap.kube_proxy.daemon]
     Started: [ witty-pup ]
     Stopped: [ node1 ]
 Clone Set: cl_res_kube_scheduler_snap.kube_scheduler.daemon [res_kube_scheduler_snap.kube_scheduler.daemon]
     Started: [ witty-pup ]
     Stopped: [ node1 ]

Side note: I don't know what node1 is. It doesn't appear on bionic and it's not any machine in this cluster.
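
For what it's worth, the node list Pacemaker knows about can be dumped directly, which might help track down where node1 comes from (illustrative commands; output not captured here):

$ sudo crm_node -l
$ sudo crm configure show | grep -A1 '^node'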

I think there's a strong chance that this bug lies in the hacluster charm, although it's possible that it lies in the resource definitions that originate from kubernetes-master. Needs further investigation.
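
For reference, the clone resources above wrap the snap services as systemd-class Pacemaker primitives; a definition along these lines (a hand-written approximation with made-up timeouts, not the charm's actual output) is what is in play:

$ sudo crm configure show res_kube_scheduler_snap.kube_scheduler.daemon
primitive res_kube_scheduler_snap.kube_scheduler.daemon systemd:snap.kube-scheduler.daemon \
        op monitor interval=30s timeout=60s \
        op start interval=0 timeout=60s \
        op stop interval=0 timeout=60s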

summary: - [1.19] services on baremetal sometimes fail start
+ [1.19] services on baremetal sometimes fail start with hacluster on focal
Revision history for this message
George Kraft (cynerva) wrote :

This is caused by a bug in the pacemaker package on focal[1].

Start and stop commands are timing out instantly when they are not supposed to:

Sep 24 19:33:00 witty-pup pacemaker-execd[1707462]: debug: Processed lrmd_rsc_exec operation from a9a3f4af-4edc-4a26-9120-e4fc741ccd46: rc=96, reply=1, notify=0
Sep 24 19:33:00 witty-pup pacemaker-execd[1707462]: info: executing - rsc:res_kube_scheduler_snap.kube_scheduler.daemon action:stop call_id:96
Sep 24 19:33:00 witty-pup pacemaker-execd[1707462]: debug: Performing asynchronous stop op on systemd unit snap.kube-scheduler.daemon named 'res_kube_scheduler_snap.kube_scheduler.daemon'
Sep 24 19:33:00 witty-pup pacemaker-execd[1707462]: debug: Calling StopUnit for res_kube_scheduler_snap.kube_scheduler.daemon: /org/freedesktop/systemd1/unit/snap_2ekube_2dscheduler_2edaemon_2eservice
Sep 24 19:33:00 witty-pup pacemaker-execd[1707462]: info: Call to stop passed: /org/freedesktop/systemd1/job/204414
Sep 24 19:33:00 witty-pup pacemaker-execd[1707462]: notice: Giving up on res_kube_scheduler_snap.kube_scheduler.daemon stop (rc=0): timeout (elapsed=381357ms, remaining=-361357ms)
Sep 24 19:33:00 witty-pup pacemaker-execd[1707462]: debug: finished - rsc:res_kube_scheduler_snap.kube_scheduler.daemon action:monitor call_id:96 exit-code:198 exec-time:381382ms queue-time:226ms
Sep 24 19:33:00 witty-pup pacemaker-controld[1707465]: error: Result of stop operation for res_kube_scheduler_snap.kube_scheduler.daemon on witty-pup: Timed Out

Note that the elapsed and exec-time values are nonsensical: the timestamps show this command ran from start to finish in under a second, yet it is reported as having taken over 381 seconds. (The elapsed and remaining figures together imply the configured operation timeout was 381357ms − 361357ms = 20000ms, i.e. 20 seconds.)

From an upstream issue[2]:

> The incorrect date is a result of bugs that occur in systemd resources when Pacemaker 2.0.3 is built with the -UPCMK_TIME_EMERGENCY_CGT C flag ... The underlying bugs are fixed as of the Pacemaker 2.0.4 release.

The pacemaker package on focal does indeed have -UPCMK_TIME_EMERGENCY_CGT set.
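
A quick way to confirm that an affected node is on the 2.0.3-based focal build (the flag is a compile-time option, so this only confirms the version, not the flag itself; 2.0.4 and later carry the upstream fix):

$ dpkg-query -W -f='${Version}\n' pacemaker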

[1]: https://bugs.launchpad.net/ubuntu/+source/pacemaker/+bug/1881762
[2]: https://bugs.clusterlabs.org/show_bug.cgi?id=5429

no longer affects: charm-kubernetes-master