test_930_scaleback fails with corosync node offline

Bug #1951649 reported by Corey Bryant
This bug affects 1 person
Affects: OpenStack HA Cluster Charm
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

This was seen here:
https://review.opendev.org/c/openstack/charm-hacluster/+/817786

This is an intermittent failure.

2021-11-18 05:53:11.679697 | focal-medium | 2021-11-18 05:53:11 [INFO] test_930_scaleback (zaza.openstack.charm_tests.hacluster.tests.HaclusterScaleBackAndForthTest)
2021-11-18 05:53:11.679900 | focal-medium | 2021-11-18 05:53:11 [INFO] Remove one unit, recalculate quorum and re-add one unit.
2021-11-18 05:53:11.679952 | focal-medium | 2021-11-18 05:53:11 [INFO] ...
2021-11-18 05:53:20.162704 | focal-medium | 2021-11-18 05:53:20 [INFO] Pausing unit hacluster/1
2021-11-18 05:53:24.448960 | focal-medium | 2021-11-18 05:53:24 [INFO] Removing keystone/1
2021-11-18 06:14:30.729852 | focal-medium | 2021-11-18 06:14:30 [INFO] Waiting for model to settle
2021-11-18 06:14:35.000103 | focal-medium | 2021-11-18 06:14:34 [INFO] Checking that corosync considers at least one node to be offline
2021-11-18 06:14:37.711484 | focal-medium | 2021-11-18 06:14:37 [INFO] Updating corosync ring
2021-11-18 06:14:40.901333 | focal-medium | 2021-11-18 06:14:40 [INFO] Checking that corosync considers all nodes to be online
2021-11-18 06:14:41.541436 | focal-medium | 2021-11-18 06:14:41 [INFO] Re-adding an hacluster unit
2021-11-18 06:14:43.767125 | focal-medium | 2021-11-18 06:14:43 [INFO] Waiting for model to settle
2021-11-18 06:19:54.798415 | focal-medium | 2021-11-18 06:19:54 [INFO] Updating corosync ring - workaround for lp:1874719
2021-11-18 06:19:56.518663 | focal-medium | 2021-11-18 06:19:56 [INFO] Checking that corosync considers all nodes to be online
2021-11-18 06:19:57.340589 | focal-medium | 2021-11-18 06:19:57 [INFO] FAIL
2021-11-18 06:19:57.340782 | focal-medium | 2021-11-18 06:19:57 [INFO] ======================================================================
2021-11-18 06:19:57.340936 | focal-medium | 2021-11-18 06:19:57 [INFO] FAIL: test_930_scaleback (zaza.openstack.charm_tests.hacluster.tests.HaclusterScaleBackAndForthTest)
2021-11-18 06:19:57.341009 | focal-medium | 2021-11-18 06:19:57 [INFO] Remove one unit, recalculate quorum and re-add one unit.
2021-11-18 06:19:57.341119 | focal-medium | 2021-11-18 06:19:57 [INFO] ----------------------------------------------------------------------
2021-11-18 06:19:57.341183 | focal-medium | 2021-11-18 06:19:57 [INFO] Traceback (most recent call last):
2021-11-18 06:19:57.341239 | focal-medium | 2021-11-18 06:19:57 [INFO] File "/home/ubuntu/src/review.opendev.org/openstack/charm-hacluster/.tox/func-target/lib/python3.8/site-packages/zaza/openstack/charm_tests/hacluster/tests.py", line 172, in test_930_scaleback
2021-11-18 06:19:57.341279 | focal-medium | 2021-11-18 06:19:57 [INFO] self.__assert_all_corosync_nodes_are_online(surviving_hacluster_unit)
2021-11-18 06:19:57.341321 | focal-medium | 2021-11-18 06:19:57 [INFO] File "/home/ubuntu/src/review.opendev.org/openstack/charm-hacluster/.tox/func-target/lib/python3.8/site-packages/zaza/openstack/charm_tests/hacluster/tests.py", line 184, in __assert_all_corosync_nodes_are_online
2021-11-18 06:19:57.341349 | focal-medium | 2021-11-18 06:19:57 [INFO] self.assertNotIn('OFFLINE', output,
2021-11-18 06:19:57.341407 | focal-medium | 2021-11-18 06:19:57 [INFO] AssertionError: 'OFFLINE' unexpectedly found in 'Stack: corosync\nCurrent DC: juju-695414-zaza-49ee924eee85-1 (version 1.1.18-2b07d5c5a9) - partition with quorum\nLast updated: Thu Nov 18 06:19:57 2021\nLast change: Thu Nov 18 06:19:48 2021 by hacluster via crmd on juju-695414-zaza-49ee924eee85-1\n\n3 nodes configured\n4 resources configured\n\nOnline: [ juju-695414-zaza-49ee924eee85-1 juju-695414-zaza-49ee924eee85-3 ]\nOFFLINE: [ juju-695414-zaza-49ee924eee85-4 ]\n\nFull list of resources:\n\n Resource Group: grp_ks_vips\n res_ks_8d586a0_vip\t(ocf::heartbeat:IPaddr2):\tStopped\n Clone Set: cl_ks_haproxy [res_ks_haproxy]\n Stopped: [ juju-695414-zaza-49ee924eee85-1 juju-695414-zaza-49ee924eee85-3 juju-695414-zaza-49ee924eee85-4 ]' : corosync shouldn't list any offline node
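
For triage, the check that failed above can be sketched as follows. In the log, the status output reports "Last change: Thu Nov 18 06:19:48 2021" and the assertion ran at 06:19:57, so one possible reading (an assumption, not a confirmed root cause) is that the check ran before the re-added node had rejoined the cluster. The helper names `offline_nodes` and `wait_until_all_online`, the `get_status` callable, and the timeout values are hypothetical and are not part of zaza; the sketch only parses the "OFFLINE: [ ... ]" line format visible in the assertion message and polls until no such line remains.

```python
import re
import time
from typing import Callable, List


def offline_nodes(status_output: str) -> List[str]:
    """Return node names found on 'OFFLINE: [ ... ]' lines of the status text."""
    nodes: List[str] = []
    # Matches lines such as: OFFLINE: [ juju-695414-zaza-49ee924eee85-4 ]
    for match in re.finditer(r"^OFFLINE:\s*\[(.*?)\]", status_output, re.MULTILINE):
        nodes.extend(match.group(1).split())
    return nodes


def wait_until_all_online(
    get_status: Callable[[], str],
    timeout: float = 300.0,   # hypothetical value, not taken from the test
    interval: float = 5.0,    # hypothetical value, not taken from the test
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll get_status() until no node is reported offline, or fail on timeout.

    get_status is a hypothetical callable that returns the cluster status
    text, for example by running the same status command the test uses on
    the surviving hacluster unit.
    """
    deadline = time.monotonic() + timeout
    while True:
        offline = offline_nodes(get_status())
        if not offline:
            return
        if time.monotonic() >= deadline:
            raise AssertionError(
                "nodes still offline after {}s: {}".format(timeout, offline)
            )
        sleep(interval)


if __name__ == "__main__":
    # Two lines copied from the assertion message in this report.
    sample = (
        "Online: [ juju-695414-zaza-49ee924eee85-1 juju-695414-zaza-49ee924eee85-3 ]\n"
        "OFFLINE: [ juju-695414-zaza-49ee924eee85-4 ]\n"
    )
    print(offline_nodes(sample))
```

Polling would only hide the failure if the node eventually comes online by itself; if it stays offline, the timeout still raises and the listed node names point at the unit to inspect.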
