[systest] Failover tests for RabbitMQ cluster should destroy nodes instead of suspending (pausing) them

Bug #1512735 reported by Artem Panchenko
Affects             Status        Importance  Assigned to   Milestone
Fuel for OpenStack  Fix Released  High        Fuel QA Team
7.0.x               Won't Fix     High        Fuel QA Team

Bug Description

In system tests we check HA for the RabbitMQ cluster by suspending controllers and verifying that the cluster still passes health checks, for example:

https://github.com/openstack/fuel-qa/blob/master/fuelweb_test/tests/tests_strength/test_failover_base.py#L1043

Then we resume the suspended node and run the health checks once again:

https://github.com/openstack/fuel-qa/blob/master/fuelweb_test/tests/tests_strength/test_failover_base.py#L1079

This scenario is not applicable to real hardware environments and can cause additional issues for the RabbitMQ recovery script, so recovery takes longer than usual:

Traceback (most recent call last):
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
    testMethod()
  File "/usr/lib/python2.7/unittest/case.py", line 1043, in runTest
    self._testFunc()
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/proboscis/case.py", line 296, in testng_method_mistake_capture_func
    compatability.capture_type_error(s_func)
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/proboscis/compatability/exceptions_2_6.py", line 27, in capture_type_error
    func()
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/proboscis/case.py", line 350, in func
    func(test_case.state.get_state())
  File "/home/jenkins/workspace/7.0.system_test.ubuntu.ha_neutron_destructive/fuelweb_test/helpers/decorators.py", line 80, in wrapper
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/7.0.system_test.ubuntu.ha_neutron_destructive/fuelweb_test/tests/tests_strength/test_failover.py", line 313, in ha_neutron_test_3_1_rabbit_failover
    super(self.__class__, self).test_3_1_rabbit_failover()
  File "/home/jenkins/workspace/7.0.system_test.ubuntu.ha_neutron_destructive/fuelweb_test/tests/tests_strength/test_failover_base.py", line 1025, in test_3_1_rabbit_failover
    self.fuel_web.assert_ha_services_ready(cluster_id, timeout=300)
  File "/home/jenkins/workspace/7.0.system_test.ubuntu.ha_neutron_destructive/fuelweb_test/__init__.py", line 58, in wrapped
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/7.0.system_test.ubuntu.ha_neutron_destructive/fuelweb_test/models/fuel_web_client.py", line 155, in assert_ha_services_ready
    interval=20, timeout=timeout)
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/devops/helpers/helpers.py", line 108, in _wait
    return raising_predicate()
  File "/home/jenkins/workspace/7.0.system_test.ubuntu.ha_neutron_destructive/fuelweb_test/models/fuel_web_client.py", line 154, in <lambda>
    should_fail=should_fail),
  File "/home/jenkins/workspace/7.0.system_test.ubuntu.ha_neutron_destructive/fuelweb_test/__init__.py", line 58, in wrapped
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/7.0.system_test.ubuntu.ha_neutron_destructive/fuelweb_test/models/fuel_web_client.py", line 1141, in run_ostf
    failed_test_name=failed_test_name, test_sets=test_sets)
  File "/home/jenkins/workspace/7.0.system_test.ubuntu.ha_neutron_destructive/fuelweb_test/__init__.py", line 58, in wrapped
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/7.0.system_test.ubuntu.ha_neutron_destructive/fuelweb_test/models/fuel_web_client.py", line 258, in assert_ostf_run
    indent=1)))
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/proboscis/asserts.py", line 163, in assert_true
    raise ASSERTION_ERROR(message)
AssertionError: Failed 2 OSTF tests; should fail 0 tests. Names of failed tests: [
 {
  "RabbitMQ availability (failure)": "Time limit exceeded while waiting for to finish. Please refer to OpenStack logs for more details."
 },
 {
  "RabbitMQ replication (failure)": "Failed to establish AMQP connection to 5673/tcp port on 10.109.7.4 from controller node! Please refer to OpenStack logs for more details."
 }]

We need to replace the suspend/resume actions with destroy/start in the system tests for RabbitMQ failover.
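
A minimal sketch of the proposed destroy/start flow, for illustration only. FakeNode, wait_until and rabbit_failover_cycle are hypothetical names and are not taken from fuel-qa or fuel-devops; the health check is passed in as a plain callable, and the timeout is left to the caller because the value chosen in the real fix is not stated here.

```python
import time


class FakeNode:
    """Hypothetical stand-in for a virtual controller node (not a fuel-devops class)."""

    def __init__(self, name):
        self.name = name
        self.state = "running"

    def destroy(self):
        # Hard power-off: unlike suspend, the guest keeps no in-memory state.
        self.state = "shut off"

    def start(self):
        # Cold boot of a previously destroyed node.
        self.state = "running"


def wait_until(predicate, timeout, interval):
    """Poll predicate() until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def rabbit_failover_cycle(node, cluster_is_healthy, timeout, interval=20):
    """Destroy a controller, check health, start it again, check health again."""
    node.destroy()
    if not wait_until(cluster_is_healthy, timeout, interval):
        raise AssertionError(
            "cluster unhealthy after destroying {0}".format(node.name))

    node.start()
    if not wait_until(cluster_is_healthy, timeout, interval):
        raise AssertionError(
            "cluster unhealthy after starting {0}".format(node.name))
```

In a real test the callable would wrap the OSTF HA checks, and the timeout would need to cover a full cold boot of the controller plus RabbitMQ rejoining the cluster.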

Artem Panchenko (apanchenko-8) wrote :
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-qa (master)

Fix proposed to branch: master
Review: https://review.openstack.org/241961

Changed in fuel:
status: Confirmed → In Progress
Dmitry Pyzhov (dpyzhov)
no longer affects: fuel/8.0.x
tags: added: area-qa
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-qa (master)

Reviewed: https://review.openstack.org/241961
Committed: https://git.openstack.org/cgit/openstack/fuel-qa/commit/?id=9b2b40ade967be2fe1e0056cb0c9394b166cca92
Submitter: Jenkins
Branch: master

commit 9b2b40ade967be2fe1e0056cb0c9394b166cca92
Author: Tatyana Leontovich <email address hidden>
Date: Wed Nov 4 12:11:48 2015 +0200

    Increate timeout for 3-1 test

    Change suspend action to
    destroy and start and increase timeout

    Change-Id: I1b528ec50417ce8ef845a57ce814487306c386a4
    Closes-Bug: #1512735

Changed in fuel:
status: In Progress → Fix Committed
Changed in fuel:
status: Fix Committed → Fix Released
tags: added: non-release
Alexey Stupnikov (astupnikov) wrote :

We no longer support MOS5.1, MOS6.0, MOS6.1
We deliver only Critical/Security fixes to MOS7.0, MOS8.0.
We deliver only High/Critical/Security fixes to MOS9.2.
