[BVT][9.0] OSTF failure: Number of RabbitMQ nodes is not equal to number of cluster nodes

Bug #1653722 reported by Roman Podoliaka
This bug affects 1 person
Affects              Status    Importance   Assigned to   Milestone
Mirantis OpenStack   Invalid   High         MOS Oslo

Bug Description

The 9.0 BVT build ( https://ci.fuel-infra.org/job/9.0-community.main.ubuntu.bvt_2/771/ ) failed with the following error after a successful deployment:

Traceback (most recent call last):
  File "/home/jenkins/workspace/9.0-community.main.ubuntu.bvt_2/core/helpers/log_helpers.py", line 215, in wrapped
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/9.0-community.main.ubuntu.bvt_2/fuelweb_test/models/fuel_web_client.py", line 1369, in run_ostf
    failed_test_name=failed_test_name, test_sets=test_sets)
  File "/home/jenkins/workspace/9.0-community.main.ubuntu.bvt_2/core/helpers/log_helpers.py", line 215, in wrapped
    result = func(*args, **kwargs)
  File "/home/jenkins/workspace/9.0-community.main.ubuntu.bvt_2/fuelweb_test/models/fuel_web_client.py", line 305, in assert_ostf_run
    indent=1)))
  File "/home/jenkins/venv-nailgun-tests-2.9/local/lib/python2.7/site-packages/proboscis/asserts.py", line 163, in assert_true
    raise ASSERTION_ERROR(message)
AssertionError: Failed 3 OSTF tests; should fail 0 tests. Names of failed tests:
  - Check pacemaker status (failure) On the controller node-2.test.domain.local, resource master_p_rabbitmq-server is active but failed to start (managed).. Please refer to OpenStack logs for more details.
  - RabbitMQ availability (failure) Number of RabbitMQ nodes is not equal to number of cluster nodes.
  - RabbitMQ replication (failure) Failed to establish AMQP connection to 5673/tcp port on 10.109.6.2 from controller node! Please refer to OpenStack logs for more details.
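
For readers unfamiliar with the second failure, the idea behind a "number of RabbitMQ nodes equals number of cluster nodes" check can be sketched roughly as below. This is only an illustration, not the actual OSTF implementation: the function name `missing_rabbit_nodes` and the three-node list are hypothetical. Only the `rabbit@messaging-node-N` naming pattern comes from the logs in this report.

```python
def missing_rabbit_nodes(expected_nodes, running_nodes):
    """Return expected RabbitMQ nodes that are not currently running.

    Hypothetical helper for illustration only; the real OSTF check
    may gather and compare this information differently.
    """
    return sorted(set(expected_nodes) - set(running_nodes))


# Illustrative data (assumed, not taken from the failed build).
expected = [
    "rabbit@messaging-node-1",
    "rabbit@messaging-node-2",
    "rabbit@messaging-node-3",
]
running = ["rabbit@messaging-node-1", "rabbit@messaging-node-3"]

missing = missing_rabbit_nodes(expected, running)
if missing:
    print("RabbitMQ nodes not running: %s" % ", ".join(missing))
```

A non-empty result corresponds to the node counts differing, which is the condition the failed test reports.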

Tags: area-oslo
Roman Podoliaka (rpodolyaka) wrote :
Changed in mos:
milestone: none → 9.2
assignee: nobody → MOS Oslo (mos-oslo)
tags: added: area-oslo
Alexey Lebedeff (alebedev-a) wrote :

The RabbitMQ logs on node-2 show a network error between it and node-1:

=INFO REPORT==== 2-Jan-2017::17:04:43 ===
node 'rabbit@messaging-node-1' down: etimedout

It took Pacemaker around 2 minutes to recover the cluster:

=INFO REPORT==== 2-Jan-2017::17:06:50 ===
Starting RabbitMQ 3.6.5 on Erlang 18.1
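
The gap between the two report headers quoted above can be double-checked with a short script. This is a minimal sketch: the helper name `parse_report_header` is made up for illustration, and it assumes every header has the `=INFO REPORT==== <timestamp> ===` shape shown in these excerpts and that the month abbreviation is parsed in an English locale.

```python
from datetime import datetime

# Timestamp format seen in the report headers quoted above,
# e.g. "=INFO REPORT==== 2-Jan-2017::17:04:43 ==="
REPORT_TS_FORMAT = "%d-%b-%Y::%H:%M:%S"


def parse_report_header(line):
    """Extract the timestamp from a '=INFO REPORT==== <ts> ===' line."""
    # Fields after splitting on whitespace:
    # ['=INFO', 'REPORT====', '<timestamp>', '===']
    return datetime.strptime(line.split()[2], REPORT_TS_FORMAT)


down = parse_report_header("=INFO REPORT==== 2-Jan-2017::17:04:43 ===")
restart = parse_report_header("=INFO REPORT==== 2-Jan-2017::17:06:50 ===")

gap_seconds = (restart - down).total_seconds()
print(gap_seconds)  # 127.0, i.e. just over 2 minutes
```

This only measures the time between the node-down report and the RabbitMQ restart line; it says nothing about when the cluster as a whole became healthy again.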

This looks like a transient issue, and everything should work fine after restarting the test.

Alexey Lebedeff (alebedev-a) wrote :

Actually, recovery took a bit longer: around 5 minutes.

But I don't think we'll be able to find out why the network connection timed out.

Changed in mos:
status: New → Confirmed
importance: Undecided → High
Vitaly Sedelnik (vsedelnik) wrote :

This no longer reproduces; it looks like a one-off environment issue. Marking as Invalid.

Changed in mos:
status: Confirmed → Invalid