OpenStack: cable pull test for cluster and MGT on active controller cause both controllers as not active

Bug #1840182 reported by Anujeyan Manokeran
This bug affects 1 person
Affects: StarlingX
Status: Won't Fix
Importance: High
Assigned to: Bart Wensley

Bug Description

Brief Description
-----------------
When I pulled the cluster and MGT cables together on the active controller (controller-1) and then put them back, the active controller (controller-1) got rebooted and controller-0 became active as expected. After some time, both controller-0 and controller-1 went into a state where neither of them was active. There was no recovery; they did not reboot themselves.
controller-0:~$ source /etc/platform/openrc
Openstack Admin credentials can only be loaded from the active controller.

controller-1:~$ source /etc/platform/openrc
Openstack Admin credentials can only be loaded from the active controller.

Bin did an analysis and found the following:

| 2019-08-14T16:04:12.516 | 1161 | service-group-scn | vim-services | active-failed | disabling-failed | vim(disabled, failed)
| 2019-08-14T16:04:12.711 | 1190 | service-scn | mtc-agent | enabled-active | disabling | disable state requested
| 2019-08-14T16:04:13.795 | 1220 | service-group-scn | vim-services | disabling-failed | disabled-failed | vim(disabled, )
| 2019-08-14T16:04:15.311 | 1225 | service-scn | mtc-agent | disabling | disabled | disable success
| 2019-08-14T16:09:16.768 | 1282 | service-scn | mtc-agent | disabled | enabling-throttle | enabled-active state requested
| 2019-08-14T16:09:22.312 | 1366 | service-scn | mtc-agent | enabling-throttle | enabling | enabled-active state requested
| 2019-08-14T16:09:23.354 | 1373 | service-scn | mtc-agent | enabling | enabled-active | enable success

There was an issue with the VIM on controller-0:

2019-08-14T16:00:06.686 controller-0 VIM_Thread[333973] ERROR Queue.declare: (404) NOT_FOUND - home node '<email address hidden>' of durable queue 'notifications.nfvi_nova_listener_queue' in vhost 'nova' is down or inaccessible
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/nfv_vim/vim.py", line 161, in process_main
    init_complete = process_initialize()
  File "/usr/lib64/python2.7/site-packages/nfv_vim/vim.py", line 79, in process_initialize
    if not nfvi.nfvi_initialize(config.CONF['nfvi']):
  File "/usr/lib64/python2.7/site-packages/nfv_vim/nfvi/_nfvi_module.py", line 80, in nfvi_initialize
    _task_worker_pools['compute'])
  File "/usr/lib64/python2.7/site-packages/nfv_vim/nfvi/_nfvi_compute_module.py", line 464, in nfvi_compute_initialize
    _compute_plugin.initialize(config['config_file'])
  File "/usr/lib64/python2.7/site-packages/nfv_vim/nfvi/_nfvi_plugin.py", line 94, in initialize
    self._plugin.obj.initialize(config_file)
  File "/usr/lib64/python2.7/site-packages/nfv_plugins/nfvi_plugins/nfvi_compute_api.py", line 3393, in initialize
    'notifications.nfvi_nova_listener_queue')
  File "/usr/lib64/python2.7/site-packages/nfv_plugins/nfvi_plugins/openstack/rpc_listener.py", line 42, in __init__
    self._consumer = Consumer(self._connection, self._rpc_receive_queue)
  File "/usr/lib/python2.7/site-packages/kombu/messaging.py", line 386, in __init__
    self.revive(self.channel)
  File "/usr/lib/python2.7/site-packages/kombu/messaging.py", line 407, in revive
    self.declare()
  File "/usr/lib/python2.7/site-packages/kombu/messaging.py", line 420, in declare
    queue.declare()
  File "/usr/lib/python2.7/site-packages/kombu/entity.py", line 605, in declare
    self._create_queue(nowait=nowait, channel=channel)
  File "/usr/lib/python2.7/site-packages/kombu/entity.py", line 614, in _create_queue
    self.queue_declare(nowait=nowait, passive=False, channel=channel)
  File "/usr/lib/python2.7/site-packages/kombu/entity.py", line 649, in queue_declare
    nowait=nowait,
  File "/usr/lib/python2.7/site-packages/amqp/channel.py", line 1165, in queue_declare
    spec.Queue.DeclareOk, returns_tuple=True,
  File "/usr/lib/python2.7/site-packages/amqp/abstract_channel.py", line 93, in wait
    self.connection.drain_events(timeout=timeout)
  File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 464, in drain_events
    return self.blocking_read(timeout)
  File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 469, in blocking_read
    return self.on_inbound_frame(frame)
  File "/usr/lib/python2.7/site-packages/amqp/method_framing.py", line 68, in on_frame
    callback(channel, method_sig, buf, None)
  File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 473, in on_inbound_method
    method_sig, payload, content,
  File "/usr/lib/python2.7/site-packages/amqp/abstract_channel.py", line 142, in dispatch_method
    listener(*args)
  File "/usr/lib/python2.7/site-packages/amqp/channel.py", line 293, in _on_close
    reply_code, reply_text, (class_id, method_id), ChannelError,
NotFound: Queue.declare: (404) NOT_FOUND - home node '<email address hidden>' of durable queue 'notifications.nfvi_nova_listener_queue' in vhost 'nova' is down or inaccessible
2019-08-14T16:00:06.690 controller-0 VIM_Image-Worker-0_Thread[334031] INFO _thread.py.232 Thread Image-Worker-0: shutting down.
2019-08-14T16:00:06.691 controller-0 VIM_Image-Worker-0_Thread[334031] INFO _thread.py.236 Thread Image-Worker-0: shutdown.
2019-08-14T16:00:06.698 controller-0 VIM_Compute-Worker-0_Thread[334037] INFO _thread.py.232 Thread Compute-Worker-0: shutting down.
2019-08-14T16:00:06.699 controller-0 VIM_Compute-Worker-0_Thread[334037] INFO _thread.py.236 Thread Compute-Worker-0: shutdown.
2019-08-14T16:00:06.706 controller-0 VIM_Compute-Worker-1_Thread[334038] INFO _thread.py.232 Thread Compute-Worker-1: shutting down.
2019-08-14T16:00:06.707 controller-0 VIM_Compute-Worker-1_Thread[334038] INFO _thread.py.236 Thread Compute-Worker-1: shutdown.
2019-08-14T16:00:06.713 controller-0 VIM_BlockStorage-Worker-0_Thread[334034] INFO _thread.py.232 Thread BlockStorage-Worker-0: shutting down.
2019-08-14T16:00:06.714 controller-0 VIM_BlockStorage-Worker-0_Thread[334034] INFO _thread.py.236 Thread BlockStorage-Worker-0: shutdown.
2019-08-14T16:00:06.717 controller-0 VIM_Identity-Worker-0_Thread[334027] INFO _thread.py.232 Thread Identity-Worker-0: shutting down.
2019-08-14T16:00:06.717 controller-0 VIM_Identity-Worker-0_Thread[334027] INFO _thread.py.236 Thread Identity-Worker-0: shutdown.
2019-08-14T16:00:06.724 controller-0 VIM_Alarm_Thread[334024] INFO _thread.py.232 Thread Alarm: shutting down.
2019-08-14T16:00:06.724 controller-0 VIM_Alarm_Thread[334024] INFO _thread.py.236 Thread Alarm: shutdown.
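For illustration, here is a minimal kombu sketch of the pattern that fails in the traceback above (this is not the VIM code; the broker URL, credentials and host name are placeholders). Constructing a Consumer auto-declares its queues, and declaring a durable queue whose home node is down makes RabbitMQ answer 404 NOT_FOUND instead of recreating the queue.

# Minimal sketch of the failing pattern; connection details are placeholders.
from amqp.exceptions import NotFound
from kombu import Connection, Consumer, Queue

# Durable queues are pinned to a "home node" in the RabbitMQ cluster. If that
# node is down, re-declaring the queue fails rather than recreating it.
queue = Queue('notifications.nfvi_nova_listener_queue', durable=True)

with Connection('amqp://user:password@rabbit-host:5672/nova') as conn:
    try:
        # Consumer.__init__ -> revive() -> declare() -> Queue.declare(),
        # the call chain shown in the traceback above.
        consumer = Consumer(conn, queues=[queue])
    except NotFound as exc:
        # e.g. "Queue.declare: (404) NOT_FOUND - home node ... of durable
        # queue 'notifications.nfvi_nova_listener_queue' ... is down"
        print('queue declare failed: %s' % exc)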

The mtcAgent was restarted as part of failure recovery, but it did not send a node update to SM to indicate that controller-1 is unlocked-enabled. SM on controller-0 therefore does not know that controller-1 is enabled; it still has the controller-1 state as failed, and in this case the uncontrolled swact was not successful.

There are two issues. The first is obviously that the VIM had a problem going active after controller-0 took over as the active controller. The second is that an uncontrolled swact should not start before the standby controller is confirmed enabled.

controller-0:~$ source /etc/platform/openrc
Openstack Admin credentials can only be loaded from the active controller.

controller-1:~$ source /etc/platform/openrc
Openstack Admin credentials can only be loaded from the active controller.

controller-0:~$ sudo sm-dump
sudo: ldap_sasl_bind_s(): Can't contact LDAP server
Password:

-Service_Groups------------------------------------------------------------------------
oam-services disabled disabled
controller-services disabled disabled
cloud-services disabled disabled
patching-services active active
directory-services active active
web-services active active
storage-services active active
storage-monitoring-services active active
vim-services disabled disabled failed
---------------------------------------------------------------------------------------

-Services------------------------------------------------------------------------------
oam-ip disabled disabled
management-ip disabled disabled
drbd-pg disabled disabled
drbd-rabbit disabled disabled
drbd-platform disabled disabled
pg-fs disabled disabled
rabbit-fs disabled disabled
nfs-mgmt disabled disabled
platform-fs disabled disabled
postgres disabled disabled
rabbit disabled disabled
platform-export-fs disabled disabled
platform-nfs-ip disabled disabled
sysinv-inv disabled disabled
sysinv-conductor disabled disabled
mtc-agent disabled disabled
hw-mon disabled disabled
dnsmasq disabled disabled
fm-mgr disabled disabled
keystone disabled disabled
open-ldap enabled-active enabled-active
snmp disabled disabled
lighttpd enabled-active enabled-active
horizon enabled-active enabled-active
patch-alarm-manager enabled-active enabled-active
mgr-restful-plugin enabled-active enabled-active
ceph-manager enabled-active enabled-active
vim disabled disabled
vim-api disabled disabled
vim-webserver disabled disabled
guest-agent disabled disabled
haproxy disabled disabled
pxeboot-ip disabled disabled
drbd-extension disabled disabled
extension-fs disabled disabled
extension-export-fs disabled disabled
etcd disabled disabled
drbd-etcd disabled disabled
etcd-fs disabled disabled
barbican-api disabled disabled
barbican-keystone-listener disabled disabled
barbican-worker disabled disabled
cluster-host-ip disabled disabled
docker-distribution disabled disabled
dockerdistribution-fs disabled disabled
drbd-dockerdistribution disabled disabled
helmrepository-fs disabled disabled
registry-token-server disabled disabled

controller-1:~$ sudo sm-dump
sudo: ldap_sasl_bind_s(): Can't contact LDAP server
Password:

-Service_Groups------------------------------------------------------------------------
oam-services standby standby
controller-services standby standby
cloud-services standby standby
patching-services standby standby
directory-services disabled disabled
web-services disabled disabled
storage-services disabled disabled
storage-monitoring-services standby standby
vim-services standby standby
---------------------------------------------------------------------------------------

-Services------------------------------------------------------------------------------
oam-ip enabled-standby disabled
management-ip enabled-standby disabled
drbd-pg enabled-standby enabled-standby
drbd-rabbit enabled-standby enabled-standby
drbd-platform enabled-standby enabled-standby
pg-fs enabled-standby disabled
rabbit-fs enabled-standby disabled
nfs-mgmt enabled-standby disabled
platform-fs enabled-standby disabled
postgres enabled-standby disabled
rabbit enabled-standby disabled
platform-export-fs enabled-standby disabled
platform-nfs-ip enabled-standby disabled
sysinv-inv enabled-standby disabled
sysinv-conductor enabled-standby disabled
mtc-agent enabled-standby disabled
hw-mon enabled-standby disabled
dnsmasq enabled-standby disabled
fm-mgr enabled-standby disabled
keystone enabled-standby disabled
open-ldap disabled disabled
snmp enabled-standby disabled
lighttpd disabled disabled
horizon disabled disabled
patch-alarm-manager enabled-standby disabled
mgr-restful-plugin disabled disabled
ceph-manager enabled-standby disabled
vim enabled-standby disabled
vim-api enabled-standby disabled
vim-webserver enabled-standby disabled
guest-agent enabled-standby disabled
haproxy enabled-standby disabled
pxeboot-ip enabled-standby disabled
drbd-extension enabled-standby enabled-standby
extension-fs enabled-standby disabled
extension-export-fs enabled-standby disabled
etcd enabled-standby disabled
drbd-etcd enabled-standby enabled-standby
etcd-fs enabled-standby disabled
barbican-api enabled-standby disabled
barbican-keystone-listener enabled-standby disabled
barbican-worker enabled-standby disabled
cluster-host-ip enabled-standby disabled
docker-distribution enabled-standby disabled
dockerdistribution-fs enabled-standby disabled
drbd-dockerdistribution enabled-standby enabled-standby
helmrepository-fs enabled-standby disabled
registry-token-server enabled-standby disabled
------------------------------------------------------------------------------
controller-1:~$

Severity
--------
Major
Steps to Reproduce
------------------
1. Make sure the system is installed and in good health, with no alarms.
2. Pull the cables provisioned with the management and cluster networks from the active controller (the above test was on controller-1).
3. Put the cables back as they were before.
4. Controller-1 reboots and controller-0 becomes active.
5. After some time controller-0 was not active and the admin credentials were not working, as per the description.
controller-0:~$ source /etc/platform/openrc
Openstack Admin credentials can only be loaded from the active controller.

Expected Behavior
------------------
Controller-0 is active and the OpenStack admin credentials can be loaded.

Actual Behavior
----------------
As per the description, there is no active controller.
Reproducibility
---------------
Not sure; tested once. With the same load, controller-0 rebooted and became disabled-failed.
System Configuration
--------------------
Storage system

Branch/Pull Time/Commit
-----------------------
BUILD_DATE= 2019-08-12 21:00:17 -0400

Last Pass
---------
2019-02-10_20-18-00

Timestamp/Logs
--------------
2019-08-14T16:00:06

Test Activity
-------------
Regression test

Revision history for this message
Bin Qian (bqian20) wrote :

I looked at the logs again; it actually did not have the second issue of an uncontrolled swact starting before the other controller was enabled. It was a service failure-recovery on a single controller (because SM did not receive the message that controller-1 was enabled, SM on controller-0 had the controller-1 state as unlocked-failed), with a 5-minute debounce wait time.

| 2019-08-14T16:09:16.662 | 1252 | service-group-scn | vim-services | disabled | go-active |
| 2019-08-14T16:09:24.369 | 1384 | service-group-scn | vim-services | go-active | active |
| 2019-08-14T16:14:33.679 | 1485 | service-group-scn | vim-services | disabled | go-active |
| 2019-08-14T16:14:41.673 | 1617 | service-group-scn | vim-services | go-active | active |
| 2019-08-14T16:19:46.528 | 1718 | service-group-scn | vim-services | disabled | go-active |
| 2019-08-14T16:19:54.490 | 1850 | service-group-scn | vim-services | go-active | active |
| 2019-08-14T16:25:03.220 | 1951 | service-group-scn | vim-services | disabled | go-active |

A vim patch should fix the issue reported.

Revision history for this message
Brent Rowsell (brent-rowsell) wrote :

Can you clarify "infra"? There is no infra interface anymore.

Revision history for this message
Anujeyan Manokeran (anujeyan) wrote : Re: cable pull test for cluster and MGT on active controller cause both controllers are not active

I mean the cluster network. I have removed "infra" from the Launchpad text.

summary: - cable pull test for infra and MGT on active controller cause both
+ cable pull test for cluster and MGT on active controller cause both
controllers are not active
description: updated
Numan Waheed (nwaheed)
tags: added: stx.retestneeded
Numan Waheed (nwaheed)
summary: cable pull test for cluster and MGT on active controller cause both
- controllers are not active
+ controllers as not active
Revision history for this message
Ghada Khalil (gkhalil) wrote : Re: cable pull test for cluster and MGT on active controller cause both controllers as not active

Marking as stx.3.0 / high priority -- failure scenario not handled properly by the system.

tags: added: stx.nfv
tags: added: stx.3.0
Changed in starlingx:
importance: Undecided → High
status: New → Triaged
assignee: nobody → Bart Wensley (bartwensley)
Revision history for this message
Bart Wensley (bartwensley) wrote :

The VIM on controller-0 is failing to connect to the nova notifications queue:

NotFound: Queue.declare: (404) NOT_FOUND - home node '<email address hidden>' of durable queue 'notifications.nfvi_nova_listener_queue' in vhost 'nova' is down or inaccessible

This error indicates that the queue exists, but cannot be connected to. The rabbitmq logs indicate that rabbitmq is shutting down on controller-0:

{"log":"2019-08-14 15:58:15.535 [info] \u003c0.472.0\u003e Autoheal request received from '<email address hidden>'\n","stream":"stdout","time":"2019-08-14T15:58:15.535800761Z"}
{"log":"2019-08-14 15:58:15.536 [info] \u003c0.472.0\u003e Autoheal decision\n","stream":"stdout","time":"2019-08-14T15:58:15.536559958Z"}
{"log":" * Partitions: [['<email address hidden>'],['<email address hidden>']]\n","stream":"stdout","time":"2019-08-14T15:58:15.536586158Z"}
{"log":" * Winner: '<email address hidden>'\n","stream":"stdout","time":"2019-08-14T15:58:15.536601564Z"}
{"log":" * Losers: ['<email address hidden>']\n","stream":"stdout","time":"2019-08-14T15:58:15.536613184Z"}
{"log":"2019-08-14 15:58:15.536 [warning] \u003c0.472.0\u003e Autoheal: we were selected to restart; winner is '<email address hidden>'\n","stream":"stdout","time":"2019-08-14T15:58:15.537049161Z"}
{"log":"2019-08-14 15:58:15.536 [info] \u003c0.20917.1\u003e RabbitMQ is asked to stop...\n","stream":"stdout","time":"2019-08-14T15:58:15.537061637Z"}
{"log":"2019-08-14 15:58:15.585 [info] \u003c0.20917.1\u003e Stopping RabbitMQ applications and their dependencies in the following order:\n","stream":"stdout","time":"2019-08-14T15:58:15.585461935Z"}
{"log":" rabbitmq_management\n","stream":"stdout","time":"2019-08-14T15:58:15.585474435Z"}
{"log":" amqp_client\n","stream":"stdout","time":"2019-08-14T15:58:15.585478042Z"}
{"log":" rabbitmq_web_dispatch\n","stream":"stdout","time":"2019-08-14T15:58:15.585481157Z"}
{"log":" cowboy\n","stream":"stdout","time":"2019-08-14T15:58:15.585484303Z"}
{"log":" cowlib\n","stream":"stdout","time":"2019-08-14T15:58:15.585491628Z"}
{"log":" rabbitmq_peer_discovery_k8s\n","stream":"stdout","time":"2019-08-14T15:58:15.58549482Z"}
{"log":" rabbitmq_peer_discovery_common\n","stream":"stdout","time":"2019-08-14T15:58:15.585497784Z"}
{"log":" rabbitmq_management_agent\n","stream":"stdout","time":"2019-08-14T15:58:15.585500727Z"}
{"log":" rabbit\n","stream":"stdout","time":"2019-08-14T15:58:15.585503554Z"}
{"log":" mnesia\n","stream":"stdout","time":"2019-08-14T15:58:15.585506253Z"}
{"log":" rabbit_common\n","stream":"stdout","time":"2019-08-14T15:58:15.585508974Z"}
{"log":" sysmon_handler\n","stream":"stdout","time":"2019-08-14T15:58:15.58...


Revision history for this message
Dariush Eslimi (deslimi) wrote :

Can we reproduce this and keep the lab in that state so we can debug further?

Revision history for this message
Ghada Khalil (gkhalil) wrote :

This sounds similar to https://bugs.launchpad.net/starlingx/+bug/1835807

Agree with Dariush that we need to know if this is still reproducible with OpenStack Train.

Changed in starlingx:
status: Triaged → Incomplete
Revision history for this message
Ghada Khalil (gkhalil) wrote :
Changed in starlingx:
status: Incomplete → Won't Fix
Yang Liu (yliu12)
summary: - cable pull test for cluster and MGT on active controller cause both
- controllers as not active
+ OpenStack: cable pull test for cluster and MGT on active controller
+ cause both controllers as not active
Revision history for this message
Ghada Khalil (gkhalil) wrote :

LP was closed as Won't Fix a long time ago; removing the stx.retestneeded tag

tags: removed: stx.retestneeded