juju add-unit --to lxd:7 resulted in non-operational cluster

Bug #1832512 reported by David O Neill
This bug affects 1 person
Affects                          Status   Importance  Assigned to  Milestone
OpenStack RabbitMQ Server Charm  Expired  Undecided   Unassigned

Bug Description

   Description
   ===========
   A node outage brought rabbitmq-server units from 3/3 to 2/3.
   Engineers attempted to repair the cluster, and the attempt left it non-operational.

   Steps to reproduce
   ==================
   juju add-unit rabbitmq-server --to lxd:7
   # 3/4
   wait for "ready and clustered"
   juju remove-machine --force (offline machine)
   # 3/3

   Expected result
   ===============
   RabbitMQ clustered and fully operational

   Actual result
   =============
   RabbitMQ became unresponsive
   # the following command hung
   juju run -a rabbitmq-server 'rabbitmqctl cluster_status; rabbitmqctl list_queues -p openstack |wc -l'

   Suspected cause: loss of quorum, loss of the leader, a split (partitioned) cluster, or a similar scenario.
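
   When a diagnostic command hangs as described above, bounding it with a time limit at least separates "hung" from "failed". The following is a minimal sketch, not something taken from this report: check_with_timeout is a hypothetical helper name, and it assumes GNU coreutils timeout is available on the unit (timeout exits with status 124 when the limit is reached).

```shell
#!/bin/sh
# Hypothetical helper (not part of the charm or of juju):
# run a command under a time limit and say so if the limit was hit.
check_with_timeout() {
    limit="$1"
    shift
    # GNU coreutils timeout kills the command after $limit and exits 124.
    timeout "$limit" "$@"
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "HUNG (limit ${limit}): $*"
    fi
    return "$rc"
}
```

   On a rabbitmq-server unit this could be called as `check_with_timeout 60 rabbitmqctl cluster_status`. The same idea applies to the command in the report by prefixing each rabbitmqctl call with `timeout 60` inside the quoted string passed to juju run; the 60-second limit is an arbitrary illustrative value.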

   Environment
   ===========

   juju --version
   2.6.3-xenial-amd6

   juju config keystone openstack-origin
   cloud:xenial-ocata

   Logs
   ====
   https://canonical.my.salesforce.com/5003z00001yVYpI?srPos=0&srKp=500

   sosreport-juju-controller1.00230959-20190612113112.tar.xz
   sosreport-juju-f20ba5-27-lxd-0.00230959-20190612110204.tar.xz
   sosreport-juju-f20ba5-29-lxd-1.00230959-20190612112052.tar.xz
   sosreport-juju-f20ba5-7-lxd-1.00230959-20190612112743.tar.xz

Alex Kavanagh (ajkavanagh) wrote :

Unfortunately, the logs aren't available (they are locked inside Salesforce), so it's not clear whether this is a transient error (i.e. a problem with RabbitMQ) or an issue with the charm doing the wrong thing. Please could you either attach sanitized logs to the bug report or paste the relevant errors from logs/status reports. Thanks.

Changed in charm-rabbitmq-server:
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack rabbitmq-server charm because there has been no activity for 60 days.]

Changed in charm-rabbitmq-server:
status: Incomplete → Expired