mojo upgrade stable_to_next_ha xenial queens failed with rabbitmq app startup timing out
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack RabbitMQ Server Charm |
Confirmed
|
Medium
|
Unassigned |
Bug Description
Running stable_to_next_ha in mojo for xenial/queens, the rabbitmq server failed on the upgrade:
juju status rabbitmq-server
Model Controller Cloud/Region Version SLA Timestamp
mojo serverstack serverstack/
App Version Status Scale Charm Store Rev OS Notes
rabbitmq-server 3.6.10 error 3 rabbitmq-server local 151 ubuntu
Unit Workload Agent Machine Public address Ports Message
rabbitmq-server/0* error idle 37 10.5.0.51 5672/tcp hook failed: "config-changed"
rabbitmq-server/1 active idle 38 10.5.0.44 5672/tcp Unit is ready and clustered
rabbitmq-server/2 active idle 39 10.5.0.80 5672/tcp Unit is ready and clustered
Machine State DNS Inst id Series AZ Message
37 started 10.5.0.51 49c1ca9a-
38 started 10.5.0.44 3a5260d1-
39 started 10.5.0.80 c0aa0ae0-
Excerpt from end of juju log for rabbitmq-server/0 unit:
2019-04-15 15:23:08 INFO juju-log Cookie already synchronized with peer.
2019-04-15 15:23:08 WARNING juju-log min-cluster-size is not defined, race conditions may occur if this is not a single unit deployment.
2019-04-15 15:23:08 INFO juju-log Clustering with new node
2019-04-15 15:23:08 INFO juju-log Host already clustered with rabbit@
2019-04-15 15:23:08 DEBUG juju-log am I clustered?: True
2019-04-15 15:23:10 DEBUG juju-log Running ['/usr/
2019-04-15 15:23:11 DEBUG config-changed Setting policy "TTL" for pattern "^(versioned_
2019-04-15 15:23:12 DEBUG juju-log Waiting 0 seconds for operation ...
2019-04-15 15:23:55 DEBUG config-changed Job for rabbitmq-
code. See "systemctl status rabbitmq-
2019-04-15 15:23:55 DEBUG juju-log Waiting for rabbitmq app to start: /<email address hidden>
2019-04-15 15:23:56 DEBUG juju-log Running ['timeout', '180', '/usr/sbin/
2019-04-15 15:23:57 DEBUG config-changed Waiting for 'rabbit@
2019-04-15 15:23:57 DEBUG config-changed pid is 961
2019-04-15 15:23:57 DEBUG config-changed Error: process_not_running
2019-04-15 15:23:59 DEBUG config-changed Error: unable to connect to node 'rabbit@
2019-04-15 15:23:59 DEBUG config-changed
2019-04-15 15:23:59 DEBUG config-changed DIAGNOSTICS
2019-04-15 15:23:59 DEBUG config-changed ===========
2019-04-15 15:23:59 DEBUG config-changed
2019-04-15 15:23:59 DEBUG config-changed attempted to contact: ['rabbit@
2019-04-15 15:23:59 DEBUG config-changed
2019-04-15 15:23:59 DEBUG config-changed rabbit@
2019-04-15 15:23:59 DEBUG config-changed * connected to epmd (port 4369) on juju-d3ae8a-mojo-37
2019-04-15 15:23:59 DEBUG config-changed * epmd reports: node 'rabbit' not running at all
2019-04-15 15:23:59 DEBUG config-changed no other nodes on juju-d3ae8a-mojo-37
2019-04-15 15:23:59 DEBUG config-changed * suggestion: start the node
2019-04-15 15:23:59 DEBUG config-changed
2019-04-15 15:23:59 DEBUG config-changed current node details:
2019-04-15 15:23:59 DEBUG config-changed - node name: 'rabbitmq-
2019-04-15 15:23:59 DEBUG config-changed - home dir: /var/lib/rabbitmq
2019-04-15 15:23:59 DEBUG config-changed - cookie hash: qG36QeKV+
2019-04-15 15:23:59 DEBUG config-changed
2019-04-15 15:23:59 DEBUG config-changed Traceback (most recent call last):
2019-04-15 15:23:59 DEBUG config-changed File "/var/lib/
2019-04-15 15:23:59 DEBUG config-changed hooks.execute(
2019-04-15 15:23:59 DEBUG config-changed File "/var/lib/
2019-04-15 15:23:59 DEBUG config-changed self._hooks[
2019-04-15 15:23:59 DEBUG config-changed File "/var/lib/
2019-04-15 15:23:59 DEBUG config-changed wait_app()
2019-04-15 15:23:59 DEBUG config-changed File "/var/lib/
2019-04-15 15:23:59 DEBUG config-changed raise ex
2019-04-15 15:23:59 DEBUG config-changed File "/var/lib/
2019-04-15 15:23:59 DEBUG config-changed rabbitmqctl('wait', pid_file)
2019-04-15 15:23:59 DEBUG config-changed File "/var/lib/
2019-04-15 15:23:59 DEBUG config-changed subprocess.
2019-04-15 15:23:59 DEBUG config-changed File "/usr/lib/
2019-04-15 15:23:59 DEBUG config-changed raise CalledProcessEr
2019-04-15 15:23:59 DEBUG config-changed subprocess.
2019-04-15 15:24:00 ERROR juju.worker.
Changed in charm-rabbitmq-server: | |
importance: | Undecided → Medium |
status: | New → Confirmed |
Changed in charm-rabbitmq-server: | |
milestone: | none → 19.07 |
Changed in charm-rabbitmq-server: | |
milestone: | 19.07 → 19.10 |
Changed in charm-rabbitmq-server: | |
milestone: | 19.10 → 20.01 |
tags: | added: charm-upgrade |
Changed in charm-rabbitmq-server: | |
milestone: | 20.01 → 20.05 |
Changed in charm-rabbitmq-server: | |
milestone: | 20.05 → 20.08 |
Changed in charm-rabbitmq-server: | |
milestone: | 20.08 → none |
More details from another run: from the failed unit, it didn't restart:
Apr 15 16:44:28 juju-494bf1-mojo-37 rabbitmq[13822]: pid is 13832 server. service: Start-post operation timed out. Stopping. server. service: State 'stop-final- sigterm' timed out. Killing. server. service: Unit entered failed state. server. service: Failed with result 'timeout'.
Apr 15 16:45:57 juju-494bf1-mojo-37 systemd[1]: rabbitmq-
Apr 15 16:47:27 juju-494bf1-mojo-37 systemd[1]: rabbitmq-
Apr 15 16:47:27 juju-494bf1-mojo-37 systemd[1]: Failed to start RabbitMQ Messaging Server.
Apr 15 16:47:27 juju-494bf1-mojo-37 systemd[1]: rabbitmq-
Apr 15 16:47:27 juju-494bf1-mojo-37 systemd[1]: rabbitmq-