OpenStack Percona Cluster Charm

hacluster charm series upgrade failed during upgrade 16.04 to 18.04 [xenial->bionic]

Bug #1879135 reported by vinaya on 2020-05-17

This bug affects 1 person

Affects                          Status   Importance  Assigned to  Milestone
OpenStack HA Cluster Charm       Expired  Undecided   Unassigned
OpenStack Percona Cluster Charm  Invalid  Undecided   Unassigned

Bug Description

The mysql charm's series-upgrade hooks look for the crm114/crmsh packages, which may be removed during the upgrade process, and the hook then stays in a loop.
After installing the package manually, the upgrade hook completed, but the VIPs are still not configured automatically.
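
The "crm: not found" message in the log below suggests that the crm binary (shipped by the crmsh package) was not on PATH when the hook ran. A minimal sketch of checking for it before re-running the hook follows. The helper name missing_tools and the list of binaries checked are illustrative assumptions, not part of the charm.

```python
import shutil


def missing_tools(names, which=shutil.which):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    # Assumed binary names: crm (crmsh), corosync, pacemakerd (pacemaker).
    missing = missing_tools(["crm", "corosync", "pacemakerd"])
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All cluster tools found")
```

The which parameter is injectable only so the check can be exercised without the packages installed.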

Here is the error

unit-mysql-hacluster-0: 03:58:43 ERROR unit.mysql-hacluster/0.juju-log Pacemaker is down. Please manually start it.
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade Traceback (most recent call last):
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade File "/var/lib/juju/agents/unit-mysql-hacluster-0/charm/hooks/utils.py", line 897, in try_pcmk_wait
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade pcmk.wait_for_pcmk()
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade File "/var/lib/juju/agents/unit-mysql-hacluster-0/charm/hooks/pcmk.py", line 55, in wait_for_pcmk
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade "".format(retries, output))
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade pcmk.ServicesNotUp: Pacemaker or Corosync are still down after waiting for 12 retries. Last output: /bin/sh: 1: crm: not found
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade During handling of the above exception, another exception occurred:
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade Traceback (most recent call last):
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade File "/var/lib/juju/agents/unit-mysql-hacluster-0/charm/hooks/post-series-upgrade", line 658, in <module>
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade hooks.execute(sys.argv)
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade File "/var/lib/juju/agents/unit-mysql-hacluster-0/charm/charmhelpers/core/hookenv.py", line 934, in execute
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade self._hooks[hook_name]()
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade File "/var/lib/juju/agents/unit-mysql-hacluster-0/charm/hooks/post-series-upgrade", line 628, in series_upgrade_complete
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade config_changed()
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade File "/var/lib/juju/agents/unit-mysql-hacluster-0/charm/hooks/post-series-upgrade", line 196, in config_changed
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade try_pcmk_wait()
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade File "/var/lib/juju/agents/unit-mysql-hacluster-0/charm/hooks/utils.py", line 903, in try_pcmk_wait
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade raise pcmk.ServicesNotUp(msg)
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade pcmk.ServicesNotUp: Pacemaker is down. Please manually start it.
unit-mysql-hacluster-0: 03:58:43 ERROR juju.worker.uniter.operation hook "post-series-upgrade" failed: exit status 1
unit-mysql-hacluster-0: 04:01:25 INFO unit.mysql-hacluster/0.juju-log Running complete series upgrade hook
unit-mysql-hacluster-0: 04:01:28 INFO unit.mysql-hacluster/0.juju-log Making dir /usr/lib/ocf/resource.d/ceph root:root 555
unit-mysql-hacluster-0: 04:01:

mysql become

Tags:

vinaya (agrahar) wrote on 2020-05-17:

more readable output can be found here.

https://pastebin.ubuntu.com/p/TGFVGdZXDx/

followed the exact steps shared here.
https://jaas.ai/percona-cluster

Alex Kavanagh (ajkavanagh) wrote on 2020-05-18:

This seems to be affecting the hacluster charm and not percona-cluster. Marking as invalid.

Changed in charm-percona-cluster:
status:	New → Invalid
summary:	- mysql series upgrade 16.04 to 18.04 staying loop
	+ hacluster charm series upgrade failed during upgrade 16.04 to 18.04
	+ [xenial->bionic]

Alex Kavanagh (ajkavanagh) wrote on 2020-05-18:

Unfortunately, there's not enough detail to go on to understand what may have caused the problem. If possible, please could you obtain the logs (in /var/log/juju) for the affected percona cluster units - ideally a juju-crashdump, but otherwise the logs and the syslog.
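
If juju-crashdump is not available, the requested files can be bundled by hand. The following is a minimal sketch that packs whichever of the given paths exist into one archive. The function name bundle_logs and the archive name are hypothetical, and reading /var/log usually needs root.

```python
import tarfile
from pathlib import Path


def bundle_logs(paths, archive):
    """Add every existing path to a gzip tar archive; return the ones added."""
    added = []
    with tarfile.open(archive, "w:gz") as tar:
        for p in map(Path, paths):
            if p.exists():
                # Directories are added recursively by tarfile.
                tar.add(str(p), arcname=p.name)
                added.append(str(p))
    return added

# Example call (run on the affected unit's machine):
# bundle_logs(["/var/log/juju", "/var/log/syslog"], "unit-logs.tar.gz")
```

Missing paths are skipped rather than raising, so the same call works on units where syslog has already been rotated away.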

Changed in charm-hacluster:
status:	New → Incomplete

vinaya (agrahar) wrote on 2020-05-18:

Hi Alex,

Could you try a percona-cluster (mysql) series upgrade from Ubuntu 16.04 to Ubuntu 18.04? That should reproduce the issue.

As the percona upgrade was failing on the mysql non-leader units, I finally removed them and installed new ones, which solved my issue.

So currently I don't have logs for the failed units.

Alex Kavanagh (ajkavanagh) wrote on 2020-05-18:

Are you running distro charms (openstack-origin/source not set), or something else?

We run series upgrade weekly and have not seen this particular issue before, so it would be good to understand how it occurred. As there are no logs now, it's difficult to see how to analyse this one, unfortunately.

tags:	added: series-upgrade

vinaya (agrahar) wrote on 2020-05-18:

Hi Alex,

I will try to reproduce the issue and upload the logs.

I followed the exact steps shared here: https://jaas.ai/percona-cluster/288

I think the command given in the instructions, juju upgrade-series $MACHINE_NUMBER prepare $SERIES, should itself trigger the source option.

What's your opinion?

There is no openstack-origin option for the percona-cluster charm, so after following the steps I set source to distro.

Could you correct the steps below if anything is wrong?

1. Pause all non-leader units and the corresponding hacluster units.
2. juju upgrade-series $MACHINE_NUMBER prepare $SERIES
3. do-release-upgrade, plus any further administratively required steps for an upgrade.
4. Reboot.
5. Complete the series upgrade on the leader:
   juju upgrade-series $MACHINE_NUMBER complete
6. Administratively validate that the leader node database is up and running:
   Connect to the database and check for expected data.
   Review "SHOW GLOBAL STATUS;"
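
The sequence above can be written down as an ordered checklist. This is only a sketch: nothing is executed, the function name upgrade_plan is hypothetical, and the machine number and unit names in the example are placeholders. The two juju upgrade-series command lines are the ones quoted in the steps; every other entry is a manual step.

```python
def upgrade_plan(machine, series, non_leader_units):
    """Return the ordered steps as (kind, text) pairs; nothing is executed."""
    # One manual pause step per non-leader unit (placeholder unit names).
    steps = [
        ("manual", f"pause {unit} and its hacluster unit")
        for unit in non_leader_units
    ]
    steps += [
        ("command", f"juju upgrade-series {machine} prepare {series}"),
        ("command", "do-release-upgrade"),  # run on the machine being upgraded
        ("manual", "reboot"),
        ("command", f"juju upgrade-series {machine} complete"),
        ("manual", "validate the leader database; review SHOW GLOBAL STATUS;"),
    ]
    return steps


if __name__ == "__main__":
    # Placeholder values: machine "0", series "bionic", two non-leader units.
    for kind, text in upgrade_plan("0", "bionic", ["mysql/1", "mysql/2"]):
        print(f"[{kind}] {text}")
```

Keeping the plan as data makes it easy to compare against the charm's documented procedure before touching a live unit.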

Alex Kavanagh (ajkavanagh) wrote on 2020-05-26:

Hi, if you are following "Series upgrade" in https://jaas.ai/percona-cluster then it should work. Please do make sure you are using the latest version of the charms (290 at the time of writing this). If there is still a problem, please post the logs from the affected machines.

Launchpad Janitor (janitor) wrote on 2020-07-26:

[Expired for OpenStack hacluster charm because there has been no activity for 60 days.]

Changed in charm-hacluster:
status:	Incomplete → Expired
