hacluster charm series upgrade failed during upgrade 16.04 to 18.04 [xenial->bionic]

Bug #1879135 reported by vinaya
This bug affects 1 person
Affects                          Status   Importance  Assigned to  Milestone
OpenStack HA Cluster Charm       Expired  Undecided   Unassigned
OpenStack Percona Cluster Charm  Invalid  Undecided   Unassigned

Bug Description

During the series upgrade, the mysql hacluster charm looks for the crm114/crmsh packages, which may have been removed during the upgrade process, and the hook stays in a loop.
After installing the package manually, the upgrade hook completed, but the VIPs are still not configured automatically.
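
The missing-package condition described above can be checked before retrying the hook. The following is a minimal sketch, not the charm's own code: `wait_for_crm` is a hypothetical helper, and the retry count and delay are assumed defaults. It only tests whether a `crm` binary is on the PATH, with a bounded number of attempts.

```python
import shutil
import time


def wait_for_crm(retries=12, delay=5.0, which=shutil.which, sleep=time.sleep):
    """Return True once a `crm` binary is on PATH, False after `retries` attempts.

    Hypothetical helper for illustration; it is not the charm's
    wait_for_pcmk(), and the default retry count and delay are assumptions.
    """
    for attempt in range(1, retries + 1):
        if which("crm") is not None:
            return True
        # Do not sleep after the final attempt.
        if attempt < retries:
            sleep(delay)
    return False
```

If `wait_for_crm()` returns False on the upgraded unit, the crmsh package is probably missing and may need to be reinstalled before the hook is retried.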

Here is the error

unit-mysql-hacluster-0: 03:58:43 ERROR unit.mysql-hacluster/0.juju-log Pacemaker is down. Please manually start it.
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade Traceback (most recent call last):
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade File "/var/lib/juju/agents/unit-mysql-hacluster-0/charm/hooks/utils.py", line 897, in try_pcmk_wait
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade pcmk.wait_for_pcmk()
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade File "/var/lib/juju/agents/unit-mysql-hacluster-0/charm/hooks/pcmk.py", line 55, in wait_for_pcmk
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade "".format(retries, output))
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade pcmk.ServicesNotUp: Pacemaker or Corosync are still down after waiting for 12 retries. Last output: /bin/sh: 1: crm: not found
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade During handling of the above exception, another exception occurred:
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade Traceback (most recent call last):
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade File "/var/lib/juju/agents/unit-mysql-hacluster-0/charm/hooks/post-series-upgrade", line 658, in <module>
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade hooks.execute(sys.argv)
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade File "/var/lib/juju/agents/unit-mysql-hacluster-0/charm/charmhelpers/core/hookenv.py", line 934, in execute
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade self._hooks[hook_name]()
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade File "/var/lib/juju/agents/unit-mysql-hacluster-0/charm/hooks/post-series-upgrade", line 628, in series_upgrade_complete
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade config_changed()
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade File "/var/lib/juju/agents/unit-mysql-hacluster-0/charm/hooks/post-series-upgrade", line 196, in config_changed
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade try_pcmk_wait()
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade File "/var/lib/juju/agents/unit-mysql-hacluster-0/charm/hooks/utils.py", line 903, in try_pcmk_wait
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade raise pcmk.ServicesNotUp(msg)
unit-mysql-hacluster-0: 03:58:43 DEBUG unit.mysql-hacluster/0.post-series-upgrade pcmk.ServicesNotUp: Pacemaker is down. Please manually start it.
unit-mysql-hacluster-0: 03:58:43 ERROR juju.worker.uniter.operation hook "post-series-upgrade" failed: exit status 1
unit-mysql-hacluster-0: 04:01:25 INFO unit.mysql-hacluster/0.juju-log Running complete series upgrade hook
unit-mysql-hacluster-0: 04:01:28 INFO unit.mysql-hacluster/0.juju-log Making dir /usr/lib/ocf/resource.d/ceph root:root 555
unit-mysql-hacluster-0: 04:01:

mysql become

vinaya (agrahar) wrote :

More readable output can be found here:

https://pastebin.ubuntu.com/p/TGFVGdZXDx/

I followed the exact steps shared here:
https://jaas.ai/percona-cluster

Alex Kavanagh (ajkavanagh) wrote :

This seems to be affecting the hacluster charm and not percona-cluster. Marking as invalid.

Changed in charm-percona-cluster:
status: New → Invalid
summary: - mysql series upgrade 16.04 to 18.04 staying loop
+ hacluster charm series upgrade failed during upgrade 16.04 to 18.04
+ [xenial->bionic]
Alex Kavanagh (ajkavanagh) wrote :

Unfortunately, there's not enough detail to go on to understand what may have caused the problem. If possible, please could you obtain the logs (in /var/log/juju) for the affected percona cluster units - ideally a juju-crashdump, but otherwise the logs and the syslog.
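
If a juju-crashdump is not available, the requested logs can be bundled by hand. This is a minimal sketch under stated assumptions: `bundle_logs` is a hypothetical helper, /var/log/juju is the directory named above, and /var/log/syslog is assumed to be where the syslog lives on the unit.

```python
import tarfile
from pathlib import Path


def bundle_logs(paths, archive):
    """Add every existing path in `paths` to a gzip tarball.

    Returns the list of paths that were actually added. Hypothetical
    helper for illustration; missing paths are skipped silently.
    """
    added = []
    with tarfile.open(archive, "w:gz") as tar:
        for path in map(Path, paths):
            if path.exists():
                # Directories are added recursively by tarfile.
                tar.add(str(path), arcname=path.name)
                added.append(str(path))
    return added
```

On an affected unit this could be called as `bundle_logs(["/var/log/juju", "/var/log/syslog"], "unit-logs.tar.gz")`, run as a user that is allowed to read those files.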

Changed in charm-hacluster:
status: New → Incomplete
vinaya (agrahar) wrote :

Hi Alex,

If you try a percona (mysql) series upgrade from Ubuntu 16.04 to Ubuntu 18.04, you should be able to reproduce the issue.

As the percona upgrade was failing on the mysql non-leader units, I finally removed them and installed new ones, which solved my issue.

So I currently don't have logs for the failed units.

Alex Kavanagh (ajkavanagh) wrote :

Are you running distro charms (openstack-origin/source not set), or something else?

We run series upgrades weekly and have not seen this particular issue before, so it would be good to understand how it occurred. As there are no logs, it's difficult to see how to analyse this one, unfortunately.

tags: added: series-upgrade
vinaya (agrahar) wrote :

Hi Alex,

I will try to reproduce the issue and upload the logs.

I followed the exact steps shared here: https://jaas.ai/percona-cluster/288

I think the instructions give the command juju upgrade-series $MACHINE_NUMBER prepare $SERIES, and I feel that should trigger the source option.

What's your opinion?

There is no openstack-origin option for the percona-cluster charm, so after following the steps I set the source to distro.

Can you correct the steps below if anything is wrong?

1. Pause all non-leader units and the corresponding hacluster units.
2. juju upgrade-series $MACHINE_NUMBER prepare $SERIES
3. do-release-upgrade, plus any further administratively required steps for an upgrade.
4. Reboot.
5. Complete the series upgrade on the leader:

   juju upgrade-series $MACHINE_NUMBER complete

6. Administratively validate that the leader node database is up and running:
   - Connect to the database and check for expected data.
   - Review "SHOW GLOBAL STATUS;"
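
The two juju commands in the steps above can be built per machine so that the machine number and series are filled in consistently. This is only a sketch: `series_upgrade_commands` is a hypothetical helper, and it covers just the prepare and complete commands quoted in the steps, not the pause, do-release-upgrade, reboot or validation steps.

```python
def series_upgrade_commands(machine, series):
    """Return the juju commands that bracket the manual OS upgrade.

    Hypothetical helper for illustration. The command shapes are copied
    from the steps above; nothing is executed here.
    """
    machine = str(machine)
    return {
        "prepare": ["juju", "upgrade-series", machine, "prepare", series],
        "complete": ["juju", "upgrade-series", machine, "complete"],
    }


# Example with assumed values: machine 3, target series bionic.
for step, argv in series_upgrade_commands(3, "bionic").items():
    print(step + ":", " ".join(argv))
```

Each list can be passed to `subprocess.run` as-is once the manual steps in between have been done.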

Alex Kavanagh (ajkavanagh) wrote :

Hi, if you are following "Series upgrade" in https://jaas.ai/percona-cluster then it should work. Please do make sure you are using the latest version of the charms (290 at the time of writing this). If there is still a problem, please post the logs from the affected machines.

Launchpad Janitor (janitor) wrote :

[Expired for OpenStack hacluster charm because there has been no activity for 60 days.]

Changed in charm-hacluster:
status: Incomplete → Expired