Database service unavailable after base upgrade

Bug #2047979 reported by Peter Matulis
This bug affects 1 person
Affects: MySQL InnoDB Cluster Charm
Status: Triaged
Importance: Medium
Assigned to: Unassigned
Milestone: (not set)

Bug Description

The Charm Guide [0] does not currently cover the series (base) upgrade of machines hosting mysql-innodb-cluster units, so I followed the procedure documented for percona-cluster. It failed: all units show 'Cluster is inaccessible'. Note that the percona-cluster charm had an action to run once the series upgrade is complete; mysql-innodb-cluster supports no such action.

The environment is based on three physical MAAS nodes, as per the Charm Guide's Getting Started tutorial [1].

This is a fresh and working deployment of Focal-Yoga and I was attempting a series (base) upgrade to Jammy-Yoga.

Juju versioning:

* client: 3.1.7
* controller: 3.1.5
* model: 3.1.5

[0]: https://docs.openstack.org/charm-guide/latest/admin/upgrades/series-openstack.html
[1]: https://docs.openstack.org/charm-guide/latest/getting-started/index.html

Alex Kavanagh (ajkavanagh) wrote :

Hi Peter

Thanks for taking the time to file this bug report and the associated information. Unfortunately, the crashdump doesn't contain the /var/log/mysql/error.log files that would tell us what's going on with the mysql units. If you still have the environment up, please could you add those as well?

Thanks!

Changed in charm-mysql-innodb-cluster:
status: New → Incomplete
Peter Matulis (petermatulis) wrote :

@Alex, I added the last 2000 lines of the MySQL error log for the leader unit. The rest of the log contains the same thing, which isn't much: just deprecation warnings, which is strange.

Changed in charm-mysql-innodb-cluster:
status: Incomplete → New
Peter Matulis (petermatulis) wrote :

Tested this again on a model consisting of just three database units (one cluster).

I again went through the procedure as documented. This left all three units in a Blocked state, as before.

I then did:

juju run mysql-innodb-cluster/leader reboot-cluster-from-complete-outage

This brought the leader unit back to an Active state.

One of the remaining two units was then resumed:

juju run mysql-innodb-cluster/1 resume

This surprisingly brought the entire cluster online.
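
For anyone wanting to try the same sequence, here is a minimal sketch of the two actions above as a script. It only prints the commands unless DRY_RUN=0 is set. The final `juju status` check and the DRY_RUN and RESUME_UNIT variables are my own additions for illustration, not part of the reported procedure, and this sequence was only observed to work on an idle three-unit test cluster.

```shell
#!/bin/sh
# Sketch of the workaround from this comment. Not a supported procedure.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"
APP="mysql-innodb-cluster"            # application name used in this report
RESUME_UNIT="${RESUME_UNIT:-$APP/1}"  # unit resumed in this report; adjust as needed

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recover_cluster() {
    # Step 1: reboot the cluster from the leader unit.
    run juju run "$APP/leader" reboot-cluster-from-complete-outage || return 1
    # Step 2: resume one of the remaining units.
    run juju run "$RESUME_UNIT" resume || return 1
    # Step 3 (assumption, not in the report): look at unit states afterwards.
    run juju status "$APP"
}

recover_cluster
```

Run it once as-is to review the commands, then with DRY_RUN=0 against the affected model.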

Alex Kavanagh (ajkavanagh) wrote :

> This surprisingly brought the entire cluster online.

@Peter, that's both good and interesting news. If this is a viable workaround, we should probably just document it as part of the upgrade. I'm still on the fence about how to actually solve the bug in a robust, elegant way.

Alex Kavanagh (ajkavanagh) wrote :

Triaged as medium as there is potentially a workaround.

Changed in charm-mysql-innodb-cluster:
importance: Undecided → Medium
status: New → Triaged
Peter Matulis (petermatulis) wrote :

Trying this yet again, I didn't pause any units and I did not run the action. I let each upgraded unit settle before upgrading the next. The cluster ended up healthy. I don't know if the clinical nature of the test played a role in this (the database wasn't being used by any workload).
