Fail to upgrade mysql-router charm from version #6 to #8

Bug #1927981 reported by Thobias Trevisan
This bug affects 3 people
Affects: MySQL Router Charm
Status: Fix Released
Importance: High
Assigned to: Alex Kavanagh
Milestone: (none)

Bug Description

mysql-router charm fails to upgrade when there is a customized "cluster-name" defined on charm mysql-innodb-cluster.

OpenStack version: Victoria

Log error:
2021-05-04 18:09:57 INFO juju-log Invoking reactive handler: reactive/layer_openstack.py:46:default_upgrade_charm
2021-05-04 18:09:57 INFO juju-log Invoking reactive handler: reactive/layer_openstack.py:34:default_config_changed
2021-05-04 18:09:57 ERROR juju-log Hook error:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-aodh-mysql-router-1/.venv/lib/python3.8/site-packages/charms/reactive/__init__.py", line 74, in main
    bus.dispatch(restricted=restricted_mode)
  File "/var/lib/juju/agents/unit-aodh-mysql-router-1/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 390, in dispatch
    _invoke(other_handlers)
  File "/var/lib/juju/agents/unit-aodh-mysql-router-1/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 359, in _invoke
    handler.invoke()
  File "/var/lib/juju/agents/unit-aodh-mysql-router-1/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 181, in invoke
    self._action(*args)
  File "/var/lib/juju/agents/unit-aodh-mysql-router-1/charm/reactive/layer_openstack.py", line 42, in default_config_changed
    instance.config_changed()
  File "lib/charm/openstack/mysql_router.py", line 670, in config_changed
    self.update_config_parameters(_parameters)
  File "lib/charm/openstack/mysql_router.py", line 627, in update_config_parameters
    config[heading][param] = parameters[heading][param]
  File "/usr/lib/python3.8/configparser.py", line 960, in __getitem__
    raise KeyError(key)
KeyError: 'metadata_cache:jujuCluster'

2021-05-04 18:09:57 WARNING upgrade-charm Traceback (most recent call last):
2021-05-04 18:09:57 WARNING upgrade-charm File "/var/lib/juju/agents/unit-aodh-mysql-router-1/charm/hooks/upgrade-charm", line 22, in <module>
2021-05-04 18:09:57 WARNING upgrade-charm main()
2021-05-04 18:09:57 WARNING upgrade-charm File "/var/lib/juju/agents/unit-aodh-mysql-router-1/.venv/lib/python3.8/site-packages/charms/reactive/__init__.py", line 74, in main
2021-05-04 18:09:57 WARNING upgrade-charm bus.dispatch(restricted=restricted_mode)
2021-05-04 18:09:57 WARNING upgrade-charm File "/var/lib/juju/agents/unit-aodh-mysql-router-1/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 390, in dispatch
2021-05-04 18:09:57 WARNING upgrade-charm _invoke(other_handlers)
2021-05-04 18:09:57 WARNING upgrade-charm File "/var/lib/juju/agents/unit-aodh-mysql-router-1/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 359, in _invoke
2021-05-04 18:09:57 WARNING upgrade-charm handler.invoke()
2021-05-04 18:09:57 WARNING upgrade-charm File "/var/lib/juju/agents/unit-aodh-mysql-router-1/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 181, in invoke
2021-05-04 18:09:57 WARNING upgrade-charm self._action(*args)
2021-05-04 18:09:57 WARNING upgrade-charm File "/var/lib/juju/agents/unit-aodh-mysql-router-1/charm/reactive/layer_openstack.py", line 42, in default_config_changed
2021-05-04 18:09:57 WARNING upgrade-charm instance.config_changed()
2021-05-04 18:09:57 WARNING upgrade-charm File "lib/charm/openstack/mysql_router.py", line 670, in config_changed
2021-05-04 18:09:57 WARNING upgrade-charm self.update_config_parameters(_parameters)
2021-05-04 18:09:57 WARNING upgrade-charm File "lib/charm/openstack/mysql_router.py", line 627, in update_config_parameters
2021-05-04 18:09:57 WARNING upgrade-charm config[heading][param] = parameters[heading][param]
2021-05-04 18:09:57 WARNING upgrade-charm File "/usr/lib/python3.8/configparser.py", line 960, in __getitem__
2021-05-04 18:09:57 WARNING upgrade-charm raise KeyError(key)
2021-05-04 18:09:57 WARNING upgrade-charm KeyError: 'metadata_cache:jujuCluster'
2021-05-04 18:09:57 ERROR juju.worker.uniter.operation runhook.go:139 hook "upgrade-charm" (via explicit, bespoke hook script) failed: exit status 1

It seems that the cluster name does not match:
https://opendev.org/openstack/charm-mysql-router/src/branch/master/src/lib/charm/openstack/mysql_router.py#L624
https://opendev.org/openstack/charm-mysql-router/src/branch/master/src/lib/charm/openstack/mysql_router.py#L652
so configparser raises the KeyError when the charm tries to update the config:
https://opendev.org/openstack/charm-mysql-router/src/branch/master/src/lib/charm/openstack/mysql_router.py#L670

tags: added: charm-upgrade
Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

Yup, the issue is in this code:

        config = configparser.ConfigParser()
        config.read(self.mysqlrouter_conf)
        for heading in parameters.keys():
            for param in parameters[heading].keys():
                config[heading][param] = parameters[heading][param]

During charm-upgrade, the charm may have introduced new headings which didn't exist in the old version of the charm. Thus, the above code breaks: config[heading] raises a KeyError when the config file on disk does not yet contain that heading.

Going to work on a fix.
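
For illustration, here is a minimal sketch of one way to make the loop above tolerate headings that are missing from the file on disk. It is not the merged patch; the function signature (taking the parser as an argument) and the sample file contents are assumptions made so the example is self-contained.

```python
import configparser


def update_config_parameters(config, parameters):
    """Copy parameters into config, creating any section that is missing."""
    for heading, params in parameters.items():
        # DEFAULT always exists and cannot be added, so skip the check for it.
        if heading != config.default_section and not config.has_section(heading):
            config.add_section(heading)
        for param, value in params.items():
            config[heading][param] = str(value)
    return config


# Stand-in for a mysqlrouter.conf written by the old charm (illustrative contents).
config = configparser.ConfigParser()
config.read_string("[metadata_cache:dth]\nttl = 5\n")

# A heading that the old file does not contain no longer raises KeyError.
update_config_parameters(config, {"metadata_cache:jujuCluster": {"ttl": "0.5"}})
```

Note that with a customized cluster name this approach leaves two metadata_cache sections in the file, which is the follow-up problem reported later in this bug.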

Changed in charm-mysql-router:
importance: Undecided → High
status: New → Triaged
assignee: nobody → Alex Kavanagh (ajkavanagh)
tags: added: needs-backport
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-mysql-router (master)
Changed in charm-mysql-router:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-mysql-router (master)

Reviewed: https://review.opendev.org/c/openstack/charm-mysql-router/+/798372
Committed: https://opendev.org/openstack/charm-mysql-router/commit/114fb0805139224d2f281b959a187c80c6ede782
Submitter: "Zuul (22348)"
Branch: master

commit 114fb0805139224d2f281b959a187c80c6ede782
Author: Alex Kavanagh <email address hidden>
Date: Mon Jun 28 20:01:40 2021 +0100

    Ensure update config function can handle new headers

    The linked bug occurred due to a new header for the config file being in
    the upgraded charm, but not in the previous version of the charm. As
    the config file is read, updated, and then written, new headings in the
    INI file would crash the charm code with a KeyError.

    This patch just sets the header and parameter if it is missing.

    Closes-Bug: #1927981
    Change-Id: I18a6a4143ee0a1144eade5caa50611b802cba28a

Changed in charm-mysql-router:
status: In Progress → Fix Committed
Changed in charm-mysql-router:
milestone: none → 21.10
Changed in charm-mysql-router:
status: Fix Committed → Fix Released
Revision history for this message
Felipe Alencastro (falencastro) wrote :

The issue was only partially fixed. The original symptom is gone, but the charm is now writing two different metadata_cache sections to the config file, which prevents the service from starting (dth is our cluster name).

root@juju-2752e1-5-lxd-18:/var/log# cat syslog|grep -B2 -A4 1834373
Feb 16 14:19:44 juju-2752e1-5-lxd-18 systemd[1]: Starting MySQL Router...
Feb 16 14:19:44 juju-2752e1-5-lxd-18 systemd[1]: Started MySQL Router.
Feb 16 14:19:44 juju-2752e1-5-lxd-18 start.sh[1834373]: PID 1834373 written to '/run/mysql/mysqlrouter-magnum-mysql-router.pid'
Feb 16 14:19:44 juju-2752e1-5-lxd-18 start.sh[1834373]: Error: MySQL Router currently supports only one metadata_cache instance. There is more than one metadata_cache section in the router configuration. Exiting.
Feb 16 14:19:44 juju-2752e1-5-lxd-18 systemd[1]: magnum-mysql-router.service: Main process exited, code=exited, status=1/FAILURE
Feb 16 14:19:44 juju-2752e1-5-lxd-18 systemd[1]: magnum-mysql-router.service: Failed with result 'exit-code'.
Feb 16 14:19:44 juju-2752e1-5-lxd-18 systemd[1]: magnum-mysql-router.service: Scheduled restart job, restart counter is at 5.
Feb 16 14:19:44 juju-2752e1-5-lxd-18 systemd[1]: Stopped MySQL Router.

root@juju-2752e1-5-lxd-18:/var/log# cat /var/lib/mysql/magnum-mysql-router/mysqlrouter.conf|grep metadata_cache
[metadata_cache:dth]
[metadata_cache:jujuCluster]

Changed in charm-mysql-router:
status: Fix Released → New
Revision history for this message
Felipe Alencastro (falencastro) wrote :

Here's some additional detail after we attempted another mysql-router charm upgrade:

/var/lib/mysql/magnum-mysql-router/mysqlrouter.conf
Before upgrade: https://paste.ubuntu.com/p/tzCjBkSPjp/
After upgrade: https://paste.ubuntu.com/p/NvBrF3Wcdm/

/var/log/juju/unit-magnum-mysql-router-2.log
After upgrade: https://paste.ubuntu.com/p/Yhw7tJdpTs/

/var/log/syslog
After upgrade: https://paste.ubuntu.com/p/Rmg3bB6tjQ/

The following config block is being added to mysqlrouter.conf and is causing this issue:
[metadata_cache:jujuCluster]
ttl = 0.5
auth_cache_ttl = -1
auth_cache_refresh_interval = 2

Revision history for this message
Billy Olsen (billy-olsen) wrote :

Taking a look at this, the problem is that the mysql-router charm is attempting to write the metadata_cache for a hardcoded cluster name of `jujuCluster`, which is the default cluster name for the mysql-innodb-cluster charm.

The fix could be either to 1) add the cluster information to the mysql-router relation or 2) inspect the current file for the metadata_cache cluster information. Option 2 does not require updating the relation and multiple charms.
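
Option 2 could look roughly like the sketch below: read the existing file and pick the section whose name matches a pattern, instead of assuming `jujuCluster`. The helper name `find_sections`, the pattern, and the sample file contents are hypothetical and only illustrate the idea.

```python
import configparser
import re


def find_sections(config, pattern):
    """Return the names of existing sections that fully match the regex."""
    regex = re.compile(pattern)
    return [name for name in config.sections() if regex.fullmatch(name)]


# Stand-in for a bootstrapped mysqlrouter.conf with a custom cluster name.
config = configparser.ConfigParser()
config.read_string(
    "[metadata_cache:dth]\nttl = 5\n\n[routing:dth_rw]\nbind_port = 3306\n"
)

# Update whichever metadata_cache section the bootstrap actually created.
matches = find_sections(config, r"metadata_cache:.*")
for section in matches:
    config[section]["ttl"] = "0.5"
```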

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-mysql-router (master)
Changed in charm-mysql-router:
status: New → In Progress
Revision history for this message
Felipe Alencastro (falencastro) wrote :

Since the metadata_cache section must be unique, wouldn't it be easier to not name it? i.e.: Instead of [metadata_cache:jujuCluster] or [metadata_cache:customName] just [metadata_cache].

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-mysql-router (master)

Reviewed: https://review.opendev.org/c/openstack/charm-mysql-router/+/834359
Committed: https://opendev.org/openstack/charm-mysql-router/commit/5a2da1800272af2875064a295b8ba7044cf7bcea
Submitter: "Zuul (22348)"
Branch: master

commit 5a2da1800272af2875064a295b8ba7044cf7bcea
Author: Billy Olsen <email address hidden>
Date: Fri Mar 18 15:11:42 2022 -0700

    Configure mysqlrouter.conf sections based on wildcards

    Upon bootstrap, the mysqlrouter will create a mysqlrouter.conf file for
    the cluster it is connecting to. This creates sections such as
    metadata_cache:<cluster_name>, routing:<cluster_name>_rw,
    routing:<cluster_name>_ro, etc. The cluster name is not provided on the
    mysql-router interface so this information is not available in
    determining the correct section name. Since the mysql-router is designed
    to work with a single cluster, the need to update the interface which in
    turn requires the user to update a number of deployed charms in the
    environment, an approach is taken to allow regular expressions to be
    used when matching the section name.

    There is some risk to this in that it requires that future edits
    carefully consider the possible section names when future sections are
    added. However, this developer cost is traded off in order to ease the
    burden of operators.

    For the upgrade scenario, this patch also checks to see if the file
    rendered on disk contains multiple 'metadata_cache' sections, and if so
    rewrites the mysqlrouter.conf file with the hardcoded
    metadata_cache:jujuCluster section removed.

    Closes-Bug: #1927981
    Change-Id: Iad44744ad01c0b6429fbafb041e6fc11887dbfb9
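
The upgrade cleanup described in the last paragraph of this commit message can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: the function name and the sample file contents are hypothetical.

```python
import configparser

HARDCODED_SECTION = "metadata_cache:jujuCluster"


def drop_hardcoded_metadata_cache(config):
    """Remove the hardcoded section only if another metadata_cache exists.

    Returns True when the config was changed and needs to be rewritten.
    """
    sections = [s for s in config.sections() if s.startswith("metadata_cache")]
    if len(sections) > 1 and HARDCODED_SECTION in sections:
        config.remove_section(HARDCODED_SECTION)
        return True
    return False


# Stand-in for a file left behind by the earlier, partial fix.
config = configparser.ConfigParser()
config.read_string(
    "[metadata_cache:dth]\nttl = 5\n\n[metadata_cache:jujuCluster]\nttl = 0.5\n"
)
changed = drop_hardcoded_metadata_cache(config)
```

When `jujuCluster` is the only metadata_cache section (the default cluster name), the function leaves the file untouched.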

Changed in charm-mysql-router:
status: In Progress → Fix Committed
Felipe Reyes (freyes)
Changed in charm-mysql-router:
milestone: 21.10 → none
Revision history for this message
Trent Lloyd (lathiat) wrote :

Just wanted to note that I've hit this same issue on a fresh stsstack-bundles test deployment today before the deployment was ever up and running. Both metadata_cache sections are present in the vault-mysql-router config.

I have [metadata_cache:bootstrap] and [metadata_cache:jujuCluster]

It's not entirely clear to me whether the patch will handle this case, since otherwise this seems to happen only on upgrade, so I wanted to mention this occurrence.

Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

> Just wanted to note that I've hit this same issue on a fresh stsstack-bundles test deployment today before the deployment was ever up and running. Both metadata_cache sections are present in the vault-mysql-router config.

Hi Trent; the patch/commit isn't in the 8.0.19/edge (or candidate) on charmhub but is on latest/edge (not latest/stable). It'll need a backport to the 8.0.19 channel.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-mysql-router (stable/jammy)

Fix proposed to branch: stable/jammy
Review: https://review.opendev.org/c/openstack/charm-mysql-router/+/840475

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-mysql-router (stable/jammy)

Reviewed: https://review.opendev.org/c/openstack/charm-mysql-router/+/840475
Committed: https://opendev.org/openstack/charm-mysql-router/commit/3bbcaa0b37b48503e61f68e5b3d247d624bc7e3d
Submitter: "Zuul (22348)"
Branch: stable/jammy

commit 3bbcaa0b37b48503e61f68e5b3d247d624bc7e3d
Author: Billy Olsen <email address hidden>
Date: Fri Mar 18 15:11:42 2022 -0700

    Configure mysqlrouter.conf sections based on wildcards

    Upon bootstrap, the mysqlrouter will create a mysqlrouter.conf file for
    the cluster it is connecting to. This creates sections such as
    metadata_cache:<cluster_name>, routing:<cluster_name>_rw,
    routing:<cluster_name>_ro, etc. The cluster name is not provided on the
    mysql-router interface so this information is not available in
    determining the correct section name. Since the mysql-router is designed
    to work with a single cluster, the need to update the interface which in
    turn requires the user to update a number of deployed charms in the
    environment, an approach is taken to allow regular expressions to be
    used when matching the section name.

    There is some risk to this in that it requires that future edits
    carefully consider the possible section names when future sections are
    added. However, this developer cost is traded off in order to ease the
    burden of operators.

    For the upgrade scenario, this patch also checks to see if the file
    rendered on disk contains multiple 'metadata_cache' sections, and if so
    rewrites the mysqlrouter.conf file with the hardcoded
    metadata_cache:jujuCluster section removed.

    Closes-Bug: #1927981
    Change-Id: Iad44744ad01c0b6429fbafb041e6fc11887dbfb9
    (cherry picked from commit 5a2da1800272af2875064a295b8ba7044cf7bcea)

tags: added: in-stable-jammy
Revision history for this message
Rodrigo Barbieri (rodrigo-barbieri2010) wrote :

This bug is still not fixed. I just tested in my freshly deployed env (which already includes the latest mysql-router package with that other fix) by upgrading from cs:mysql-router-6 to 8.0.19/stable, which includes the previously committed fix, and the same error occurs:

2022-05-11 17:26:16 ERROR unit.glance-mysql-router/0.juju-log server.go:327 Failed to connect to database due to '(2003, "Can't connect to MySQL server on '127.0.0.1:3306' (111)")'

the workaround is to restart the mysql-router service manually

Apparently the charm needs to determine that the PID path changed and make the appropriate adjustments, from the journal:

on successful deployment:

May 11 16:58:27 juju-6c2392-fv-1 start.sh[36457]: PID 36457 written to '/var/lib/mysql/glance-mysql-router/mysqlrouter.pid'

after upgrade:

May 11 17:18:24 juju-6c2392-fv-1 start.sh[41362]: Error: Failed writing PID to '/run/mysql/mysqlrouter-glance-mysql-router.pid': No such file or directory

after restart:

May 11 18:25:00 juju-6c2392-fv-1 start.sh[52145]: PID 52145 written to '/run/mysql/mysqlrouter-glance-mysql-router.pid'

Changed in charm-mysql-router:
status: Fix Committed → Confirmed
tags: added: sts
Revision history for this message
Felipe Reyes (freyes) wrote : Re: [Bug 1927981] Re: Fail to upgrade mysql-router charm from version #6 to #8

TL;DR: file a new bug on the specific issue you are seeing so we can analyze
it. The patches that have already landed did fix genuine bugs; it's just that
we are finding new/more bugs.

On Wed, 2022-05-11 at 18:28 +0000, Rodrigo Barbieri wrote:
> This bug is still not fixed, I just tested in my just deployed env
> (which already includes the latest mysql-router package with that other
> fix) upgrading from cs:mysql-router-6 to 8.0.19/stable, which includes
> the previously committed fix, and the same error occurs:
>
> 2022-05-11 17:26:16 ERROR unit.glance-mysql-router/0.juju-log
> server.go:327 Failed to connect to database due to '(2003, "Can't
> connect to MySQL server on '127.0.0.1:3306' (111)")'

Liam was analyzing a similar (the same?) issue today; we discussed that
error 2003 is effectively a case where the charm should restart mysqlrouter[0].

>
>
> the workaround is to restart the mysql-router service manually
>
> Apparently the charm needs to determine that the PID path changed and
> make the appropriate adjustments, from the journal:
>
>
> on successful deployment:
>
> May 11 16:58:27 juju-6c2392-fv-1 start.sh[36457]: PID 36457 written to
> '/var/lib/mysql/glance-mysql-router/mysqlrouter.pid'
>
> after upgrade:
>
> May 11 17:18:24 juju-6c2392-fv-1 start.sh[41362]: Error: Failed writing
> PID to '/run/mysql/mysqlrouter-glance-mysql-router.pid': No such file or
> directory
>
> after restart:
>
> May 11 18:25:00 juju-6c2392-fv-1 start.sh[52145]: PID 52145 written to
> '/run/mysql/mysqlrouter-glance-mysql-router.pid'

This is definitely something new; please file a bug for this pid-related
issue. In any case, the service definition doesn't declare the PIDFile=
stanza[1], so a change in the path of the pidfile written by mysqlrouter
shouldn't affect systemd.

mysqlrouter defines the pidfile in the start.sh/stop.sh[2] files, generated by
this code[3], and also in the mysqlrouter.conf,

while the charm sets the pid_file key to make the process write the pidfile
under /run[4].

>

[0]
https://opendev.org/openstack/charm-mysql-router/src/branch/master/src/lib/charm/openstack/mysql_router.py#L790
[1] https://paste.ubuntu.com/p/Rz4w6QPVcy/
[2] https://pastebin.ubuntu.com/p/JB2832Xpsk/
[3]
https://github.com/mysql/mysql-server/blob/8.0/router/src/router/src/config_generator.cc#L3264
[4]
https://opendev.org/openstack/charm-mysql-router/src/branch/master/src/lib/charm/openstack/mysql_router.py#L752

Felipe Reyes (freyes)
Changed in charm-mysql-router:
status: Confirmed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to charm-mysql-router (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/charm-mysql-router/+/848673

Revision history for this message
Rodrigo Barbieri (rodrigo-barbieri2010) wrote :

This issue is still partially present and work on it continues on https://bugs.launchpad.net/charm-mysql-router/+bug/1980693

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to charm-mysql-router (master)

Reviewed: https://review.opendev.org/c/openstack/charm-mysql-router/+/848673
Committed: https://opendev.org/openstack/charm-mysql-router/commit/6df798a939c9f5e167a73f6e8c145c6ec631ac6c
Submitter: "Zuul (22348)"
Branch: master

commit 6df798a939c9f5e167a73f6e8c145c6ec631ac6c
Author: Rodrigo Barbieri <email address hidden>
Date: Mon Jul 4 17:22:02 2022 -0300

    Fix restarts during upgrade-charm hook

    Upgrade-charm code needs to make various adjustments to
    the config files and restart the mysql-router service,
    but the code is currently running config-changed and
    getting stuck in a restart loop before upgrade-charm
    is able to run to make the config adjustments and
    invoke a successful restart.

    This change adds a condition to skip the execution of
    config_changed function when upgrade-charm hook runs,
    so upgrade_charm code takes care of the adjustments.

    Also, now upgrade-charm hook is guaranteed to restart the
    mysql-router service whenever any change to the config
    file is made as part of its execution, whereas before
    it wasn't doing so when making the adjustments of the
    custom upgrade_charm function, resulting in the possibility
    of no restarts after a charm upgrade.

    This change includes the now-abandoned change:
    https://review.opendev.org/848748

    Co-Authored-By: Felipe Reyes <email address hidden>

    Closes-bug: #1980693
    Closes-bug: #1979263
    Related-bug: #1927981
    Change-Id: If9d71bdb839c9c0ee3f4b33e4d44a5c93bdd13de
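
The condition described in the second paragraph of this commit message might be sketched like this. It assumes the running hook's name is read from the `JUJU_HOOK_NAME` environment variable that Juju sets in hook context; the function name is hypothetical and the actual change may detect the hook differently.

```python
import os


def should_run_config_changed(environ=None):
    """Return False while the upgrade-charm hook is running.

    Assumption: the hook name is taken from JUJU_HOOK_NAME.
    """
    environ = os.environ if environ is None else environ
    return environ.get("JUJU_HOOK_NAME") != "upgrade-charm"


# During upgrade-charm, config_changed is skipped so that the upgrade
# code can adjust the config file and restart the service itself.
during_upgrade = should_run_config_changed({"JUJU_HOOK_NAME": "upgrade-charm"})
during_config = should_run_config_changed({"JUJU_HOOK_NAME": "config-changed"})
```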

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to charm-mysql-router (stable/jammy)

Related fix proposed to branch: stable/jammy
Review: https://review.opendev.org/c/openstack/charm-mysql-router/+/851761

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to charm-mysql-router (stable/jammy)

Reviewed: https://review.opendev.org/c/openstack/charm-mysql-router/+/851761
Committed: https://opendev.org/openstack/charm-mysql-router/commit/cfe6b2e3c0081fdc9c3aafec26a942e722de34a9
Submitter: "Zuul (22348)"
Branch: stable/jammy

commit cfe6b2e3c0081fdc9c3aafec26a942e722de34a9
Author: Rodrigo Barbieri <email address hidden>
Date: Mon Jul 4 17:22:02 2022 -0300

    Fix restarts during upgrade-charm hook

    Upgrade-charm code needs to make various adjustments to
    the config files and restart the mysql-router service,
    but the code is currently running config-changed and
    getting stuck in a restart loop before upgrade-charm
    is able to run to make the config adjustments and
    invoke a successful restart.

    This change adds a condition to skip the execution of
    config_changed function when upgrade-charm hook runs,
    so upgrade_charm code takes care of the adjustments.

    Also, now upgrade-charm hook is guaranteed to restart the
    mysql-router service whenever any change to the config
    file is made as part of its execution, whereas before
    it wasn't doing so when making the adjustments of the
    custom upgrade_charm function, resulting in the possibility
    of no restarts after a charm upgrade.

    This change includes the now-abandoned change:
    https://review.opendev.org/848748

    Co-Authored-By: Felipe Reyes <email address hidden>

    Closes-bug: #1980693
    Closes-bug: #1979263
    Related-bug: #1927981
    Change-Id: If9d71bdb839c9c0ee3f4b33e4d44a5c93bdd13de
    (cherry picked from commit 6df798a939c9f5e167a73f6e8c145c6ec631ac6c)

Revision history for this message
Paul Goins (vultaire) wrote :

Related to lathiat's comment - I had the same thing happen on a customer cloud. It seems that during the bootstrap of mysql-router, a [metadata_cache:bootstrap] section is created. The cluster-name of mysql-innodb-cluster was jujuCluster. We had to redeploy a vault unit, and that's when we saw this break.

When I test upgrading from cs:mysql-router-15 to ch:mysql-router --channel 8.0/stable locally, I see that the end result of the upgrade is that the [metadata_cache:jujuCluster] section is dropped and the [metadata_cache:bootstrap] section is retained.

I don't know enough about mysql-router to know whether the destinations clauses need to be updated though. I see they still refer to jujuCluster; is that normal and expected, even with the metadata_cache section being labeled "bootstrap"?
