"Failed to connect to MySQL" if mysql-innodb-cluster is momentarily unavailable.

Bug #1973177 reported by Liam Young
This bug affects 1 person
Affects             Status         Importance  Assigned to  Milestone
MySQL Router Charm  Fix Committed  High        Liam Young
OpenStack Bundles   New            Undecided   Unassigned

Bug Description

One cause of a charm going into a "Failed to connect to MySQL" state is a failed database connection when the db-router charm attempts to restart the db-router service. Currently the charm retries the connection only in response to one MySQL return code: 2013, "Lost connection to MySQL server during query" *1. However, if the connection cannot be established in the first place, the error returned is 2003, "Can't connect to MySQL server on...", and no retry takes place.

*1 https://dev.mysql.com/doc/mysql-errors/8.0/en/client-error-reference.html
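
The distinction between the two codes can be sketched as a small retry predicate. This is a minimal illustration, not the charm's actual code: `FakeMySQLError`, `RETRYABLE_CODES` and `is_retryable` are hypothetical names, and the exception class only stands in for whatever error type the database driver raises. The constant names follow the client error reference cited above.

```python
# Client error codes from the MySQL client error reference.
CR_CONN_HOST_ERROR = 2003  # "Can't connect to MySQL server on ..."
CR_SERVER_LOST = 2013      # "Lost connection to MySQL server during query"

# Codes that suggest the server is only momentarily unavailable.
RETRYABLE_CODES = frozenset({CR_CONN_HOST_ERROR, CR_SERVER_LOST})


class FakeMySQLError(Exception):
    """Hypothetical stand-in for a driver exception that carries a code."""

    def __init__(self, code, message):
        super().__init__(code, message)
        self.code = code


def is_retryable(exc):
    """Return True if the exception carries one of the retryable codes."""
    return getattr(exc, "code", None) in RETRYABLE_CODES
```

Checking only for 2013 corresponds to a set containing `CR_SERVER_LOST` alone, which is why a connection that never gets established (2003) is not retried.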

Liam Young (gnuoy)
Changed in charm-mysql-router:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Liam Young (gnuoy)
status: Triaged → Confirmed
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-mysql-router (master)
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-mysql-router (master)

Reviewed: https://review.opendev.org/c/openstack/charm-mysql-router/+/841708
Committed: https://opendev.org/openstack/charm-mysql-router/commit/5942b035f6ed136632b7a20711a0422ca561e0a6
Submitter: "Zuul (22348)"
Branch: master

commit 5942b035f6ed136632b7a20711a0422ca561e0a6
Author: Liam Young <email address hidden>
Date: Fri May 13 09:25:19 2022 +0000

    Restart router if connections fail with 2003 code

    At the moment, if a connection through the router fails after a
    configuration update, the router is only restarted if the connection
    error has a code of 2013, but often the error thrown is 2003 (see
    *1). This patch alters the charm's
    behaviour to also restart the router on a 2003 error.

    While testing this patch it became apparent that a connection
    attempt through the router immediately after the router has been
    restarted very often fails. So, the connection attempt has been
    moved into its own method with its own tenacity retry logic.
    A side effect of this is that the total possible wait time
    has increased from 5 * 10 (outer tenacity loop) to 5 * 10 * 5
    (outer tenacity loop and inner tenacity loop).

    *1 https://dev.mysql.com/doc/mysql-errors/8.0/en/client-error-reference.html

    Closes-Bug: #1973177
    Change-Id: I9c2846bf4f21d2dcb1958bee4c9fa72dd4464b6c
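
The effect of nesting one retry loop inside another can be sketched with the standard library alone. This is an illustration under stated assumptions, not the charm's implementation: the `retry` decorator below is a hypothetical stand-in for tenacity, the function names are invented, and reading the commit message's figures as 5 attempts with a 10-second wait at each level is an assumption.

```python
import functools
import time


def retry(attempts, wait_seconds, sleep=time.sleep):
    """Hypothetical fixed-wait, fixed-attempt retry decorator."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == attempts:
                        raise  # out of attempts: re-raise the last error
                    sleep(wait_seconds)
        return wrapper
    return decorator


slept = []                # seconds the fake sleep was asked to wait
restarts = []             # one entry per simulated router restart
connection_attempts = []  # one entry per simulated connection attempt


def fake_sleep(seconds):
    """Record the wait instead of sleeping so the sketch runs instantly."""
    slept.append(seconds)


@retry(attempts=5, wait_seconds=10, sleep=fake_sleep)  # inner loop (assumed values)
def connect_through_router():
    connection_attempts.append(1)
    raise ConnectionError("router not ready")  # simulate a persistent failure


@retry(attempts=5, wait_seconds=10, sleep=fake_sleep)  # outer loop (assumed values)
def restart_and_check():
    restarts.append(1)  # stands in for restarting the router service
    connect_through_router()


try:
    restart_and_check()
except ConnectionError:
    pass  # every attempt failed, as simulated
```

With these assumed settings a connection that never succeeds is attempted 25 times (5 outer attempts, each running 5 inner attempts) before the error is re-raised, which illustrates why adding the inner loop multiplies the worst-case wait.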

Changed in charm-mysql-router:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-mysql-router (stable/jammy)

Fix proposed to branch: stable/jammy
Review: https://review.opendev.org/c/openstack/charm-mysql-router/+/843313

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-mysql-router (stable/jammy)

Reviewed: https://review.opendev.org/c/openstack/charm-mysql-router/+/843313
Committed: https://opendev.org/openstack/charm-mysql-router/commit/7488ce17a5aa6e4f542234fac588c853ee74fc8c
Submitter: "Zuul (22348)"
Branch: stable/jammy

commit 7488ce17a5aa6e4f542234fac588c853ee74fc8c
Author: Liam Young <email address hidden>
Date: Fri May 13 09:25:19 2022 +0000

    Restart router if connections fail with 2003 code

    At the moment, if a connection through the router fails after a
    configuration update, the router is only restarted if the connection
    error has a code of 2013, but often the error thrown is 2003 (see
    *1). This patch alters the charm's
    behaviour to also restart the router on a 2003 error.

    While testing this patch it became apparent that a connection
    attempt through the router immediately after the router has been
    restarted very often fails. So, the connection attempt has been
    moved into its own method with its own tenacity retry logic.
    A side effect of this is that the total possible wait time
    has increased from 5 * 10 (outer tenacity loop) to 5 * 10 * 5
    (outer tenacity loop and inner tenacity loop).

    *1 https://dev.mysql.com/doc/mysql-errors/8.0/en/client-error-reference.html

    Closes-Bug: #1973177
    Change-Id: I9c2846bf4f21d2dcb1958bee4c9fa72dd4464b6c
    (cherry picked from commit 5942b035f6ed136632b7a20711a0422ca561e0a6)

tags: added: in-stable-jammy
Changed in charm-mysql-router:
status: Fix Committed → Confirmed
status: Confirmed → Fix Committed
Chi Wai CHAN (raychan96) wrote :

This bug also affects deployment of the OpenStack base bundle [1], so the fixed release of mysql-router should be included in that bundle as soon as possible.

[1] https://github.com/openstack-charmers/openstack-bundles/blob/master/stable/openstack-base/bundle.yaml
