kolla-ansible

Issues with kolla-ansible mariadb_recovery

Bug #1834467 reported by Mark Goddard on 2019-06-27

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
kolla-ansible	Fix Released	High	Mark Goddard	kolla-ansible 9.0.0 "Train"
Rocky	Fix Released	High	Mark Goddard	kolla-ansible 7.1.2 "rocky"
Stein	Fix Released	High	Unassigned	kolla-ansible 8.0.0 "Stein"
Train	Fix Released	High	Mark Goddard	kolla-ansible 9.0.0 "Train"

Bug Description

There are currently various issues with the kolla-ansible mariadb_recovery command.

* wsrep sequence number detection is broken. The log message format is
'WSREP: Recovered position: <UUID>:<seqno>', but we were picking out
the UUID rather than the sequence number, so the result is as good as random.

* Need to add become: true to log file removal since
I4a5ebcedaccb9261dbc958ec67e8077d7980e496 added become: true to the
'docker cp' command which creates it.

* Shouldn't run handlers during recovery. If the config files change we
would end up restarting the cluster twice.

* Need to wait for wsrep recovery container completion (don't detach). This
avoids a potential race between wsrep recovery and the subsequent
'stop_container'.
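
To illustrate the first point, here is a minimal Python sketch of pulling the sequence number (not the UUID) out of the log line quoted above. It is not the kolla-ansible implementation: the function names, the regular expression and the "highest seqno wins" selection helper are assumptions made for illustration only.

```python
import re

# Matches the line format quoted in this report:
#   WSREP: Recovered position: <UUID>:<seqno>
# The UUID character class and the optional minus sign are assumptions.
RECOVERED_POSITION = re.compile(
    r"WSREP: Recovered position:\s*(?P<uuid>[0-9a-fA-F-]+):(?P<seqno>-?\d+)"
)


def parse_seqno(log_text):
    """Return the seqno from the last 'Recovered position' line, or None."""
    matches = RECOVERED_POSITION.findall(log_text)
    if not matches:
        return None
    _uuid, seqno = matches[-1]  # keep the seqno, discard the UUID
    return int(seqno)


def pick_bootstrap_host(logs_by_host):
    """Hypothetical helper: return the host with the highest recovered seqno."""
    seqnos = {host: parse_seqno(text) for host, text in logs_by_host.items()}
    known = {host: s for host, s in seqnos.items() if s is not None}
    if not known:
        raise ValueError("no host reported a recovered position")
    return max(known, key=known.get)
```

Comparing integers rather than UUID strings is what makes the choice of host meaningful.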

Mark Goddard (mgoddard) on 2019-06-27

Changed in kolla-ansible:
importance:	Undecided → High

OpenStack Infra (hudson-openstack) wrote on 2019-06-27: Fix proposed to kolla-ansible (master)

Fix proposed to branch: master
Review: https://review.opendev.org/667904

Changed in kolla-ansible:
assignee:	nobody → Mark Goddard (mgoddard)
status:	New → In Progress

OpenStack Infra (hudson-openstack) wrote on 2019-07-08: Fix merged to kolla-ansible (master)

Reviewed: https://review.opendev.org/667904
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=86f373a198620a7082db8e243644fd8d53802c73
Submitter: Zuul
Branch: master

commit 86f373a198620a7082db8e243644fd8d53802c73
Author: Mark Goddard <email address hidden>
Date: Thu Jun 27 12:17:17 2019 +0100

    Fixes for MariaDB bootstrap and recovery

    * Fix wsrep sequence number detection. Log message format is
      'WSREP: Recovered position: <UUID>:<seqno>' but we were picking out
      the UUID rather than the sequence number. This is as good as random.

    * Add become: true to log file reading and removal since
      I4a5ebcedaccb9261dbc958ec67e8077d7980e496 added become: true to the
      'docker cp' command which creates it.

    * Don't run handlers during recovery. If the config files change we
      would end up restarting the cluster twice.

    * Wait for wsrep recovery container completion (don't detach). This
      avoids a potential race between wsrep recovery and the subsequent
      'stop_container'.

    * Finally, we now wait for the bootstrap host to report that it is in
      an OPERATIONAL state. Without this we can see errors where the
      MariaDB cluster is not ready when used by other services.

    Change-Id: Iaf7862be1affab390f811fc485fd0eb6879fd583
    Closes-Bug: #1834467
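
The last bullet in the commit message above (waiting for the bootstrap host to report an OPERATIONAL state) amounts to a bounded polling loop. The Python sketch below shows that idea only; it is not the task from the commit. `get_state` is a hypothetical callable supplied by the caller, and the retry count and delay are illustrative defaults. How the state is actually obtained (for example by querying a wsrep status variable on the bootstrap host) is an assumption and is not stated in this report.

```python
import time


def wait_for_operational(get_state, retries=10, delay=6.0, sleep=time.sleep):
    """Poll get_state() until it returns 'OPERATIONAL'.

    Returns True once the state is OPERATIONAL, or False if it is still
    not OPERATIONAL after `retries` attempts. `get_state` is a
    caller-supplied (hypothetical) function returning the state string.
    """
    for attempt in range(retries):
        if get_state() == "OPERATIONAL":
            return True
        if attempt < retries - 1:
            sleep(delay)  # back off before the next check
    return False
```

Bounding the number of attempts means a cluster that never becomes ready produces a clear failure rather than a hang.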

Changed in kolla-ansible:
status:	In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2019-07-08: Fix proposed to kolla-ansible (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/669701

OpenStack Infra (hudson-openstack) wrote on 2019-07-08: Fix proposed to kolla-ansible (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/669703

OpenStack Infra (hudson-openstack) wrote on 2019-07-16: Fix merged to kolla-ansible (stable/rocky)

Reviewed: https://review.opendev.org/669701
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=25bf57fb56ab9cf414d7c020b40971e39c72ac20
Submitter: Zuul
Branch: stable/rocky

commit 25bf57fb56ab9cf414d7c020b40971e39c72ac20
Author: Mark Goddard <email address hidden>
Date: Thu Jun 27 12:17:17 2019 +0100

    Fixes for MariaDB bootstrap and recovery

    * Add become: true to log file reading and removal since
      I4a5ebcedaccb9261dbc958ec67e8077d7980e496 added become: true to the
      'docker cp' command which creates it.

    * Don't run handlers during recovery. If the config files change we
      would end up restarting the cluster twice.

    * Wait for wsrep recovery container completion (don't detach). This
      avoids a potential race between wsrep recovery and the subsequent
      'stop_container'.

    Change-Id: Iaf7862be1affab390f811fc485fd0eb6879fd583
    Closes-Bug: #1834467
    (cherry picked from commit 86f373a198620a7082db8e243644fd8d53802c73)

OpenStack Infra (hudson-openstack) wrote on 2019-08-07: Fix merged to kolla-ansible (stable/queens)

Reviewed: https://review.opendev.org/669703
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=9ec00c24eb09bc7bc6ab533022f56837b7505253
Submitter: Zuul
Branch: stable/queens

commit 9ec00c24eb09bc7bc6ab533022f56837b7505253
Author: Mark Goddard <email address hidden>
Date: Thu Jun 27 12:17:17 2019 +0100

    Fixes for MariaDB bootstrap and recovery

    * Add become: true to log file reading and removal since
      I4a5ebcedaccb9261dbc958ec67e8077d7980e496 added become: true to the
      'docker cp' command which creates it.

    * Don't run handlers during recovery. If the config files change we
      would end up restarting the cluster twice.

    * Wait for wsrep recovery container completion (don't detach). This
      avoids a potential race between wsrep recovery and the subsequent
      'stop_container'.

    Change-Id: Iaf7862be1affab390f811fc485fd0eb6879fd583
    Closes-Bug: #1834467
    (cherry picked from commit 86f373a198620a7082db8e243644fd8d53802c73)

tags:

added: in-stable-queens

OpenStack Infra (hudson-openstack) wrote on 2019-09-09: Fix included in openstack/kolla-ansible 6.2.2

This issue was fixed in the openstack/kolla-ansible 6.2.2 release.

OpenStack Infra (hudson-openstack) wrote on 2019-09-09: Fix included in openstack/kolla-ansible 7.1.2

This issue was fixed in the openstack/kolla-ansible 7.1.2 release.

OpenStack Infra (hudson-openstack) wrote on 2019-11-11: Fix included in openstack/kolla-ansible 9.0.0.0rc1

This issue was fixed in the openstack/kolla-ansible 9.0.0.0rc1 release candidate.
