Keystone becomes "Database not initialised" after scale up and scale down

Bug #1942289 reported by Eric Chen
This bug affects 9 people
Affects                   Status       Importance  Assigned to       Milestone
Canonical Juju            Invalid      Undecided   Unassigned
OpenStack Keystone Charm  In Progress  Medium      Phan Trung Thanh

Bug Description

Description
==========

Keystone HA is enabled.
After I scale up by one keystone unit and remove the keystone unit holding the VIP,
the cluster becomes "Database not initialised".

Versions
===============================
Cloud focal-wallaby

Deploy openstack from https://github.com/openstack-charmers/openstack-bundles
(git commit: 3f74a67704ed7654e0f574ccfd724dc3a224c9f8)

Modify the charm version and turn on OVN in bundle.yaml:

diff --git a/stable/openstack-base/bundle.yaml b/stable/openstack-base/bundle.yaml
index 39297cf..c52ecb2 100644
--- a/stable/openstack-base/bundle.yaml
+++ b/stable/openstack-base/bundle.yaml
@@ -219,7 +219,7 @@ applications:
     annotations:
       gui-x: '300'
       gui-y: '1270'
-    charm: cs:keystone-323
+    charm: cs:keystone-326
     num_units: 1
     options:
       worker-multiplier: *worker-multiplier
@@ -359,7 +359,7 @@ applications:
     # top of this file.
     options:
       ovn-bridge-mappings: physnet1:br-ex
-      bridge-interface-mappings: *data-port
+      #bridge-interface-mappings: *data-port
   vault-mysql-router:
     annotations:
       gui-x: '1535'

Reproduction Steps
===================

1. Prepare the OpenStack environment as described above in serverstack.

2. Turn on Keystone HA

  juju add-unit -n 2 keystone
  juju config keystone vip=10.5.55.1
  juju deploy --config cluster_count=3 hacluster keystone-hacluster
  juju add-relation keystone-hacluster:ha keystone:ha

Everything is good except the certificate issue in LP: #1930763

3. Scale up 1 keystone unit

  juju add-unit -n 1 keystone

4. Remove the keystone unit with VIP

  juju remove-unit keystone/28

After that, the cluster status becomes "blocked":

```
Unit Workload Agent Machine Public address Ports Message
keystone/29 blocked idle 122 10.5.2.26 5000/tcp Database not initialised
  keystone-hacluster/18 active idle 10.5.2.26 Unit is ready and clustered
  keystone-mysql-router/50 active idle 10.5.2.26 Unit is ready
keystone/30* blocked idle 123 10.5.3.123 5000/tcp Database not initialised
  keystone-hacluster/17 active idle 10.5.3.123 Unit is ready and clustered
  keystone-mysql-router/49* active idle 10.5.3.123 Unit is ready
keystone/31 blocked idle 124 10.5.2.152 5000/tcp Database not initialised
  keystone-hacluster/19* active idle 10.5.2.152 Unit is ready and clustered
  keystone-mysql-router/51 active idle 10.5.2.152 Unit is ready
```

Detailed Information
=================

I found the related log entries on keystone/29:

2021-09-01 03:28:42 DEBUG unit.keystone/29.juju-log server.go:314 Checking for maintenance notifications
2021-09-01 03:28:42 DEBUG unit.keystone/29.juju-log server.go:314 This unit (keystone/29) is in allowed unit list from keystone-mysql-router/50
2021-09-01 03:28:42 DEBUG unit.keystone/29.juju-log server.go:314 db-initialised key missing, assuming db is not initialised
2021-09-01 03:28:42 DEBUG unit.keystone/29.juju-log server.go:314 Database initialised: False
2021-09-01 03:28:42 DEBUG unit.keystone/29.juju-log server.go:314 Database not initialised
2021-09-01 03:28:42 DEBUG unit.keystone/29.juju-log server.go:314 Database not initialised
2021-09-01 03:28:42 INFO unit.keystone/29.juju-log server.go:314 Keystone charm unit not ready - deferring identity-relation updates
2021-09-01 03:28:42 DEBUG unit.keystone/29.juju-log server.go:314 Checking for maintenance notifications
2021-09-01 03:28:42 DEBUG unit.keystone/29.juju-log server.go:314 This unit (keystone/29) is in allowed unit list from keystone-mysql-router/50
2021-09-01 03:28:42 DEBUG unit.keystone/29.juju-log server.go:314 db-initialised key missing, assuming db is not initialised
2021-09-01 03:28:42 DEBUG unit.keystone/29.juju-log server.go:314 Database initialised: False
2021-09-01 03:28:42 DEBUG unit.keystone/29.juju-log server.go:314 Database not initialised
2021-09-01 03:28:42 DEBUG unit.keystone/29.juju-log server.go:314 Telling peers this unit is: NOTREADY
2021-09-01 03:28:42 DEBUG unit.keystone/29.juju-log server.go:314 Telling peer behind relation cluster:311 that keystone/29 is NOTREADY

In the meantime, one of the mysql-innodb-cluster units went into an error state.
This error can be fixed with the following command:

   juju resolve --no-retry mysql-innodb-cluster/4

But keystone still stays in blocked status.

Unit Workload Agent Machine Public address Ports Message
mysql-innodb-cluster/3 active idle 69 10.5.3.213 Unit is ready: Mode: R/W, Cluster is ONLINE and can tolerate up to ONE failure.
mysql-innodb-cluster/4 error idle 70 10.5.1.215 hook failed: "update-status"
mysql-innodb-cluster/5* active idle 71 10.5.1.138 Unit is ready: Mode: R/O, Cluster is ONLINE and can tolerate up to ONE failure.

I will attach the juju crashdump later; it's too big to upload now.

Expected result
===============================
The keystone cluster action completes and the units are in the "active" state.

Revision history for this message
Eric Chen (eric-chen) wrote :

Attach the juju crashdump

Eric Chen (eric-chen)
description: updated
description: updated
Revision history for this message
Eric Chen (eric-chen) wrote :

Other people reminded me that I can add --small when running juju-crashdump.
Attaching the crashdump again.

I took this crashdump after the mysql-innodb-cluster recovered.

Revision history for this message
Aurelien Lourot (aurelien-lourot) wrote :

Thanks a lot for reporting, Eric! Indeed, I confirm that I have seen it myself. It is in fact visible, and worked around, in one of our test gates:

https://github.com/openstack-charmers/zaza-openstack-tests/blob/master/zaza/openstack/charm_tests/hacluster/tests.py#L154

So by running this test gate the issue can easily be reproduced.

Changed in charm-keystone:
status: New → Triaged
importance: Undecided → Medium
Revision history for this message
Paul Goins (vultaire) wrote :

I suspect this is caused by this particular hook handler:

@hooks.hook('shared-db-relation-departed',
            'shared-db-relation-broken')
def db_departed_or_broken():
    if is_leader():
        leader_set({'db-initialised': None})

Perhaps if you remove the current leader of the keystone app, this would be triggered?

Revision history for this message
Andrea Ieri (aieri) wrote :

And should perhaps juju move leadership away from the node to be removed before firing any hook?

Revision history for this message
Garrett Neugent (thogarre) wrote :

BootStack encountered this same error today, when removing and re-adding a keystone unit on a cloud.

The cloud is running focal/ussuri, and keystone is on cs:keystone-330. In our case, mysql did not go into an error state, and we were able to work around the issue by setting the db-initialised flag:

$ juju run -u keystone/leader 'leader-set db-initialised=True'

Note that we had verified keystone itself was running as expected, but relation data was not being shared (nor relations created) with sibling units via the identity-service interface.

Revision history for this message
Xav Paice (xavpaice) wrote :

Added Juju, to determine whether leadership should be moved away from a unit prior to its removal.

Revision history for this message
Xav Paice (xavpaice) wrote :

Given the significant impact on production clouds when this bug occurs, I would like to see us re-evaluate the importance setting.

Revision history for this message
Canonical Juju QA Bot (juju-qa-bot) wrote (last edit ):

Juju is designed to keep a unit the leader during the dying phase and will only revoke leadership once the unit transitions to dead, just prior to removal and after all hooks have been run.

The unit, if leader prior to removal, retains leadership so that it may do "leader" type things during the relation departed/broken hooks.

I'll mark as Invalid for juju since the leadership behaviour is by design.

Changed in juju:
status: New → Invalid
Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

> Juju is designed to keep a unit the leader during the dying phase and will only revoke leadership once the unit transitions to dead, just prior to removal and after all hooks have been run.

> The unit, if leader prior to removal, retains leadership so that it may do "leader" type things during the relation departed/broken hooks.

Just seeking clarification, but essentially, in a multi-unit set-up, if the leader is being removed, a leadership election doesn't take place until the unit is actually removed?

So to pose a question: how does a unit 'know', when it is dying, that it isn't the last unit and so shouldn't do things like clean-up? i.e. as the leader, how should it determine not to do clean-up? Is there an (option of / way to implement an) atomic counter of units that are still 'alive' in the application?

Revision history for this message
John A Meinel (jameinel) wrote :

If you need to know about counterpart units, that is what a peer relation is for. If you want to know if your unit is going away that is $JUJU_DEPARTING_UNIT, IIRC.
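
For illustration, here is a hedged sketch (not taken from any charm) of how a relation-departed hook could use that environment variable, assuming a Juju version that sets JUJU_DEPARTING_UNIT in relation-departed hooks:

```python
# Illustrative sketch only: inside a *-relation-departed hook, a unit can tell
# whether it is itself the unit going away by comparing the departing unit
# name (set by Juju in the hook environment) with its own unit name.
import os


def this_unit_is_departing():
    departing = os.environ.get('JUJU_DEPARTING_UNIT')  # e.g. 'keystone/28'
    local = os.environ.get('JUJU_UNIT_NAME')           # e.g. 'keystone/29'
    return departing is not None and departing == local
```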

Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

> If you need to know about counterpart units, that is what a peer relation is for. If you want to know if your unit is going away that is $JUJU_DEPARTING_UNIT, IIRC.

Has the behaviour in Juju changed at all since earlier versions, e.g. around departing units and leadership? It's just that keystone has been around for a long time, and it's curious that this bug has only now surfaced. Certainly it has a bug, but I'm wondering whether, previously, leadership was dropped on the departing unit unless it was the last one?

Revision history for this message
Alan Baghumian (alanbach) wrote :

I just encountered this exact same problem while shuffling my units around compute hosts.

I am using Focal/Xena with keystone charm revision 576 from the xena/stable channel.

- Added a new replacement unit on a new node, going up from 3 keystone units to 4

- Ran juju remove-unit keystone/1* (the leader)

- All keystone units started displaying the "Database not initialised" message.

- Executing the following command resolved the issue:

$ juju run -u keystone/leader 'leader-set db-initialised=True'

Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

Okay, so the reason this is happening is this code in keystone:

hooks/keystone_hooks.py, line 442:

@hooks.hook('shared-db-relation-departed',
            'shared-db-relation-broken')
def db_departed_or_broken():
    if is_leader():
        leader_set({'db-initialised': None})

In order to fix it, the code should verify (via the cluster/peer relation) whether it is the only unit, and if other units remain it should NOT set the flag 'db-initialised' to None. However, if it's the only unit left, it would be valid to clear it, as sketched below.
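
A minimal sketch of that guard, assuming charm-helpers' relation_ids()/related_units() are used to enumerate peers on keystone's 'cluster' relation (illustrative only, not the actual patch):

```python
from charmhelpers.core.hookenv import (
    is_leader,
    leader_set,
    related_units,
    relation_ids,
)


# 'hooks' refers to the charm's existing hook registry, as in the snippet above.
@hooks.hook('shared-db-relation-departed',
            'shared-db-relation-broken')
def db_departed_or_broken():
    if not is_leader():
        return
    # Enumerate the other keystone units via the 'cluster' peer relation.
    peers = [unit
             for rid in relation_ids('cluster')
             for unit in related_units(rid)]
    if peers:
        # Other keystone units remain, so this is just a unit being removed;
        # leave the cached 'db-initialised' flag alone.
        return
    # Last remaining unit (or the relation is genuinely going away): clearing
    # the cached flag is valid.
    leader_set({'db-initialised': None})
```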

tags: added: good-first-bug
Changed in charm-keystone:
assignee: nobody → Phan Trung Thanh (tphan025)
Changed in charm-keystone:
status: Triaged → In Progress
Revision history for this message
Phan Trung Thanh (tphan025) wrote (last edit ):

Hi everyone,

I thought this was an interesting issue to pick up.

I initially implemented @ajkavanagh's proposal, but someone pointed out to me that the 'shared-db-relation-departed' and 'shared-db-relation-broken' hooks are also run if I run "juju remove-relation keystone:shared-db mysql-router", so the problem became:

- The charm should NOT set the flag 'db-initialised' to None if I run "juju remove-unit" on the leader unit when it's not the only unit
- But the charm should also correctly set the flag to None if I run "juju remove-relation"
- All that has to be done in the hook callback without knowing the whole context (which command triggered the hook, is the unit getting torn down or is it a simple remove-relation)

So I decided to set a new leader value; the next time db-initialised is checked, the leader (not necessarily the unit that set this value to None) can use this extra information to revert if needed (roughly as sketched below). I have a patch for this here: https://review.opendev.org/c/openstack/charm-keystone/+/897259.
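
Roughly, the idea is something like the following sketch; the names and the exact revert condition are illustrative and not necessarily what the patch at the review above actually does:

```python
from charmhelpers.core.hookenv import (
    is_leader,
    leader_get,
    leader_set,
    local_unit,
    relation_ids,
)


def db_departed_or_broken():
    if is_leader():
        # Clear the flag as before, but also record that a departed/broken
        # hook cleared it (and on which unit), so a later leader can tell why
        # the flag is missing.
        leader_set({'db-initialised': None,
                    'db-initialised-cleared-by': local_unit()})


def is_db_initialised():
    if leader_get('db-initialised'):
        return True
    # If the flag was cleared by a departing leader but this unit still has a
    # shared-db relation, the database itself never went away, so the current
    # leader can restore the flag. After a real 'juju remove-relation' the
    # shared-db relation is gone and there is nothing to revert.
    if (is_leader() and leader_get('db-initialised-cleared-by')
            and relation_ids('shared-db')):
        leader_set({'db-initialised': True,
                    'db-initialised-cleared-by': None})
        return True
    return False
```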

Can someone have a look and let me know what they think?

Thanks!
Thanh

Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

Hi

Firstly, thank you to Thanh for working on the bug. Since your patch has been submitted, I've taken a moment to review the original code and the patch code, and there are a few more complications lurking in this problem space. I added the following comment (and a -2 hold) on the review:

> I'm putting a tentative do-no-merge on this whilst we discuss a few issues. The complicating factor is the switch from percona-cluster to mysql-innodb-cluster which introduced the MITM mysql-router charm. This has changed the semantics of 'breaking the shared-db relation' which means the original fix in #1797229 to say the DB is not initialised now doesn't quite hold. I'm going to take this discussion back to the bug to try to sort this out.

---

The current code in keystone looks like this for shared-db 'going away':

@hooks.hook('shared-db-relation-departed',
            'shared-db-relation-broken')
def db_departed_or_broken():
    if is_leader():
        leader_set({'db-initialised': None})

This code's purpose was to allow for removing the database cluster and then re-adding it so that keystone would re-create the database. It basically assumes that the only reason the relation is going away is that the database (cluster) is being destroyed, and the keystone charm is currently written such that whenever the charm is related to a database it will *always* be necessary to re-create the database.

Thus, the code was added (due to https://bugs.launchpad.net/charm-keystone/+bug/1797229) to essentially put the charm back into 'database not initialised' if the relation went away. It didn't take into account scale up/down where the leader is the unit being scaled down; this broke the charm, as only a unit is being removed, not the database.

---

So let's set up a scenario: 3 keystone units k/0*, k/1, k/2, each with its shared-db relation to a cluster via mysql-router. The state is that the DB is initialised and everything is running.

There are two operational scenarios that need to be resolved:

a) Remove the old database and get it to re-initialise (bug #1797229)
b) Scale-up and then scale down (remove the leader) - database remains the same.

The leader setting 'db-initialised' is a cached flag indicating that the leader initialised the database, and when the leader loses its relation with the database, that cached value (rightly, I think) should also go away.

In both cases, the decision on whether to re-initialise the database or not rests with the leader. If leadership changes without the shared-db relation changing then the cached value is still valid for the new leader.

---

So the issue really is the validity of the cached flag 'db-initialised' in leader settings. I don't think that an additional flag (a cache of a cache) is the right approach. What I propose is:

a) The current leader 'owns' the cached flag 'db-initialised'.
b) When the leader's shared-db relation is broken, it clears the db-initialised flag.
c) If a leader checks the 'db-initialised' flag and finds that it is not set then (this is new behaviour) checks the database to see whether it is actually initialised or not. (i.e. verifying whether a key...

Read more...
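
The comment is truncated above, but as a rough illustration of point (c), checking the database directly instead of trusting the cached flag, something like the following could work. The connection URL would come from the shared-db relation in practice, and the helper and table check are illustrative, not the charm's actual code:

```python
import sqlalchemy


def db_actually_initialised(connection_url):
    """Return True if the keystone schema appears to already exist.

    Illustrative only: the presence of keystone's 'user' table is used as a
    proxy for "keystone-manage db_sync has already been run".
    """
    engine = sqlalchemy.create_engine(connection_url)
    inspector = sqlalchemy.inspect(engine)
    return 'user' in inspector.get_table_names()
```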
