Not all keystone units setting relation data

Bug #1761562 reported by Ryan Beisner
This bug affects 1 person
Affects: OpenStack Keystone Charm
Status: Fix Released
Importance: High
Assigned to: David Ames

Bug Description

keystone_wait_for_propagation races with api_version relation data presence

Keystone full functional tests fairly regularly find that api_version relation data is not present at the time it is checked. This is not limited to a single OpenStack version; it is observed in multiple releases. I believe this is just a test race.

Further, I'd say that the method (keystone_wait_for_propagation) is either misnamed or lacks what its name advertises: wait logic. As written, a better name would be keystone_api_version_relation_data_check_now() or some such.

01:49:55.615 DEBUG:runner:Traceback (most recent call last):
01:49:55.615 DEBUG:runner: File "/tmp/bundletester-Wqi6PS/keystone/tests/gate-basic-xenial-mitaka", line 22, in <module>
01:49:55.615 DEBUG:runner: deployment = KeystoneBasicDeployment(series='xenial')
01:49:55.615 DEBUG:runner: File "/tmp/bundletester-Wqi6PS/keystone/tests/basic_deployment.py", line 66, in __init__
01:49:55.615 DEBUG:runner: self._initialize_test_differences()
01:49:55.615 DEBUG:runner: File "/tmp/bundletester-Wqi6PS/keystone/tests/basic_deployment.py", line 331, in _initialize_test_differences
01:49:55.615 DEBUG:runner: self.set_api_version(2)
01:49:55.615 DEBUG:runner: File "/tmp/bundletester-Wqi6PS/keystone/tests/basic_deployment.py", line 197, in set_api_version
01:49:55.615 DEBUG:runner: u.keystone_configure_api_version(se_rels, self, api_version)
01:49:55.615 DEBUG:runner: File "/tmp/bundletester-Wqi6PS/keystone/tests/charmhelpers/contrib/openstack/amulet/utils.py", line 464, in keystone_configure_api_version
01:49:55.615 DEBUG:runner: self.keystone_wait_for_propagation(sentry_relation_pairs, api_version)
01:49:55.615 DEBUG:runner: File "/tmp/bundletester-Wqi6PS/keystone/tests/charmhelpers/contrib/openstack/amulet/utils.py", line 444, in keystone_wait_for_propagation
01:49:55.615 DEBUG:runner: "".format(rel.get('api_version'), api_version))
01:49:55.615 DEBUG:runner:Exception: api_version not propagated through relation data yet ('None' != '2').
01:49:56.515 DEBUG:runner:Exit Code: 1
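
For illustration, the wait logic that the description says is missing could look roughly like the sketch below. This is not the charmhelpers implementation: `wait_for_api_version`, its `get_relation_data` callable, and the `timeout`/`interval` defaults are hypothetical names and values chosen for the example. Only the exception text mirrors the traceback above.

```python
import time


def wait_for_api_version(get_relation_data, expected, timeout=60.0,
                         interval=2.0, sleep=time.sleep,
                         clock=time.monotonic):
    """Poll relation data until api_version matches, or raise on timeout.

    get_relation_data is a hypothetical zero-argument callable that
    returns the current relation data as a dict (for example, a wrapper
    around whatever the test harness uses to read a unit's relation).
    """
    deadline = clock() + timeout
    while True:
        rel = get_relation_data()
        actual = rel.get('api_version')
        # Relation data values are strings, so compare as strings.
        if actual == str(expected):
            return rel
        if clock() >= deadline:
            raise Exception(
                "api_version not propagated through relation data yet "
                "('{}' != '{}').".format(actual, expected))
        sleep(interval)
```

Note that a retry loop like this only hides a genuine timing race. If a unit never sets the key, the call still fails once the timeout expires.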

Tags: uosci
David Ames (thedac) wrote :

This is a legitimate bug in keystone where sometimes all units have not updated their relation data with client services. In other words, the test was catching a real bug.

Note in the amulet failure that one unit has api_version set and the other does not:
https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline_amulet_full/openstack/charm-keystone/557680/9/1328/consoleText.test_charm_amulet_full_2678.txt

DEBUG:runner:2018-04-05 15:07:14,389 keystone_wait_for_propagation DEBUG: keystone relation data: {u'service_protocol': u'http', u'service_tenant': u'services', u'admin_token': u'ubuntutesting', u'ingress-address': u'172.17.106.30', u'service_password': u'NJtFrZ9X59p9gdp5VwJjcK3Gnq8gcV4xkWmp5HrC5P4CzVMnLBjcdT5LVy8KFHB5', u'service_port': u'5000', u'auth_port': u'35357', u'auth_protocol': u'http', u'private-address': u'172.17.106.30', u'egress-subnets': u'172.17.106.30/32', u'service_host': u'172.17.106.30', u'auth_host': u'172.17.106.30', u'service_username': u'cinder_cinderv2', u'service_tenant_id': u'7cbf0289a54849e4a815cee7ce1e8928', u'api_version': u'2'}
DEBUG:runner:2018-04-05 15:07:21,182 keystone_wait_for_propagation DEBUG: keystone relation data: {u'private-address': u'172.17.106.36', u'egress-subnets': u'172.17.106.36/32', u'ingress-address': u'172.17.106.36'}

Here is another example. Keystone 0 and 1 have set the data but keystone 2 has not:

$ get-relation-data.sh identity-service cinder/0 keystone/0
CMD: juju run --unit cinder/0 'relation-get -r identity-service:4 - keystone/0'
admin_token: ubuntutesting
api_version: "2"
auth_host: 10.5.0.162
auth_port: "35357"
auth_protocol: http
egress-subnets: 10.5.0.162/32
ingress-address: 10.5.0.162
private-address: 10.5.0.162
service_host: 10.5.0.162
service_password: Xh6j8VVW7TXptYyM8ZHgKPknrzjCpYRHZzKpWsJZm6b86gbN6VqG8YCGwVw33yCB
service_port: "5000"
service_protocol: http
service_tenant: services
service_tenant_id: f4fa132500b44def888498be104d2b37
service_username: cinderv2_cinderv3

$ get-relation-data.sh identity-service cinder/0 keystone/1
CMD: juju run --unit cinder/0 'relation-get -r identity-service:4 - keystone/1'
admin_token: ubuntutesting
api_version: "2"
auth_host: 10.5.0.162
auth_port: "35357"
auth_protocol: http
egress-subnets: 10.5.0.152/32
ingress-address: 10.5.0.152
private-address: 10.5.0.152
service_host: 10.5.0.162
service_password: Xh6j8VVW7TXptYyM8ZHgKPknrzjCpYRHZzKpWsJZm6b86gbN6VqG8YCGwVw33yCB
service_port: "5000"
service_protocol: http
service_tenant: services
service_tenant_id: f4fa132500b44def888498be104d2b37
service_username: cinderv2_cinderv3

$ get-relation-data.sh identity-service cinder/0 keystone/2
CMD: juju run --unit cinder/0 'relation-get -r identity-service:4 - keystone/2'
egress-subnets: 10.5.0.169/32
ingress-address: 10.5.0.169
private-address: 10.5.0.169
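
The manual comparison above can be automated. The sketch below is a minimal example, assuming the per-unit `relation-get` output has already been captured as text with one `key: value` pair per line and no `CMD:` line. `parse_relation_output` and `units_missing_key` are hypothetical helper names, and the parser is deliberately simplistic rather than a full YAML parser. The sample values are abbreviated from the output above.

```python
def parse_relation_output(text):
    """Parse simple 'key: value' lines into a dict (not full YAML)."""
    data = {}
    for line in text.splitlines():
        if ': ' not in line:
            continue
        key, value = line.split(': ', 1)
        # Drop the double quotes that wrap numeric-looking values.
        data[key.strip()] = value.strip().strip('"')
    return data


def units_missing_key(outputs_by_unit, key='api_version'):
    """Return the sorted unit names whose relation data lacks key."""
    return sorted(
        unit for unit, text in outputs_by_unit.items()
        if key not in parse_relation_output(text))


# Abbreviated relation data for the three keystone units shown above.
outputs = {
    'keystone/0': 'api_version: "2"\nprivate-address: 10.5.0.162\n',
    'keystone/1': 'api_version: "2"\nprivate-address: 10.5.0.152\n',
    'keystone/2': 'egress-subnets: 10.5.0.169/32\n'
                  'ingress-address: 10.5.0.169\n'
                  'private-address: 10.5.0.169\n',
}

print(units_missing_key(outputs))  # prints ['keystone/2']
```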

summary:
- keystone_wait_for_propagation races with api_version relation data presence
+ Not all keystone units setting relation data
David Ames (thedac) wrote :

I should note this is *after* everything has settled, so it is not a race per se. However, it is intermittent: I had to run multiple loops before running into this.

Changed in charm-keystone:
status: New → Confirmed
importance: Undecided → Critical
importance: Critical → High
milestone: none → 18.05
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-keystone (master)

Reviewed: https://review.openstack.org/559850
Committed: https://git.openstack.org/cgit/openstack/charm-keystone/commit/?id=a240c520a520ca92f338efa3717b2a09f4a7281d
Submitter: Zuul
Branch: master

commit a240c520a520ca92f338efa3717b2a09f4a7281d
Author: David Ames <email address hidden>
Date: Mon Apr 9 14:37:29 2018 -0700

    Run identity client relations when db is complete

    When keystone is deployed with multiple units but without hacluster,
    one-off scenarios occur where one non-leader unit will fail to update
    its client relations.

    This change runs all identity client relations when the database
    relation is complete, thus guaranteeing all keystone units update their
    identity relation data with clients.

    Small timing fix to amulet tests.

    Closes-Bug: #1761562
    Change-Id: I338e500dbc155b75c75b9261a9b5b471bd73088a
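
As a rough illustration of the idea in this commit message, the toy model below re-runs the identity client relation logic for every client relation once the database relation is complete. It is not the charm's code: `KeystoneUnit`, `identity_changed`, and `db_changed` are hypothetical stand-ins, and the failure ordering shown (a client relation event arriving before the database is ready) is an assumption made for the example, not something the bug report confirms.

```python
class KeystoneUnit:
    """Toy stand-in for a keystone unit (hypothetical, not charm code)."""

    def __init__(self, name):
        self.name = name
        self.db_complete = False
        # Maps a relation id to the data this unit has set on it.
        self.published = {}

    def identity_changed(self, relation_id, api_version='2'):
        """Publish client relation data, but only once the db is ready."""
        if not self.db_complete:
            # Assumed failure mode: the event is handled too early and
            # nothing triggers it again, so the data is never set.
            return
        self.published[relation_id] = {'api_version': api_version}

    def db_changed(self, client_relation_ids):
        """Mark the db complete and re-run every identity client relation."""
        self.db_complete = True
        for relation_id in client_relation_ids:
            self.identity_changed(relation_id)


unit = KeystoneUnit('keystone/2')
unit.identity_changed('identity-service:4')  # too early, nothing is set
unit.db_changed(['identity-service:4'])      # re-run sets the data
```

Under these assumptions, re-running the client relations from the database handler means a unit that handled a client event too early still ends up publishing its data.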

Changed in charm-keystone:
status: Confirmed → Fix Committed
David Ames (thedac)
Changed in charm-keystone:
assignee: nobody → David Ames (thedac)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-keystone (stable/18.02)

Fix proposed to branch: stable/18.02
Review: https://review.openstack.org/568299

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-keystone (stable/18.02)

Reviewed: https://review.openstack.org/568299
Committed: https://git.openstack.org/cgit/openstack/charm-keystone/commit/?id=c5b98f1a26deda0efd612bb532d5d02f8e259c1c
Submitter: Zuul
Branch: stable/18.02

commit c5b98f1a26deda0efd612bb532d5d02f8e259c1c
Author: David Ames <email address hidden>
Date: Mon Apr 9 14:37:29 2018 -0700

    Run identity client relations when db is complete

    When keystone is deployed with multiple units but without hacluster,
    one-off scenarios occur where one non-leader unit will fail to update
    its client relations.

    This change runs all identity client relations when the database
    relation is complete, thus guaranteeing all keystone units update their
    identity relation data with clients.

    Small timing fix to amulet tests.

    Closes-Bug: #1761562
    Change-Id: I338e500dbc155b75c75b9261a9b5b471bd73088a
    (cherry picked from commit a240c520a520ca92f338efa3717b2a09f4a7281d)

David Ames (thedac)
Changed in charm-keystone:
status: Fix Committed → Fix Released