Service user's password unnecessarily updated

Bug #1648677 reported by Billy Olsen
Affects | Status | Importance | Assigned to | Milestone
Autopilot Log Analyser | Fix Committed | Undecided | Francis Ginther |
Landscape Server | Fix Released | High | Francis Ginther |
keystone (Juju Charms Collection) | Fix Released | High | Edward Hope-Morley |
nova-compute (Juju Charms Collection) | Won't Fix | High | Unassigned |

Bug Description

Whenever the identity-service-relation-changed hook fires, the remote service's/application's password is updated if the user already exists. The service user's password did not change in the relation data, but the keystone charm attempts to update the user password anyway in order to ensure that the password matches the current value.

This causes problems for the nova-compute services. Nova maintains a global neutron client which operates under the nova credentials within the [neutron] section of the nova.conf file. These credentials are only read in on startup of the nova-compute service.

Note: Whenever the password is updated, the token used in this service is revoked and each nova-compute service in the environment must be restarted.

Steps to recreate:

1. juju deploy cloud
2. launch an instance on a nova compute node
3. Wait at least one minute to ensure that the global token has been created
   Note: this 1 minute wait is based on the polling cycle of the heal_instance_info_cache_interval
4. Trigger an identity-service-relation-changed hook, e.g.:
   juju run --unit nova-cloud-controller/0 'relation-set -r identity-service:<id> foo=bar'
5. During the next polling cycles, observe authentication errors from nova-compute to neutron.

Example on Kilo:

2016-12-09 01:12:21.647 1478 TRACE nova.compute.manager [instance: fa73d9f3-431b-4d1a-a2c9-dd52d2851840] Traceback (most recent call last):
2016-12-09 01:12:21.647 1478 TRACE nova.compute.manager [instance: fa73d9f3-431b-4d1a-a2c9-dd52d2851840] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 5691, in _heal_instance_info_cache
...
2016-12-09 01:12:21.647 1478 TRACE nova.compute.manager [instance: fa73d9f3-431b-4d1a-a2c9-dd52d2851840] message=message)
2016-12-09 01:12:21.647 1478 TRACE nova.compute.manager [instance: fa73d9f3-431b-4d1a-a2c9-dd52d2851840] NeutronClientException: Authentication required
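
The symptom above can be picked out of a nova-compute log mechanically. The following is a minimal sketch, not part of any charm or of the Autopilot Log Analyser; the function name `find_auth_failures` and the regular expression are assumptions based only on the log lines quoted in this report.

```python
import re

# Matches the last line of the traceback quoted above and captures the
# instance UUID. The pattern is an assumption derived from that excerpt.
AUTH_FAILURE = re.compile(
    r"\[instance: (?P<uuid>[0-9a-f-]{36})\] "
    r"NeutronClientException: Authentication required"
)


def find_auth_failures(lines):
    """Return the set of instance UUIDs that hit the neutron auth error."""
    hits = set()
    for line in lines:
        match = AUTH_FAILURE.search(line)
        if match:
            hits.add(match.group("uuid"))
    return hits
```

Passing the lines of a nova-compute log file to `find_auth_failures` gives the instances whose info-cache refresh failed with this error.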

Changed in keystone (Juju Charms Collection):
importance: Undecided → High
tags: added: openstack
Edward Hope-Morley (hopem) wrote :

A bit more info: I have just tried this out and, as mentioned, the real password does not change but its record in the keystone database does. This in turn triggers a revocation_event that causes all active tokens for that service user to be revoked. I can only assume that when the password update call is made, keystone and/or mysql uses extra information such as a timestamp when encoding the password string, and this results in a new record in the database each time the call is made, even if the password itself has not changed.
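
One possible explanation for a changed database record with an unchanged password, offered here as an assumption rather than something confirmed in this thread, is that the stored value is a salted hash: a fresh random salt produces a different stored string on every call. The sketch below uses only the Python standard library; `hash_password` and `verify_password` are hypothetical helpers, not keystone functions.

```python
import hashlib
import hmac
import os


def hash_password(password, iterations=100_000):
    """Hash a password with a fresh random salt (illustrative only)."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    # Store the parameters next to the digest so the hash can be re-checked.
    return "{}${}${}".format(iterations, salt.hex(), digest.hex())


def verify_password(password, stored):
    """Check a password against a value produced by hash_password()."""
    iterations, salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(digest.hex(), digest_hex)
```

Under this assumption, two calls with the same password give different stored strings that both verify, so comparing stored values cannot tell whether the password really changed.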

Edward Hope-Morley (hopem) wrote :

I've now tried the update password action on both the nova-compute and glance services, and only nova-compute is affected: when Glance gets a 401 from keystone it requests a new token, whereas nova-compute simply explodes when it gets the 401.
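
The difference in behaviour described above can be illustrated with a small sketch of the re-authenticate-on-401 pattern. This is not Glance's or Nova's actual code; `Unauthorized`, `TokenClient`, `fetch_token` and `send` are hypothetical names used only for illustration.

```python
class Unauthorized(Exception):
    """Stands in for an HTTP 401 response from a service."""


class TokenClient:
    """Caches a token and refreshes it once if a request is rejected."""

    def __init__(self, fetch_token, send):
        self._fetch_token = fetch_token  # callable returning a fresh token
        self._send = send                # callable(token, request) -> response
        self._token = None

    def request(self, request):
        if self._token is None:
            self._token = self._fetch_token()
        try:
            return self._send(self._token, request)
        except Unauthorized:
            # The cached token may have been revoked: fetch a new one and
            # retry exactly once, so a genuine credential failure still
            # surfaces to the caller on the second attempt.
            self._token = self._fetch_token()
            return self._send(self._token, request)
```

A client that keeps one long-lived token and never takes the `except` branch behaves like the nova-compute case in this report: every request fails after the token is revoked, until the service is restarted.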

Edward Hope-Morley (hopem) wrote :

Adding the following to my nova.conf appears to stop Nova from failing when it gets a 401:

[keystone_authtoken]
...
check_revocations_for_cached = True

So, since Nova appears by design not to check for and handle revoked tokens automatically, I think we have the following options to solve the problem:

  1. set the above config in the nova-compute charm

  2. stop the keystone charm from calling update if password has not changed

  3. always restart the nova-compute service when a password update occurs (i.e. whenever the nova-cloud-controller identity-service relation hook fires)

Personally I think that option 1 is least intrusive and simplest to implement.

Changed in nova-compute (Juju Charms Collection):
status: New → Confirmed
importance: Undecided → High
milestone: none → 17.01
Edward Hope-Morley (hopem) wrote :

Sounds like option 1 is the preferred path, so I will go ahead and propose a patch.

Changed in keystone (Juju Charms Collection):
status: Confirmed → Invalid
assignee: Billy Olsen (billy-olsen) → nobody
Edward Hope-Morley (hopem) wrote :

Ok, on second thoughts we'll go with option 2. The service passwords are stored in the leader settings, so the keystone charm should easily be able to check whether a password update is necessary. That way we'll avoid service token revocations for any service related to keystone.
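
The idea behind option 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patch that was merged: `RecordingClient`, `sync_service_password`, the plain dict standing in for the leader settings, and the `<username>_passwd` key naming are all hypothetical.

```python
class RecordingClient:
    """Hypothetical stand-in for a keystone client; records update calls."""

    def __init__(self):
        self.updates = []

    def update_password(self, username, password):
        self.updates.append((username, password))


def sync_service_password(client, leader_settings, username, password):
    """Update the user's password only if it differs from the stored one.

    Returns True if an update call was made, False otherwise.
    """
    key = "{}_passwd".format(username)  # hypothetical key naming
    if leader_settings.get(key) == password:
        # Unchanged: skip the update so no revocation event is triggered.
        return False
    client.update_password(username, password)
    leader_settings[key] = password
    return True
```

With this guard, a hook that fires repeatedly with the same relation data results in at most one update call per actual password change.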

Changed in nova-compute (Juju Charms Collection):
status: Confirmed → Won't Fix
Changed in keystone (Juju Charms Collection):
status: Invalid → New
status: New → Confirmed
Felipe Reyes (freyes) wrote :

> check_revocations_for_cached = True
> ...
> 1. set the above config in the nova-compute charm

We need to be careful with this option, because it will make the nova-compute daemons ask keystone for the list of revoked tokens (GET /tokens/revoked)[0] every X seconds[1], so the extra pressure on keystone is something to consider.

On top of that, the real problem with this option is that it only works for the PKI token format, which is being deprecated[2].

[0] https://github.com/openstack/keystonemiddleware/blob/master/keystonemiddleware/auth_token/__init__.py#L744
[1] https://github.com/openstack/keystonemiddleware/blob/master/keystonemiddleware/auth_token/__init__.py#L601
[2] https://github.com/openstack/keystonemiddleware/commit/77909fdc169e4b6f9b177212514f10913bc389e6

Edward Hope-Morley (hopem) wrote :

Agreed Felipe, although I tried with UUID tokens and it worked as well. I am now actually implementing option 2.

Edward Hope-Morley (hopem) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/417036

Changed in keystone (Juju Charms Collection):
assignee: nobody → Edward Hope-Morley (hopem)
status: Confirmed → In Progress
Chad Smith (chad.smith)
Changed in landscape:
milestone: none → 16.12
Changed in landscape:
status: New → Triaged
importance: Undecided → Medium
Chad Smith (chad.smith) wrote :

As dosaboy mentioned in IRC, this fix will not initially target stable charms.

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-keystone (master)

Reviewed: https://review.openstack.org/417036
Committed: https://git.openstack.org/cgit/openstack/charm-keystone/commit/?id=f9670295de698b44ff9e9db4653af4ea6ded82b9
Submitter: Jenkins
Branch: master

commit f9670295de698b44ff9e9db4653af4ea6ded82b9
Author: Edward Hope-Morley <email address hidden>
Date: Thu Jan 5 16:28:47 2017 +0000

    Avoid keystone password update if unchanged

    Avoid calling update_password() if the password has not
    changed since it will actually change the db value
    regardless resulting in a revocation event and all current
    tokens being invalidated.

    Change-Id: Icb901b5e87d9cd716fa1a0d146e2252339e5678b
    Closes-Bug: 1648677

Changed in keystone (Juju Charms Collection):
status: In Progress → Fix Committed
Chad Smith (chad.smith) wrote :

Definitely seeing this on newton deploys with stable charms. Lots of 503s and Unauthorized messages from various services talking to keystone, causing failures on a number of nova API calls. These failures prevent glance simplestreams from uploading images and block the cloud configuration calls that configure the deployed region.

Below you can see prints of all of the superfluous password updates from almost every service.
https://pastebin.canonical.com/175833/

Chad Smith (chad.smith) wrote :

cs:xenial/keystone-260 for reference

Changed in landscape:
milestone: 16.12 → 17.01
Andreas Hasenack (ahasenack) wrote :

Just a note that landscape has been using cs:xenial/keystone-260 since landscape trunk r10798. Let's see if we have seen this bug since then in our ci test runs.

Andreas Hasenack (ahasenack) wrote :

Oh, sorry, I see now that the fix is NOT in keystone-r260.

Changed in autopilot-log-analyser:
assignee: nobody → Francis Ginther (fginther)
Changed in autopilot-log-analyser:
status: New → Fix Committed
Changed in landscape:
importance: Medium → High
assignee: nobody → Francis Ginther (fginther)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-keystone (stable/16.10)

Fix proposed to branch: stable/16.10
Review: https://review.openstack.org/423662

Chris Gregan (cgregan)
tags: added: cdo-qa-blocker
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-keystone (stable/16.10)

Reviewed: https://review.openstack.org/423662
Committed: https://git.openstack.org/cgit/openstack/charm-keystone/commit/?id=7bd0832bc3730a42d103143f1331f54257b3a3de
Submitter: Jenkins
Branch: stable/16.10

commit 7bd0832bc3730a42d103143f1331f54257b3a3de
Author: Edward Hope-Morley <email address hidden>
Date: Thu Jan 5 16:28:47 2017 +0000

    Avoid keystone password update if unchanged

    Avoid calling update_password() if the password has not
    changed since it will actually change the db value
    regardless resulting in a revocation event and all current
    tokens being invalidated.

    Change-Id: Icb901b5e87d9cd716fa1a0d146e2252339e5678b
    Closes-Bug: 1648677
    (cherry picked from commit f9670295de698b44ff9e9db4653af4ea6ded82b9)

Billy Olsen (billy-olsen) wrote :

Marking as Fix Released; it's now available in charm revision 261.

Changed in keystone (Juju Charms Collection):
status: Fix Committed → Fix Released
Chad Smith (chad.smith)
Changed in landscape:
milestone: 17.01 → 17.02
Changed in landscape:
status: Triaged → Fix Committed
Changed in landscape:
status: Fix Committed → Fix Released