Race condition in keystone domain config

Bug #1549726 reported by Divya K Konoor
This bug affects 1 person
Affects: OpenStack Identity (keystone)
Status: Fix Released
Importance: Critical
Assigned to: Divya K Konoor

Bug Description

This bug is very difficult to reproduce, but it does occur, and it can be observed when switching backend drivers.

Steps to reproduce:
1. Switch the backend driver in the keystone conf file from one driver to another, then restart keystone.
2. Immediately (if you wait even a few seconds, the bug cannot be reproduced), make calls that in turn reach the keystone methods get_group and list_users_in_group in keystone/identity/core.py. It doesn't have to be exactly these two. Any two similar methods in identity/core.py that use the @domains_configured decorator will do.

https://github.com/openstack/keystone/blob/master/keystone/identity/core.py#L100

Invoke two methods that use this decorator; the two invocations must run almost in parallel. Both methods hit the following flow, where the race condition occurs:

def domains_configured(f):
    """Wraps API calls to lazy load domain configs after init."""
    @functools.wraps(f)
    def wrapper(self, *args, **kwargs):
        if (not self.domain_configs.configured and
                CONF.identity.domain_specific_drivers_enabled):
            self.domain_configs.setup_domain_drivers(
                self.driver, self.resource_api)
        else:
            LOG.error('domains will not be configured')
        return f(self, *args, **kwargs)
    return wrapper

def setup_domain_drivers(self, standard_driver, resource_api):
    # This is called by the api call wrapper
    self.configured = True
    self.driver = standard_driver

.....

When the first call is placed, it sets self.configured to True and then proceeds to load the driver that corresponds to the domain. However, the second request assumes that the driver load is already complete, purely because self.configured is True, even though the driver is not actually loaded yet. It therefore ends up using the default driver (i.e. the driver that is not domain specific) and retrieves its values from there.
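The interleaving described above can be forced deterministically in a small standalone sketch. Nothing here is keystone code: the DomainConfigs class, the driver name strings, and the two threading.Event hooks are hypothetical stand-ins used only to hold the first request in the middle of its driver load while a second request arrives.

```python
import threading


class DomainConfigs:
    """Hypothetical stand-in for the object that holds domain configuration."""

    def __init__(self):
        self.configured = False
        self.driver = "default-driver"
        # Test hooks (assumptions, not part of keystone) to control timing.
        self.flag_set = threading.Event()
        self.may_finish = threading.Event()

    def setup_domain_drivers(self):
        # The flag is flipped before the load finishes, as in the report.
        self.configured = True
        self.flag_set.set()
        # Stand-in for the slow domain-specific driver load.
        self.may_finish.wait(timeout=5)
        self.driver = "domain-specific-driver"


def call_api(configs, results, name):
    # Same check-then-act shape as the decorator in the report.
    if not configs.configured:
        configs.setup_domain_drivers()
    results[name] = configs.driver


def demonstrate_race():
    configs = DomainConfigs()
    results = {}
    first = threading.Thread(target=call_api, args=(configs, results, "first"))
    first.start()
    # Wait until the first request has set the flag but not loaded the driver.
    configs.flag_set.wait(timeout=5)
    # The second request sees configured == True and skips the setup.
    call_api(configs, results, "second")
    configs.may_finish.set()
    first.join()
    return results
```

In this sketch the second caller records the default driver while the first caller, once released, records the domain-specific one, which mirrors the symptom in the report.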

Some synchronization logic should be added inside domains_configured (or one of the internal methods it calls) so that a request never uses the wrong backend driver.
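One way such synchronization could look is a lock with a second check inside it, with the configured flag set only after the load has finished. This is a minimal sketch under stated assumptions, not necessarily what the merged change does: the DomainConfigs and Manager classes, the lock attribute, the load_count counter, and the driver name strings are all hypothetical, and the CONF.identity.domain_specific_drivers_enabled check is left out for brevity.

```python
import functools
import threading
import time


class DomainConfigs:
    """Hypothetical stand-in for the object that holds domain configuration."""

    def __init__(self):
        self.configured = False
        self.driver = "default-driver"
        self.lock = threading.Lock()
        self.load_count = 0  # illustration only: counts driver loads

    def setup_domain_drivers(self):
        time.sleep(0.05)  # stand-in for the slow driver load
        self.driver = "domain-specific-driver"
        self.load_count += 1
        # Flip the flag last, so no caller sees it before the load is done.
        self.configured = True


def domains_configured(f):
    """Lazy-load domain configs, serializing the first load across threads."""
    @functools.wraps(f)
    def wrapper(self, *args, **kwargs):
        if not self.domain_configs.configured:
            with self.domain_configs.lock:
                # Re-check under the lock: another thread may have finished
                # the load while this one was waiting.
                if not self.domain_configs.configured:
                    self.domain_configs.setup_domain_drivers()
        return f(self, *args, **kwargs)
    return wrapper


class Manager:
    """Hypothetical manager with one decorated API method."""

    def __init__(self):
        self.domain_configs = DomainConfigs()

    @domains_configured
    def get_group(self):
        return self.domain_configs.driver


def run_parallel(n=8):
    manager = Manager()
    results = [None] * n

    def worker(i):
        results[i] = manager.get_group()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, manager.domain_configs.load_count
```

Calling run_parallel() starts several near-simultaneous requests; with the lock in place, every caller that arrives during the load waits for it, and the load itself runs once.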

Tags: rc-potential
Changed in keystone:
assignee: nobody → Divya K Konoor (dikonoor)
Guang Yee (guang-yee) wrote :

I presume you are running Keystone under Apache, configured with multi-thread and multi-process? This problem may not be reproducible under devstack with single-thread, multi-process configuration.

WSGIDaemonProcess keystone-admin processes=5 threads=1 user=gyee display-name=%{GROUP}

Matthew Edmonds (edmondsw) wrote :

Yes it is running under Apache httpd. Neither the number of processes nor threads were specified, and mod_wsgi appears to default to processes=1 threads=15... so single-process, multi-thread. Is there a reason devstack doesn't use the defaults?

Guang Yee (guang-yee) wrote :

Not sure about upstream devstack. But for us, we didn't find any noticeable performance advantage when running it with multiple threads. My understanding is that Keystone is more CPU-bound than I/O-bound, and with Python's GIL, I am not sure multi-threading provides us any real advantage.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to keystone (master)

Fix proposed to branch: master
Review: https://review.openstack.org/287020

Changed in keystone:
status: New → In Progress
Henry Nash (henry-nash)
tags: added: rc-potential
Dolph Mathews (dolph)
Changed in keystone:
importance: Undecided → Critical
Changed in keystone:
assignee: Divya K Konoor (dikonoor) → Dolph Mathews (dolph)
Changed in keystone:
assignee: Dolph Mathews (dolph) → Divya K Konoor (dikonoor)
Changed in keystone:
milestone: none → mitaka-rc1
OpenStack Infra (hudson-openstack) wrote : Fix merged to keystone (master)

Reviewed: https://review.openstack.org/287020
Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=0b18edab226f6e9dc531febd4eb6f65ccd3c031e
Submitter: Jenkins
Branch: master

commit 0b18edab226f6e9dc531febd4eb6f65ccd3c031e
Author: Divya <email address hidden>
Date: Wed Mar 2 08:05:42 2016 +0100

    Race condition in keystone domain config

    This bug fixes a race condition in the domains_config
    decorator. The race condition occurs when more than
    one thread accesses the decorator. The first thread
    sets the configured flag to True before proceeding with
    driver load leading the second thread to use the default
    driver. This fix ensures that the second thread waits for
    the first thread to finish configuration before it uses
    the driver.

    Change-Id: I0289a4d38e0d30d39c67e29bf77b0a89d1dd23f6
    Closes-Bug: 1549726

Changed in keystone:
status: In Progress → Fix Released
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/keystone 9.0.0.0rc1

This issue was fixed in the openstack/keystone 9.0.0.0rc1 release candidate.
