Role assignment API doesn't prune system roles when querying role.id={role_id}

Bug #1748970 reported by Lance Bragstad on 2018-02-12
This bug affects 1 person
Affects: Importance, Assigned to
OpenStack Identity (keystone): High, Lance Bragstad
Queens: High, Lance Bragstad
tempest: Undecided, Unassigned

Bug Description

During the Queens release, keystone added support for a new scope type called system. This lets users and groups have roles not only on projects and domains, but also on a different entity called the "system". The goal is to make RBAC support more flexible and robust by isolating system administrative APIs from project or end-user APIs.

During keystone's bootstrapping process, it attempts to set up an administrator for the deployment. To be backwards compatible, the implementation for system scope included a patch to ensure the admin user not only had authorization on at least one project, but also on the system [0]. This guarantees that new and old installations have an administrative user for all APIs by running an idempotent operation. Otherwise it would be possible for an administrative user to lock themselves out of system-level APIs if they opt into enforcing scope without having at least one system administrator.

The patch to add this functionality is currently failing tempest [0], even though tempest doesn't know anything about system role assignments or requesting system-scoped tokens. Opening this bug so that we can investigate tempest and understand how adding a separate role assignment is resulting in 401 Unauthorized responses during tempest tests.

[0] https://review.openstack.org/#/c/530410/
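
To make the "idempotent operation" above concrete, here is a minimal sketch of a bootstrap step that guarantees the admin has a role on one project and on the system. It is not keystone's bootstrap code: the in-memory set, the function name `bootstrap_admin`, and the tuple layout are all assumptions made for illustration.

```python
# Hypothetical in-memory model of role assignments; not keystone's code.
# Each assignment is a tuple: (user_id, role_id, scope_type, scope_target).


def bootstrap_admin(assignments, user_id, role_id, project_id):
    """Ensure the admin has the role on one project and on the system.

    Returns the set of assignments that had to be created, so running the
    function a second time returns an empty set.
    """
    wanted = {
        (user_id, role_id, "project", project_id),
        (user_id, role_id, "system", "all"),
    }
    added = wanted - assignments
    assignments |= added  # mutate the caller's store in place
    return added


assignments = set()
first = bootstrap_admin(assignments, "admin", "admin-role", "admin-project")
second = bootstrap_admin(assignments, "admin", "admin-role", "admin-project")
```

Running the step twice leaves the store unchanged after the first call, which is the property that lets both new and upgraded deployments run it safely.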

Changed in keystone:
status: New → Triaged
importance: Undecided → High
tags: added: queens-backport-potential

I was able to recreate this locally with tempest and keystone. It appears that the token used to delete users or list domains is considered invalid during the token validation process, resulting in a 401 [0]. Keystone compares the token being sent with the request against all matched revocation events it knows about, and it does determine a match. The 401 always pops up during tearDown or when clearing credentials in tempest.

I ran a subset of the tests with and without the patch in question [1] and I did notice that more revocation events were generated *with* the patch than without, which doesn't really make sense.

Still digging into this, but wanted to document my findings.

[0] https://github.com/openstack/keystone/blob/602a2b30a3c9cb250d06b2e5b70f961cb5e2cecc/keystone/revoke/core.py#L139-L141
[1] https://review.openstack.org/#/c/530410/
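
The validation behaviour described in this comment can be sketched roughly as follows. This is a simplified model, not keystone's revoke implementation: the `RevocationEvent` and `Token` classes, their fields, and the matching rules are assumptions chosen to show how one stored event for a user invalidates that user's existing tokens.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class RevocationEvent:
    # Hypothetical, reduced set of fields; None means "matches anything".
    issued_before: datetime
    user_id: Optional[str] = None
    role_id: Optional[str] = None


@dataclass(frozen=True)
class Token:
    user_id: str
    role_ids: frozenset
    issued_at: datetime


def matches(event, token):
    """Return True if every attribute set on the event matches the token."""
    if token.issued_at > event.issued_before:
        return False  # token was issued after the revocation took effect
    if event.user_id is not None and event.user_id != token.user_id:
        return False
    if event.role_id is not None and event.role_id not in token.role_ids:
        return False
    return True


def is_revoked(token, events):
    """A token is rejected (401) if any known event matches it."""
    return any(matches(event, token) for event in events)


t0 = datetime(2018, 2, 12, 12, 0, 0)
admin_token = Token("admin", frozenset({"admin"}), issued_at=t0)
other_token = Token("alice", frozenset({"member"}), issued_at=t0)
fresh_admin_token = Token(
    "admin", frozenset({"admin"}), issued_at=t0 + timedelta(minutes=2)
)
# An event stored for the admin user one minute after the token was issued.
false_event = RevocationEvent(
    issued_before=t0 + timedelta(minutes=1), user_id="admin"
)
```

In this model the admin's existing token matches the event and is rejected, while another user's token and an admin token issued after the event are unaffected.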

Adam Young (ayoung) wrote :

Is the revocation event coming from the role assignment? It should not be, but we used to be really aggressive about revocations upon changes. That was more important with PKI tokens than today with Fernet.

Can you classify the new revocation events?

summary: - bootstrapping system administrator causes issues with tempest
+ Role assignment API doesn't prune system roles when querying
+ role.id={role_id}
Changed in tempest:
status: New → Invalid
Lance Bragstad (lbragstad) wrote :

This is actually a bug in the role assignment API. When asking keystone for a list of role assignments for a specific role (GET /v3/role_assignments?role.id={role_id}), it doesn't prune system role assignments that don't match the role in question. This caused a problem because a notification callback attempts to remove a role's assignments before the role is deleted [0]. When that callback asks the role assignment API for all role assignments for the role, the system role assignments are not filtered out [1]. The result included a system role assignment for the admin user, so a false revocation event was persisted for that user. As soon as tempest cleaned up the roles for a test class, a revocation event would be stored for the admin user, preventing them from cleaning up users and other resources.

[0] https://github.com/openstack/keystone/blob/602a2b30a3c9cb250d06b2e5b70f961cb5e2cecc/keystone/assignment/core.py#L1332
[1] https://github.com/openstack/keystone/blob/602a2b30a3c9cb250d06b2e5b70f961cb5e2cecc/keystone/assignment/core.py#L1042
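
The failure mode described in this comment can be illustrated with a small model. This is a sketch under assumptions, not the code in keystone/assignment/core.py: the dictionary layout, the function names, and the way the buggy version appends system assignments are hypothetical, chosen only to reproduce the reported behaviour (system role assignments returned regardless of the requested role ID).

```python
# Hypothetical assignment records; not keystone's data model.
ASSIGNMENTS = [
    {"user_id": "admin", "role_id": "admin", "scope": "system"},
    {"user_id": "admin", "role_id": "admin", "scope": "project:admin"},
    {"user_id": "alice", "role_id": "test-role", "scope": "project:demo"},
]


def list_assignments_buggy(assignments, role_id):
    """Role filter is applied to project/domain assignments only."""
    scoped = [
        a for a in assignments
        if a["scope"] != "system" and a["role_id"] == role_id
    ]
    # System assignments are appended without checking the role ID.
    system = [a for a in assignments if a["scope"] == "system"]
    return scoped + system


def list_assignments_fixed(assignments, role_id):
    """Role filter is applied to every assignment, system ones included."""
    return [a for a in assignments if a["role_id"] == role_id]


def users_revoked_on_role_delete(list_fn, assignments, role_id):
    """Users who would get a revocation event when the role is deleted."""
    return {a["user_id"] for a in list_fn(assignments, role_id)}


buggy = users_revoked_on_role_delete(
    list_assignments_buggy, ASSIGNMENTS, "test-role"
)
fixed = users_revoked_on_role_delete(
    list_assignments_fixed, ASSIGNMENTS, "test-role"
)
```

With the unfiltered listing, deleting `test-role` also targets the admin user, who never held that role; with the role filter applied to system assignments, only `alice` is affected.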

Fix proposed to branch: master
Review: https://review.openstack.org/544011

Changed in keystone:
assignee: nobody → Lance Bragstad (lbragstad)
status: Triaged → In Progress

Fix proposed to branch: master
Review: https://review.openstack.org/544012

Changed in keystone:
milestone: none → queens-rc2
no longer affects: keystone/trunk

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/544096

Reviewed: https://review.openstack.org/544011
Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=a226a3d8be5ba720f149606a84df0432ec4858c7
Submitter: Zuul
Branch: master

commit a226a3d8be5ba720f149606a84df0432ec4858c7
Author: Lance Bragstad <email address hidden>
Date: Tue Feb 13 16:52:57 2018 +0000

    Expose bug in /role_assignments API with system-scope

    The role_assignment API supports a number of query parameters that
    give users flexibility when querying for role assignments. This
    commit exposes an issue when querying keystone for a specific role
    using /role_assignments?role.id={role_id}. The expected result was
    that the returned list would only contain role assignments for that
    specific role ID. The actual result is a set of role assignments with
    that role ID and all system role assignments.

    This caused issues in tempest because tempest goes through and cleans
    up resources using `tearDownClass`, and it is common to remove
    specific roles used in the test class. The problem is that keystone
    queries the role assignment API for all role assignments with a
    specific role ID, which is the equivalent to
    `GET /v3/role_assignments?role.id={role_id}` when deleting a role. The
    list returned included false positives, which were system role
    assignments, resulting in revocation events getting persisted for
    users in those role assignments. This prevented the administrator in
    tempest from cleaning up the rest of the resources because the
    revocation event would invalidate the token being used to do resource
    cleanup.

    This commit exposes the bug using tests.

    Change-Id: If93400be3c9d3fe8e266bb36c16accca93d77154
    Partial-Bug: 1748970

Changed in keystone:
status: In Progress → Fix Released

Reviewed: https://review.openstack.org/544012
Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=8748e729b2f139c245316fcece181625978c70a1
Submitter: Zuul
Branch: master

commit 8748e729b2f139c245316fcece181625978c70a1
Author: Lance Bragstad <email address hidden>
Date: Tue Feb 13 17:09:55 2018 +0000

    Fix querying role_assignment with system roles

    This commit removes system role assignments when querying keystone
    for a list of assignments pertaining to a specific role. For example,
    `GET /v3/role_assignments?role.id={role_id}`, now returns assignments
    only for that role. Previously, the list contained false positives
    because some system role assignments weren't being removed. This
    was introduced in queens with the system scope work.

    Change-Id: Iab35ae01bb715da5813e62cd09900de555dceaaa
    Closes-Bug: 1748970

Reviewed: https://review.openstack.org/544095
Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=752d299d58f63810136966d9dc9a6e97252c9d32
Submitter: Zuul
Branch: stable/queens

commit 752d299d58f63810136966d9dc9a6e97252c9d32
Author: Lance Bragstad <email address hidden>
Date: Tue Feb 13 16:52:57 2018 +0000

    Expose bug in /role_assignments API with system-scope

    The role_assignment API supports a number of query parameters that
    give users flexibility when querying for role assignments. This
    commit exposes an issue when querying keystone for a specific role
    using /role_assignments?role.id={role_id}. The expected result was
    that the returned list would only contain role assignments for that
    specific role ID. The actual result is a set of role assignments with
    that role ID and all system role assignments.

    This caused issues in tempest because tempest goes through and cleans
    up resources using `tearDownClass`, and it is common to remove
    specific roles used in the test class. The problem is that keystone
    queries the role assignment API for all role assignments with a
    specific role ID, which is the equivalent to
    `GET /v3/role_assignments?role.id={role_id}` when deleting a role. The
    list returned included false positives, which were system role
    assignments, resulting in revocation events getting persisted for
    users in those role assignments. This prevented the administrator in
    tempest from cleaning up the rest of the resources because the
    revocation event would invalidate the token being used to do resource
    cleanup.

    This commit exposes the bug using tests.

    Change-Id: If93400be3c9d3fe8e266bb36c16accca93d77154
    Partial-Bug: 1748970
    (cherry picked from commit a226a3d8be5ba720f149606a84df0432ec4858c7)

tags: added: in-stable-queens

Reviewed: https://review.openstack.org/544096
Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=a1ea04de86ebc57e7c2ba142c346f54aa93745c1
Submitter: Zuul
Branch: stable/queens

commit a1ea04de86ebc57e7c2ba142c346f54aa93745c1
Author: Lance Bragstad <email address hidden>
Date: Tue Feb 13 17:09:55 2018 +0000

    Fix querying role_assignment with system roles

    This commit removes system role assignments when querying keystone
    for a list of assignments pertaining to a specific role. For example,
    `GET /v3/role_assignments?role.id={role_id}`, now returns assignments
    only for that role. Previously, the list contained false positives
    because some system role assignments weren't being removed. This
    was introduced in queens with the system scope work.

    Change-Id: Iab35ae01bb715da5813e62cd09900de555dceaaa
    Closes-Bug: 1748970
    (cherry picked from commit 8748e729b2f139c245316fcece181625978c70a1)

This issue was fixed in the openstack/keystone 13.0.0.0rc2 release candidate.

This issue was fixed in the openstack/keystone 14.0.0.0b1 development milestone.
