Unable to delete domain with users in it

Bug #1718747 reported by Guang Yee
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
OpenStack Identity (keystone)
Fix Released
High
Colleen Murphy
Newton
Won't Fix
High
Unassigned
Ocata
Fix Committed
High
Colleen Murphy
Pike
Fix Released
High
Colleen Murphy

Bug Description

Attempting to delete a domain which contains users and projects may yield an UnexpectedError similiar to this

Sep 21 19:37:17 vagrant-openSUSE-Leap <email address hidden>[23894]: DEBUG keystone.common.sql.core [None req-707ec264-b10c-4079-94bb-2af01db58aab None None] Conflict project: (pymysql.err.IntegrityError) (1451, u'Cannot delete or update a parent row: a foreign key constraint fails (`keystone`.`user`, CONSTRAINT `user_ibfk_1` FOREIGN KEY (`domain_id`) REFERENCES `project` (`id`))') [SQL: u'DELETE FROM project WHERE project.id = %(id)s'] [parameters: {'id': u'63d2d5446e364f00b3181bf49c62c5b8'}] {{(pid=23897) wrapper /opt/stack/keystone/keystone/common/sql/core.py:550}}
Sep 21 19:37:17 vagrant-openSUSE-Leap <email address hidden>[23894]: WARNING keystone.common.wsgi [None req-707ec264-b10c-4079-94bb-2af01db58aab None None] An unexpected error prevented the server from fulfilling your request.: UnexpectedError: An unexpected error prevented the server from fulfilling your request.

Steps to reproduce:

1. Install devstack
2. create a domain 'foo'

  openstack domain create foo

3. create a user in domain 'foo'

  openstack user create --password equifax --domain foo foo_user

4. create a project in domain 'foo'

  openstack project create --domain foo foo_project

5. enable domain user 'foo_user' access to project 'foo_project'

  openstack role add --user foo_user --project foo_project admin

6. now disable domain 'foo'

  openstack domain set --disable foo

7. attempt to delete domain 'foo' will yield an expected error mentioned above

  openstack domain delete foo

This was introduced in: https://github.com/openstack/keystone/commit/2bd88d30e1d2873470af7f40db45a99e07e12ce6

Changed in keystone:
status: New → Confirmed
Changed in keystone:
importance: Undecided → High
Changed in keystone:
assignee: nobody → Samuel de Medeiros Queiroz (samueldmq)
summary: - Unable to delete domain with projects in it
+ Unable to delete domain with users in it
Revision history for this message
Adam Young (ayoung) wrote :

While the error message is ugly (should not 500 the server) the behavior is intentional: we put a referential constraint on there to keep from accidentally deleting a load of data if the user accidentally deletes the domain: you have to disable the domain first, and then delete it for the same reason.

Revision history for this message
Samuel de Medeiros Queiroz (samueldmq) wrote :

Although not explicitly in the SQL model definition, the FK is user.domain_id -> project.id is added in a migration: https://github.com/openstack/keystone/blob/2bd88d3/keystone/common/sql/expand_repo/versions/014_expand_add_domain_id_to_user_table.py#L140-L141

Revision history for this message
Samuel de Medeiros Queiroz (samueldmq) wrote :

@Adam, but the error occurs when trying to delete a domain that is disabled. Notice the steps disable the domain before deleting it.

Changed in keystone:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to keystone (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/511061

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to keystone (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/511062

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to keystone (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/511063

description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on keystone (stable/newton)

Change abandoned by Tony Breeds (<email address hidden>) on branch: stable/newton
Review: https://review.openstack.org/511063
Reason: This branch (stable/newton) is at End Of Life

Revision history for this message
CHARLES WANG (charleswang007) wrote :

@samuel Are you still working on the proposed fix https://review.openstack.org/#/c/511061 ? It failed Zuul and status since October 2017 has not been changed.

Changed in keystone:
assignee: Samuel de Medeiros Queiroz (samueldmq) → CHARLES WANG (charleswang007)
Colleen Murphy (krinkle)
Changed in keystone:
milestone: none → queens-rc1
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to keystone (master)

Fix proposed to branch: master
Review: https://review.openstack.org/539347

Changed in keystone:
assignee: CHARLES WANG (charleswang007) → Colleen Murphy (krinkle)
Revision history for this message
Adam Young (ayoung) wrote :

"Unable to delete domain with users in it Edit" is the desired behavior. Please restate the bug. I think the problem is that if you try you get the database into an unstable state?

Revision history for this message
Adam Young (ayoung) wrote :

Deleting a domain should not be a casual thing. Once you do, all of the resources in the remote services are unmanagable. Removing the local user records, while annoying, ensures you do not accidentally destroy your entire user database by accidentally deleting a domain.

Revision history for this message
Guang Yee (guang-yee) wrote :

APIs have no feelings. Nor can it read the user's mind. All it care is performing the function it is designed to do. We've already required a domain to be disabled before being deleted. I know the Hawaii missile alert freaks people out. But come-on. :-) How many safe-guards do we need?

Revision history for this message
Adam Young (ayoung) wrote :

Heh. Before accidentally orphaning all of the users ain a domain? At least one more, I would say.

Revision history for this message
Colleen Murphy (krinkle) wrote :

Adam - deleting all the users and groups was the original behavior: http://git.openstack.org/cgit/openstack/keystone/tree/keystone/identity/core.py#n491

The fact that it doesn't do that any more and emits a 500 error is a regression caused by the addition of a foreign key in the user table.

We require the domain to be disabled before deleting to prevent footgunning.

We could "fix" the behavior to issue a 4XX instead of exploding but this would not be in line with the current intended behavior.

Revision history for this message
Adam Young (ayoung) wrote :

The original behavior was bad, too. Would have been better to fix it deliberately than as a side effect.

The user side actually worries me less than the project side: deleting a project without deleting the resources in the remote services leave them unmanageable and require a system level admin to clean them up with some global scope of responsibility.

On the user side, disabling a domain prior to delete is necessary, as that keeps people from adding new users while someone else is trying frantically to delete them all before domain deletion.

Do we really want to do this?

Revision history for this message
Colleen Murphy (krinkle) wrote :

> The user side actually worries me less than the project side: deleting a project without deleting the resources in the remote services leave them unmanageable and require a system level admin to clean them up with some global scope of responsibility.

That's not really our problem. We have to support deleting projects, and yes if you delete a project that still has a nova server in it then you're going to need to clean that up. That's already the case today and not the focus of this bug.

> Do we really want to do this?

Letting keystone emit a 500 is not an option, we either need to fix the old behavior or change the behavior to emit a validation error. We might be in a gray area now that we've already broken it but in general switching the response from a 2XX to a 4XX is not okay.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to keystone (master)

Reviewed: https://review.openstack.org/539347
Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=62ee18b359cbb2e6a9469bdaac9057ef19de1bdf
Submitter: Zuul
Branch: master

commit 62ee18b359cbb2e6a9469bdaac9057ef19de1bdf
Author: Colleen Murphy <email address hidden>
Date: Tue Jan 30 23:23:15 2018 +0100

    Delete SQL users before deleting domain

    Since the users table has a foreign key to the projects table[1], users
    must be deleted before the domain can be deleted. However, the
    notification emitted from the domain deletion comes too late, and
    keystone runs into a foreign key reference error before it can delete
    the users. This patch addresses the problem by adding a new internal
    notification to alert the identity manager that users should be deleted.
    This uses a new notification rather than the existing notification
    because the existing one is used to alert listeners that the domain
    deletion has been fully completed, whereas this one must happen in the
    middle of the domain delete process.

    The callback must also only try to delete SQL users. The LDAP driver
    doesn't support deleting users, and we can't assume other drivers
    support it either. Moreover, the foreign key reference is only a problem
    for SQL users anyway.

    Because our backend unit tests run with SQLite and foreign keys do not
    work properly, we can't properly expose this bug in our unit tests, but
    there is an accompanying tempest test[2][3] to validate this fix.

    [1] https://github.com/openstack/keystone/blob/2bd88d3/keystone/common/sql/expand_repo/versions/014_expand_add_domain_id_to_user_table.py#L140-L141
    [2] https://review.openstack.org/#/c/509610
    [3] https://review.openstack.org/#/c/509947

    Change-Id: If5bdb6f5eef80b50b000aed5188ce7da4dfd1083
    Closes-bug: #1718747

Changed in keystone:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to keystone (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/542483

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/keystone 13.0.0.0rc1

This issue was fixed in the openstack/keystone 13.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to keystone (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/543379

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to keystone (stable/pike)

Reviewed: https://review.openstack.org/542483
Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=fb81469404f4c6e8a5f25896d155e02491c3b610
Submitter: Zuul
Branch: stable/pike

commit fb81469404f4c6e8a5f25896d155e02491c3b610
Author: Colleen Murphy <email address hidden>
Date: Tue Jan 30 23:23:15 2018 +0100

    Delete SQL users before deleting domain

    Since the users table has a foreign key to the projects table[1], users
    must be deleted before the domain can be deleted. However, the
    notification emitted from the domain deletion comes too late, and
    keystone runs into a foreign key reference error before it can delete
    the users. This patch addresses the problem by adding a new internal
    notification to alert the identity manager that users should be deleted.
    This uses a new notification rather than the existing notification
    because the existing one is used to alert listeners that the domain
    deletion has been fully completed, whereas this one must happen in the
    middle of the domain delete process.

    The callback must also only try to delete SQL users. The LDAP driver
    doesn't support deleting users, and we can't assume other drivers
    support it either. Moreover, the foreign key reference is only a problem
    for SQL users anyway.

    Because our backend unit tests run with SQLite and foreign keys do not
    work properly, we can't properly expose this bug in our unit tests, but
    there is an accompanying tempest test[2][3] to validate this fix.

    [1] https://github.com/openstack/keystone/blob/2bd88d3/keystone/common/sql/expand_repo/versions/014_expand_add_domain_id_to_user_table.py#L140-L141
    [2] https://review.openstack.org/#/c/509610
    [3] https://review.openstack.org/#/c/509947

    Change-Id: If5bdb6f5eef80b50b000aed5188ce7da4dfd1083
    Closes-bug: #1718747
    (cherry picked from commit 62ee18b359cbb2e6a9469bdaac9057ef19de1bdf)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on keystone (stable/pike)

Change abandoned by Morgan Fainberg (<email address hidden>) on branch: stable/pike
Review: https://review.openstack.org/511061
Reason: Abandoning based upon IRC disucssions. this is not needed.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to keystone (stable/ocata)

Reviewed: https://review.openstack.org/543379
Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=b6a009254f7975bf806c499874ad87746bb8cbd8
Submitter: Zuul
Branch: stable/ocata

commit b6a009254f7975bf806c499874ad87746bb8cbd8
Author: Colleen Murphy <email address hidden>
Date: Tue Jan 30 23:23:15 2018 +0100

    Delete SQL users before deleting domain

    Since the users table has a foreign key to the projects table[1], users
    must be deleted before the domain can be deleted. However, the
    notification emitted from the domain deletion comes too late, and
    keystone runs into a foreign key reference error before it can delete
    the users. This patch addresses the problem by adding a new internal
    notification to alert the identity manager that users should be deleted.
    This uses a new notification rather than the existing notification
    because the existing one is used to alert listeners that the domain
    deletion has been fully completed, whereas this one must happen in the
    middle of the domain delete process.

    The callback must also only try to delete SQL users. The LDAP driver
    doesn't support deleting users, and we can't assume other drivers
    support it either. Moreover, the foreign key reference is only a problem
    for SQL users anyway.

    Because our backend unit tests run with SQLite and foreign keys do not
    work properly, we can't properly expose this bug in our unit tests, but
    there is an accompanying tempest test[2][3] to validate this fix.

    [1] https://github.com/openstack/keystone/blob/2bd88d3/keystone/common/sql/expand_repo/versions/014_expand_add_domain_id_to_user_table.py#L140-L141
    [2] https://review.openstack.org/#/c/509610
    [3] https://review.openstack.org/#/c/509947

    Change-Id: If5bdb6f5eef80b50b000aed5188ce7da4dfd1083
    Closes-bug: #1718747
    (cherry picked from commit 62ee18b359cbb2e6a9469bdaac9057ef19de1bdf)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on keystone (master)

Change abandoned by Lance Bragstad (<email address hidden>) on branch: master
Review: https://review.openstack.org/506340

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on keystone (stable/ocata)

Change abandoned by Lance Bragstad (<email address hidden>) on branch: stable/ocata
Review: https://review.openstack.org/511062

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/keystone 11.0.4

This issue was fixed in the openstack/keystone 11.0.4 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/keystone 12.0.1

This issue was fixed in the openstack/keystone 12.0.1 release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.