AggregateMultiTenancyIsolation uses wrong tenant_id during cold migrate

Bug #1774205 reported by Matt Riedemann on 2018-05-30
This bug affects 2 people
Affects                    Importance  Assigned to
OpenStack Compute (nova)   High        Matt Riedemann
Ocata                      High        Matt Riedemann
Pike                       High        Matt Riedemann
Queens                     High        Matt Riedemann

Bug Description

The details are in this mailing list thread:

http://lists.openstack.org/pipermail/openstack-operators/2018-May/015347.html

But essentially the case is:

* There are 3 compute hosts.
* compute1 and compute2 are in a host aggregate and a given tenant is restricted to that aggregate
* The user creates a server on compute1
* The admin attempts to cold migrate the server; the request fails in the AggregateMultiTenancyIsolation filter because the filter sees a tenant_id that is not part of the matching host aggregate.

The reason is that the cold migrate task in the conductor replaces the original request spec, which carried the instance's project_id, with one built from the current context, i.e. the admin, who may be in a different project:

https://github.com/openstack/nova/blob/stable/ocata/nova/conductor/tasks/migrate.py#L50
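
As a rough illustration only (toy names, not Nova's actual code, which lives in nova/conductor/tasks/migrate.py and in the AggregateMultiTenancyIsolation scheduler filter), this is how rebuilding the spec from the admin's context makes the filter check the wrong tenant:

    from types import SimpleNamespace

    # Toy stand-ins for the request context and the aggregate metadata.
    admin_ctx = SimpleNamespace(project_id="admin-project")
    aggregate_tenants = {"tenant-a"}   # compute1/compute2 are restricted to tenant-a

    def rebuild_spec_from_context(context):
        # Ocata's cold-migrate task rebuilt the spec from the caller's context,
        # dropping the project_id the instance was originally created under.
        return SimpleNamespace(project_id=context.project_id)

    def tenant_isolation_passes(spec, allowed_tenants):
        # The filter only checks the project recorded in the RequestSpec.
        return spec.project_id in allowed_tenants

    spec = rebuild_spec_from_context(admin_ctx)
    print(tenant_isolation_passes(spec, aggregate_tenants))   # False -> NoValidHost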

Matt Riedemann (mriedem) wrote:

Looks like the problem code was removed in Pike in order to fix bug 1675607: https://review.openstack.org/#/c/449640/

I'll see if we can backport that to Ocata.

Fix proposed to branch: master
Review: https://review.openstack.org/571245

Changed in nova:
status: Triaged → In Progress

Reviewed: https://review.openstack.org/571245
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8c216608194c89d281e8d2b66abd1e50e2405b01
Submitter: Zuul
Branch: master

commit 8c216608194c89d281e8d2b66abd1e50e2405b01
Author: Matt Riedemann <email address hidden>
Date: Wed May 30 12:07:53 2018 -0400

    Use instance project/user when creating RequestSpec during resize reschedule

    When rescheduling from a failed cold migrate / resize, the compute
    service does not pass the request spec back to conductor so we
    create one based on the in-scope variables.

    This introduces a problem for some scheduler filters like the
    AggregateMultiTenancyIsolation filter since it will create the
    RequestSpec using the project and user information from the current
    context, which for a cold migrate is the admin and might not be
    the owner of the instance (which could be in some other project).
    So the AggregateMultiTenancyIsolation filter might reject the
    request or select a host that fits an aggregate for the admin but
    not the end user.

    This fixes the problem by using the instance project/user information
    when constructing the RequestSpec which will take priority over
    the context in RequestSpec.from_components().

    Long-term we need the compute service to pass the request spec back
    to the conductor during a reschedule, but we do this first since we
    can backport it.

    Change-Id: Iaaf7f68d6874fd5d6e737e7d2bc589ea4a048fee
    Closes-Bug: #1774205

Changed in nova:
status: In Progress → Fix Released
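
For illustration, the shape of the fix can be sketched as follows. This is a toy stand-in, not the real RequestSpec.from_components() call in the conductor (which takes many more components); it only demonstrates the priority rule the commit message relies on, namely that an explicitly passed project/user beats the calling context:

    from types import SimpleNamespace

    admin_ctx = SimpleNamespace(project_id="admin-project", user_id="admin-user")
    instance = SimpleNamespace(project_id="tenant-a", user_id="user-a")

    def from_components(context, project_id=None, user_id=None):
        # Toy version of the documented behaviour: explicitly passed values
        # take priority over the calling context.
        return SimpleNamespace(project_id=project_id or context.project_id,
                               user_id=user_id or context.user_id)

    # Reschedule path before the fix: identity taken implicitly from the admin context.
    assert from_components(admin_ctx).project_id == "admin-project"

    # After the fix: the instance owner's project/user are passed explicitly
    # (user_id only on master; the backports pass project_id only, since
    # RequestSpec.user_id was added in Rocky).
    fixed = from_components(admin_ctx,
                            project_id=instance.project_id,
                            user_id=instance.user_id)
    assert fixed.project_id == "tenant-a"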

Reviewed: https://review.openstack.org/577918
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1162902280d06eb6201738ef54ff8300f974b374
Submitter: Zuul
Branch: stable/queens

commit 1162902280d06eb6201738ef54ff8300f974b374
Author: Matt Riedemann <email address hidden>
Date: Wed May 30 12:07:53 2018 -0400

    Use instance project/user when creating RequestSpec during resize reschedule

    When rescheduling from a failed cold migrate / resize, the compute
    service does not pass the request spec back to conductor so we
    create one based on the in-scope variables.

    This introduces a problem for some scheduler filters like the
    AggregateMultiTenancyIsolation filter since it will create the
    RequestSpec using the project and user information from the current
    context, which for a cold migrate is the admin and might not be
    the owner of the instance (which could be in some other project).
    So the AggregateMultiTenancyIsolation filter might reject the
    request or select a host that fits an aggregate for the admin but
    not the end user.

    This fixes the problem by using the instance project/user information
    when constructing the RequestSpec which will take priority over
    the context in RequestSpec.from_components().

    Long-term we need the compute service to pass the request spec back
    to the conductor during a reschedule, but we do this first since we
    can backport it.

    NOTE(mriedem): RequestSpec.user_id was added in Rocky in commit
    6e49019fae80586c4bbb8a7281600cf6140c176a so we have to remove its
    usage in this backport.

    Change-Id: Iaaf7f68d6874fd5d6e737e7d2bc589ea4a048fee
    Closes-Bug: #1774205
    (cherry picked from commit 8c216608194c89d281e8d2b66abd1e50e2405b01)

Reviewed: https://review.openstack.org/577926
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ce7ad878093e5730b8dcec5ca717b831db8965eb
Submitter: Zuul
Branch: stable/pike

commit ce7ad878093e5730b8dcec5ca717b831db8965eb
Author: Matt Riedemann <email address hidden>
Date: Wed May 30 12:07:53 2018 -0400

    Use instance project/user when creating RequestSpec during resize reschedule

    When rescheduling from a failed cold migrate / resize, the compute
    service does not pass the request spec back to conductor so we
    create one based on the in-scope variables.

    This introduces a problem for some scheduler filters like the
    AggregateMultiTenancyIsolation filter since it will create the
    RequestSpec using the project and user information from the current
    context, which for a cold migrate is the admin and might not be
    the owner of the instance (which could be in some other project).
    So the AggregateMultiTenancyIsolation filter might reject the
    request or select a host that fits an aggregate for the admin but
    not the end user.

    This fixes the problem by using the instance project/user information
    when constructing the RequestSpec which will take priority over
    the context in RequestSpec.from_components().

    Long-term we need the compute service to pass the request spec back
    to the conductor during a reschedule, but we do this first since we
    can backport it.

    NOTE(mriedem): RequestSpec.user_id was added in Rocky in commit
    6e49019fae80586c4bbb8a7281600cf6140c176a so we have to remove its
    usage in this backport.

    Conflicts:
          nova/tests/unit/conductor/test_conductor.py

    NOTE(mriedem): The conflict is due to not having change
    Ibc44e3b2261b314bb92062a88ca9ee6b81298dc3 in Pike.

    Change-Id: Iaaf7f68d6874fd5d6e737e7d2bc589ea4a048fee
    Closes-Bug: #1774205
    (cherry picked from commit 8c216608194c89d281e8d2b66abd1e50e2405b01)
    (cherry picked from commit 1162902280d06eb6201738ef54ff8300f974b374)

Reviewed: https://review.openstack.org/577929
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=09e678ee142ea297640e44ce9af630402871c38a
Submitter: Zuul
Branch: stable/ocata

commit 09e678ee142ea297640e44ce9af630402871c38a
Author: Matt Riedemann <email address hidden>
Date: Wed May 30 12:07:53 2018 -0400

    Use instance project/user when creating RequestSpec during resize reschedule

    When rescheduling from a failed cold migrate / resize, the compute
    service does not pass the request spec back to conductor so we
    create one based on the in-scope variables.

    This introduces a problem for some scheduler filters like the
    AggregateMultiTenancyIsolation filter since it will create the
    RequestSpec using the project and user information from the current
    context, which for a cold migrate is the admin and might not be
    the owner of the instance (which could be in some other project).
    So the AggregateMultiTenancyIsolation filter might reject the
    request or select a host that fits an aggregate for the admin but
    not the end user.

    This fixes the problem by using the instance project/user information
    when constructing the RequestSpec which will take priority over
    the context in RequestSpec.from_components().

    Long-term we need the compute service to pass the request spec back
    to the conductor during a reschedule, but we do this first since we
    can backport it.

    NOTE(mriedem): RequestSpec.user_id was added in Rocky in commit
    6e49019fae80586c4bbb8a7281600cf6140c176a so we have to remove its
    usage in this backport.

    NOTE(mriedem): Some existing tests in test_conductor.py had to
    be updated to set a project_id on the fake instance since change
    I34b1d99a9d0d2aca80f094a79ec1656abaf762dc is not in Ocata.

    Change-Id: Iaaf7f68d6874fd5d6e737e7d2bc589ea4a048fee
    Closes-Bug: #1774205
    (cherry picked from commit 8c216608194c89d281e8d2b66abd1e50e2405b01)
    (cherry picked from commit 1162902280d06eb6201738ef54ff8300f974b374)
    (cherry picked from commit ce7ad878093e5730b8dcec5ca717b831db8965eb)

This issue was fixed in the openstack/nova 18.0.0.0b3 development milestone.

This issue was fixed in the openstack/nova 16.1.5 release.

This issue was fixed in the openstack/nova 17.0.6 release.

This issue was fixed in the openstack/nova 15.1.4 release.

Shlomi (shlomi-tsadok) on 2019-04-30
information type: Public → Public Security
Jeremy Stanley (fungi) wrote:

Please don't mark a bug report as a suspected security vulnerability unless you also include some explanation as to why you think it represents one. I'm switching the status from public security back to a normal public bug report again for now.

information type: Public Security → Public

Reviewed: https://review.opendev.org/571265
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=a638685c469da18dcf8f4bc4763763d90e50be17
Submitter: Zuul
Branch: master

commit a638685c469da18dcf8f4bc4763763d90e50be17
Author: Matt Riedemann <email address hidden>
Date: Wed May 30 13:40:35 2018 -0400

    Add functional test for AggregateMultiTenancyIsolation + migrate

    A bug was reported against Ocata where a non-admin user
    creates a server and the user's project is isolated to a
    set of hosts via the AggregateMultiTenancyIsolation filter.

    The admin, with a different project, cold migrates the server
    and the filter rejects the request because before change
    I195d389ac59574724a5e7202ba1a17d92c53a676 the cold migrate
    task would re-generate the RequestSpec using the request context
    which was from the admin, not the owner of the instance.

    Even though this is not a problem past Ocata, we did not have
    functional test coverage for this scenario so it is added here.

    This will also be used to backport the fix to Ocata to show
    the regression and fix in that branch.

    Change-Id: I97559607fc720fb98c3543ff3dd6095281752cd4
    Related-Bug: #1774205
    Related-Bug: #1675607
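
As a simplified stand-alone sketch of what that functional test exercises (plain unittest against toy objects, not Nova's functional test framework):

    import unittest
    from types import SimpleNamespace

    def passes_tenant_isolation(spec, aggregate_tenants):
        # Simplified AggregateMultiTenancyIsolation check: hosts in a
        # tenant-restricted aggregate only pass for those tenants.
        return spec.project_id in aggregate_tenants

    class ColdMigrateIsolationTest(unittest.TestCase):
        """compute1/compute2 are restricted to tenant-a; the admin
        (admin-project) cold migrates tenant-a's server."""

        def test_spec_rebuilt_from_admin_context_is_rejected(self):
            spec = SimpleNamespace(project_id="admin-project")  # pre-fix behaviour
            self.assertFalse(passes_tenant_isolation(spec, {"tenant-a"}))

        def test_spec_carrying_instance_project_is_accepted(self):
            spec = SimpleNamespace(project_id="tenant-a")       # post-fix behaviour
            self.assertTrue(passes_tenant_isolation(spec, {"tenant-a"}))

    if __name__ == "__main__":
        unittest.main()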
