ResourceTracker races to delete instance allocations before instance is mapped to a cell

Bug #1729371 reported by Matt Riedemann
Affects                    Status         Importance  Assigned to
OpenStack Compute (nova)   Fix Released   High        Dan Smith
nova (pike)                Fix Committed  High        Matt Riedemann

Bug Description

We hit this in queens CI where we go to swap instance allocations to the migration uuid during a resize and the instance allocations on the source node are not found, which shouldn't happen:

http://logs.openstack.org/08/516708/3/gate/legacy-tempest-dsvm-py35/4d8d6a3/logs/screen-n-super-cond.txt.gz#_Oct_31_23_18_04_391235

Oct 31 23:18:04.391235 ubuntu-xenial-rax-iad-0000635032 nova-conductor[22376]: ERROR nova.conductor.tasks.migrate [None req-2bd7178b-307e-4342-a156-f9645b7f75a5 tempest-MigrationsAdminTest-834796780 tempest-MigrationsAdminTest-834796780] [instance: 87329b8a-0fa5-4467-b69e-6c43d4633f54] Unable to find existing allocations for instance
Oct 31 23:18:04.421480 ubuntu-xenial-rax-iad-0000635032 nova-conductor[22376]: WARNING nova.scheduler.utils [None req-2bd7178b-307e-4342-a156-f9645b7f75a5 tempest-MigrationsAdminTest-834796780 tempest-MigrationsAdminTest-834796780] Failed to compute_task_migrate_server: Instance 87329b8a-0fa5-4467-b69e-6c43d4633f54 is unacceptable: Instance has no source node allocation: nova.exception.InstanceUnacceptable: Instance 87329b8a-0fa5-4467-b69e-6c43d4633f54 is unacceptable: Instance has no source node allocation

Looking in Placement for that instance, its allocations are created by the scheduler here:

http://logs.openstack.org/08/516708/3/gate/legacy-tempest-dsvm-py35/4d8d6a3/logs/screen-placement-api.txt.gz#_Oct_31_23_18_00_805083

Oct 31 23:18:00.637846 ubuntu-xenial-rax-iad-0000635032 <email address hidden>[15777]: DEBUG nova.api.openstack.placement.requestlog [None req-d285a5a5-704b-4f46-98d5-f6e5ae7f6b4f service placement] Starting request: 104.239.175.193 "PUT /placement/allocations/87329b8a-0fa5-4467-b69e-6c43d4633f54" {{(pid=15780) __call__ /opt/stack/new/nova/nova/api/openstack/placement/requestlog.py:38}}
Oct 31 23:18:00.684152 ubuntu-xenial-rax-iad-0000635032 <email address hidden>[15777]: DEBUG nova.api.openstack.placement.handlers.allocation [None req-d285a5a5-704b-4f46-98d5-f6e5ae7f6b4f service placement] Successfully wrote allocations AllocationList[Allocation(consumer_id=87329b8a-0fa5-4467-b69e-6c43d4633f54,id=23,project_id='d349ceecbff744a2943294be5bb7e427',resource_class='MEMORY_MB',resource_provider=ResourceProvider(ec18595d-7007-47c9-bf13-38c8cf8a8bb0),used=64,user_id='d44d813424704df8996b7d77840283c9'), Allocation(consumer_id=87329b8a-0fa5-4467-b69e-6c43d4633f54,id=24,project_id='d349ceecbff744a2943294be5bb7e427',resource_class='VCPU',resource_provider=ResourceProvider(ec18595d-7007-47c9-bf13-38c8cf8a8bb0),used=1,user_id='d44d813424704df8996b7d77840283c9')] {{(pid=15780) _set_allocations /opt/stack/new/nova/nova/api/openstack/placement/handlers/allocation.py:249}}

And shortly after that we see them deleted:

http://logs.openstack.org/08/516708/3/gate/legacy-tempest-dsvm-py35/4d8d6a3/logs/screen-placement-api.txt.gz#_Oct_31_23_18_00_805083

Oct 31 23:18:00.805083 ubuntu-xenial-rax-iad-0000635032 <email address hidden>[15777]: DEBUG nova.api.openstack.placement.requestlog [None req-1fc99e13-67ca-4180-9a7f-d3a01a219b15 service placement] Starting request: 104.239.175.193 "DELETE /placement/allocations/87329b8a-0fa5-4467-b69e-6c43d4633f54" {{(pid=15780) __call__ /opt/stack/new/nova/nova/api/openstack/placement/requestlog.py:38}}
Oct 31 23:18:00.814342 ubuntu-xenial-rax-iad-0000635032 <email address hidden>[15777]: DEBUG nova.api.openstack.placement.handlers.allocation [None req-1fc99e13-67ca-4180-9a7f-d3a01a219b15 service placement] Successfully deleted allocations AllocationList[Allocation(consumer_id=87329b8a-0fa5-4467-b69e-6c43d4633f54,id=<?>,project_id=<?>,resource_class='MEMORY_MB',resource_provider=ResourceProvider(ec18595d-7007-47c9-bf13-38c8cf8a8bb0),used=64,user_id=<?>), Allocation(consumer_id=87329b8a-0fa5-4467-b69e-6c43d4633f54,id=<?>,project_id=<?>,resource_class='VCPU',resource_provider=ResourceProvider(ec18595d-7007-47c9-bf13-38c8cf8a8bb0),used=1,user_id=<?>)] {{(pid=15780) delete_allocations /opt/stack/new/nova/nova/api/openstack/placement/handlers/allocation.py:307}}

Looking at the compute service logs, the update_available_resource periodic task on the compute is running in the window between the scheduler creating the allocations against the compute node and the instance being created in the cell database:

http://logs.openstack.org/08/516708/3/gate/legacy-tempest-dsvm-py35/4d8d6a3/logs/screen-n-cpu.txt.gz#_Oct_31_23_18_00_165850

http://logs.openstack.org/08/516708/3/gate/legacy-tempest-dsvm-py35/4d8d6a3/logs/screen-n-cpu.txt.gz#_Oct_31_23_18_00_780729

Oct 31 23:18:00.780729 ubuntu-xenial-rax-iad-0000635032 nova-compute[23251]: DEBUG nova.compute.resource_tracker [None req-9743bef5-318d-4e7e-a71b-26c62cd0af2d tempest-MigrationsAdminTest-834796780 tempest-MigrationsAdminTest-834796780] Instance 87329b8a-0fa5-4467-b69e-6c43d4633f54 has been deleted (perhaps locally). Deleting allocations that remained for this instance against this compute host: {'resources': {'VCPU': 1, 'MEMORY_MB': 64}}. {{(pid=23251) _remove_deleted_instances_allocations /opt/stack/new/nova/nova/compute/resource_tracker.py:1180}}
Oct 31 23:18:00.817854 ubuntu-xenial-rax-iad-0000635032 nova-compute[23251]: INFO nova.scheduler.client.report [None req-9743bef5-318d-4e7e-a71b-26c62cd0af2d tempest-MigrationsAdminTest-834796780 tempest-MigrationsAdminTest-834796780] Deleted allocation for instance 87329b8a-0fa5-4467-b69e-6c43d4633f54

That's because of this code that gets an InstanceNotFound and assumes the instance was deleted:

https://github.com/openstack/nova/blob/f974e3c3566f379211d7fdc790d07b5680925584/nova/compute/resource_tracker.py#L1169-L1181

The compute service shouldn't have access to the nova_api database, so we can't check whether the instance is still represented as a build request; we'll have to redo this deleted-instance logic in the resource tracker to fix this race bug.
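The racy logic can be sketched as follows. This is an illustrative simplification, not nova's actual code: the function name, the allocations dict, and the instance_exists_in_cell callback are hypothetical stand-ins for the resource tracker's real cell database lookup (which raises InstanceNotFound) and placement client calls.

```python
# Hypothetical sketch of the racy cleanup described above; names and
# signatures are simplified stand-ins for nova's resource tracker code.

def remove_deleted_instances_allocations(allocations, instance_exists_in_cell):
    """Delete allocations whose consumer is not found in the cell database.

    The buggy assumption: "not found in the cell DB" is treated as
    "deleted". But a freshly scheduled instance already has allocations
    in Placement *before* its row is created in the cell database, so
    this can delete allocations for an instance that is still building.
    """
    removed = []
    for consumer_uuid in list(allocations):
        if not instance_exists_in_cell(consumer_uuid):
            # BUG: the instance may simply not exist in the cell *yet*.
            del allocations[consumer_uuid]
            removed.append(consumer_uuid)
    return removed
```

In the failure above, the periodic task ran in exactly that window, so the instance's freshly created allocations were removed before the resize could find them.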

Revision history for this message
Matt Riedemann (mriedem) wrote :

Note that this would eventually be mitigated if there is at least one compute < Pike in the deployment, since then the resource tracker will continue to create allocations in Placement during the update_available_resource periodic task. However, that won't help anyone doing a fresh install of Pike, or who has upgraded all of their computes to Pike, since we stop doing the auto-healing of allocations at that point.

Changed in nova:
assignee: nobody → Dan Smith (danms)
Changed in nova:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/517108

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/517009
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d176175db456e4a312eed37ed4f5adcc06292fd5
Submitter: Zuul
Branch: master

commit d176175db456e4a312eed37ed4f5adcc06292fd5
Author: Dan Smith <email address hidden>
Date: Wed Nov 1 08:49:07 2017 -0700

    Avoid deleting allocations for instances being built

    The resource tracker's _remove_deleted_instances_allocations() assumes that
    InstanceNotFound means that an instance was deleted. That's not quite accurate,
    as we would also see that in the window between creating allocations and actually
    creating the instance in the cell database. So, the code now will kill
    allocations for those instances before they are created.

    This change makes us look up the instance with read_deleted=yes, and if we find
    it with deleted=True, then we do the allocation removal. This does mean that
    someone running a full DB archive at the instant an instance is deleted in some
    way that didn't result in allocation removal as well could leak those. However,
    we can log that (unlikely) situation.

    Closes-Bug: #1729371

    Change-Id: I4482ac2ecf8e07c197fd24c520b7f11fd5a10945
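The decision logic this commit describes can be sketched roughly as below. This is a hedged simplification, not the actual patch: lookup_instance() is a hypothetical stand-in for the instance query with read_deleted='yes' against the cell database, and FakeInstance just models the deleted flag.

```python
from collections import namedtuple

# Minimal stand-in for an instance record with a 'deleted' flag.
FakeInstance = namedtuple('FakeInstance', ['deleted'])

def should_remove_allocation(lookup_instance, consumer_uuid):
    """Return True only when the instance is positively known deleted.

    lookup_instance() models a read_deleted='yes' query: it returns a
    record with a .deleted flag, or None if no row exists at all (e.g.
    the row was archived, or the instance is still in the window
    between scheduling and creation in the cell database).
    """
    instance = lookup_instance(consumer_uuid)
    if instance is None:
        # Could be the build-in-progress window, or a fully archived
        # row whose allocation leaked; log and leave the allocation.
        return False
    return bool(instance.deleted)
```

The key change is that "no row found" no longer implies "deleted"; only an instance row explicitly marked deleted triggers allocation removal.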

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/517879

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/pike)

Reviewed: https://review.openstack.org/517879
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=70ef1fa159c9941129827dbe93d4acc718fdb7ad
Submitter: Zuul
Branch: stable/pike

commit 70ef1fa159c9941129827dbe93d4acc718fdb7ad
Author: Dan Smith <email address hidden>
Date: Wed Nov 1 08:49:07 2017 -0700

    Avoid deleting allocations for instances being built

    The resource tracker's _remove_deleted_instances_allocations() assumes that
    InstanceNotFound means that an instance was deleted. That's not quite accurate,
    as we would also see that in the window between creating allocations and actually
    creating the instance in the cell database. So, the code now will kill
    allocations for those instances before they are created.

    This change makes us look up the instance with read_deleted=yes, and if we find
    it with deleted=True, then we do the allocation removal. This does mean that
    someone running a full DB archive at the instant an instance is deleted in some
    way that didn't result in allocation removal as well could leak those. However,
    we can log that (unlikely) situation.

    Closes-Bug: #1729371

    Conflicts:
          nova/compute/resource_tracker.py
          nova/tests/unit/compute/test_resource_tracker.py

    NOTE(mriedem): Conflicts were due to not having change
    1ff1310683c647517aa825059e4e6ad706e3da19 or change
    e3b7f43e3ae9689967d2ecd1cd7974d829c78c85 in Pike.

    Change-Id: I4482ac2ecf8e07c197fd24c520b7f11fd5a10945
    (cherry picked from commit d176175db456e4a312eed37ed4f5adcc06292fd5)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.0.3

This issue was fixed in the openstack/nova 16.0.3 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.0.0b2

This issue was fixed in the openstack/nova 17.0.0.0b2 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/517108
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=bf92b72ce9307e2faf220a75ca858289cf46406c
Submitter: Zuul
Branch: master

commit bf92b72ce9307e2faf220a75ca858289cf46406c
Author: Matt Riedemann <email address hidden>
Date: Wed Nov 1 17:04:45 2017 -0400

    Time how long select_destinations() takes in conductor

    When we see weird bugs due to race windows, like bug 1729371,
    it would be good to know if we're spending an inordinate amount
    of time in the scheduler selecting destinations because that
    can have a real effect on things happening before the instance
    is created in a cell during the initial create.

    Change-Id: I395ac34189c9fc010a37d4a73fac151d0f625ba2
    Related-Bug: #1729371
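The idea of this related change, measuring how long the scheduler call takes, can be sketched generically as follows; this is not nova's code, just an illustrative timing wrapper.

```python
import time

def timed_call(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds).

    A long elapsed time around a call like select_destinations()
    widens the window in which the update_available_resource periodic
    task can race with instance creation in the cell.
    """
    start = time.monotonic()
    result = fn(*args, **kwargs)
    return result, time.monotonic() - start
```

Logging the elapsed value alongside the request makes it possible to correlate slow scheduling with occurrences of the race.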

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/rocky)

Related fix proposed to branch: stable/rocky
Review: https://review.openstack.org/608575

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/rocky)

Reviewed: https://review.openstack.org/608575
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8e463aa719be5446bb3cff7263274bf7d911aac0
Submitter: Zuul
Branch: stable/rocky

commit 8e463aa719be5446bb3cff7263274bf7d911aac0
Author: Matt Riedemann <email address hidden>
Date: Wed Nov 1 17:04:45 2017 -0400

    Time how long select_destinations() takes in conductor

    When we see weird bugs due to race windows, like bug 1729371,
    it would be good to know if we're spending an inordinate amount
    of time in the scheduler selecting destinations because that
    can have a real effect on things happening before the instance
    is created in a cell during the initial create.

    Change-Id: I395ac34189c9fc010a37d4a73fac151d0f625ba2
    Related-Bug: #1729371
    (cherry picked from commit bf92b72ce9307e2faf220a75ca858289cf46406c)

tags: added: in-stable-rocky
melanie witt (melwitt)
no longer affects: nova/rocky