AllocationList.delete_all() incorrectly assumes a single consumer

Bug #1781430 reported by Jay Pipes
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
High
Jay Pipes

Bug Description

AllocationList.delete_all() looks like this:

```
    def delete_all(self):
        # Allocations can only have a single consumer, so take advantage of
        # that fact and do an efficient batch delete
        consumer_uuid = self.objects[0].consumer.uuid
        _delete_allocations_for_consumer(self._context, consumer_uuid)
        consumer_obj.delete_consumers_if_no_allocations(
            self._context, [consumer_uuid])
```

the problem with the above is that it is based on an old assumption: that a list of allocations will only ever involve a single consumer. This hasn't been the case ever since we introduced the ability to do `POST /allocations` which was 1.12 or 1.13 IIRC.

The safety concern about the above is that if someone in code does this:

```
allocs = AllocationList.get_all_by_resource_provider(self.ctx, compute_node_provider)
allocs.delete_all()
```

they would reasonable expect all of the allocations for a provider to be deleted. However, this is not the case. Only the allocations against the "first" consumer will be deleted.

The fix is simple... check to see if there are >1 consumers in the allocation list's objects and if so, don't call _delete_allocations_for_consumer(). Instead, call _delete_allocations_by_id() and do a DELETE FROM allocations WHERE id IN (...).

Tags: placement
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/582382

Changed in nova:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/582382
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=11c29ae4702ac4f50baa9fe3a29e3e413ad69b09
Submitter: Zuul
Branch: master

commit 11c29ae4702ac4f50baa9fe3a29e3e413ad69b09
Author: Jay Pipes <email address hidden>
Date: Thu Jul 12 14:57:35 2018 -0400

    do not assume 1 consumer in AllocList.delete_all()

    Ever since we introduced support for setting multiple consumers in a
    single POST /allocations, the AllocationList.delete_all() method has
    been housing a latent bad assumption and bug.

    The AllocationList.delete_all() method used to assume that the
    AllocationList's Allocation objects were only ever for a single
    consumer, and took a shortcut in deleting the allocation by deleting all
    allocations with the "first" Allocation's consumer UUID:

    ```python
        def delete_all(self):
            # Allocations can only have a single consumer, so take advantage of
            # that fact and do an efficient batch delete
            consumer_uuid = self.objects[0].consumer.uuid
            _delete_allocations_for_consumer(self._context, consumer_uuid)
            consumer_obj.delete_consumers_if_no_allocations(
                self._context, [consumer_uuid])
    ```

    The problem with the above is that if you get all the allocations for a
    single resource provider, using
    AllocationList.get_all_by_resource_provider() and there are more than
    one consumer allocating resources against that provider, then calling
    AllocationList.delete_all() will only delete *some* of the resource
    provider's allocations, not all of them.

    Luckily, the handler code has never used AllocationList.delete_all()
    after calling AllocationList.get_all_by_resource_provider(), and so
    we've not hit this latent bug in production.

    However, in the next patch in this series (the reshaper DB work), we
    *do* call AllocationList.delete_all() for allocation lists for each
    provider involved in the reshape operation, which is why this fix is
    important to get done correctly.

    Note that this patch renames AllocationList.create_all() to
    AllocationList.replace_all() to make it absolutely clear that all of
    the allocations for all consumers in the list are first *deleted* by the
    codebase and then re-created. We also remove the check in
    AllocationList.create_all() that the Allocation objects in the list must
    not have an 'id' field set. The reason for that is because in order to
    properly implement AllocationList.delete_all() to call DELETE FROM
    allocations WHERE id IN (<...>) we need the list of allocation record
    internal IDs. These id field values are now properly set on the
    Allocation objects when AllocationList.get_all_by_resource_provider()
    and AllocationList.get_all_by_consumer_id() are called. This allows that
    returned object to have delete_all() called on it and the DELETE
    statement to work properly.

    Change-Id: I12393b033054683bcc3e6f20da14e6243b4d5577
    Closes-bug: #1781430

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.0.0.0b3

This issue was fixed in the openstack/nova 18.0.0.0b3 development milestone.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.