making new allocations for one consumer against multiple resource providers fails with 409

Bug #1778576 reported by Chris Dent
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: Jay Pipes

Bug Description

If you PUT some allocations for a new consumer (thus no generation), and those allocations are against more than one resource provider, a 409 failure will happen with:

consumer generation conflict - expected 0 but got None
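
For illustration, a request that triggers this might look like the sketch below. The body layout is my assumption of the allocation format that carries a consumer generation (placement microversion 1.28, as far as I recall), and every UUID and resource amount is made up; nothing here is copied from the failing request.

```python
import json
import uuid

# All identifiers below are hypothetical placeholders.
consumer_uuid = str(uuid.uuid4())
compute_rp = str(uuid.uuid4())       # e.g. a compute node provider
shared_disk_rp = str(uuid.uuid4())   # e.g. a shared disk provider

payload = {
    # Two resource providers in one request: the case that fails.
    "allocations": {
        compute_rp: {"resources": {"VCPU": 1, "MEMORY_MB": 512}},
        shared_disk_rp: {"resources": {"DISK_GB": 10}},
    },
    # None serializes to JSON null, meaning "I expect this consumer
    # not to exist yet".
    "consumer_generation": None,
    "project_id": str(uuid.uuid4()),
    "user_id": str(uuid.uuid4()),
}

body = json.dumps(payload, indent=2)
print("PUT /allocations/%s" % consumer_uuid)
print(body)
```

With only one entry under "allocations" the same request succeeds, which is why the single-provider tests did not catch this.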

This happens because _new_allocations in handlers/allocation.py always passes the generation from the incoming data to util.ensure_consumer. That works for the first resource provider, but by the time we reach the second one the consumer already exists, so the expected generation is now 0 rather than null.

One possible fix (already in progress) is to use the generation from new_allocations[0].consumer.generation in subsequent trips round the loop calling _new_allocations.
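
A toy model of the problem and of that possible fix, for anyone following along. This is not the nova code: the Consumer class, the dict used as a store, and both put_allocations_* functions are hypothetical stand-ins, and ensure_consumer here only imitates the generation check described above.

```python
class Conflict(Exception):
    """Stand-in for the 409 response."""


class Consumer:
    def __init__(self, uuid):
        self.uuid = uuid
        self.generation = 0  # a freshly created consumer starts at 0


def ensure_consumer(store, uuid, requested_generation):
    """Create the consumer if missing, else check the generation."""
    consumer = store.get(uuid)
    if consumer is None:
        if requested_generation is not None:
            raise Conflict(
                "consumer generation conflict - expected null but got %s"
                % requested_generation)
        consumer = store[uuid] = Consumer(uuid)
        return consumer
    if consumer.generation != requested_generation:
        raise Conflict(
            "consumer generation conflict - expected %s but got %s"
            % (consumer.generation, requested_generation))
    return consumer


def put_allocations_buggy(store, consumer_uuid, rp_uuids, generation):
    # Reuses the incoming generation for every resource provider.
    for _rp in rp_uuids:
        ensure_consumer(store, consumer_uuid, generation)


def put_allocations_fixed(store, consumer_uuid, rp_uuids, generation):
    # After the first trip round the loop, use the generation of the
    # consumer we already have instead of the incoming one.
    consumer = None
    for _rp in rp_uuids:
        gen = generation if consumer is None else consumer.generation
        consumer = ensure_consumer(store, consumer_uuid, gen)
    return consumer


try:
    put_allocations_buggy({}, "consumer-1", ["rp-1", "rp-2"], None)
except Conflict as exc:
    print(exc)  # consumer generation conflict - expected 0 but got None

print(put_allocations_fixed({}, "consumer-1", ["rp-1", "rp-2"], None).generation)
```

The buggy variant passes with a single resource provider and only fails once a second one is in the loop, which matches what I saw.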

I guess we must have missed some test cases. I'll make sure to add some when working on this. I found the problem with my placecat stuff.

Tags: placement
Matt Riedemann (mriedem)
Changed in nova:
status: New → Triaged
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/577914

Changed in nova:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/577915

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/577914
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6b56bba8edab4a324280be4093f7370892a1712c
Submitter: Zuul
Branch: master

commit 6b56bba8edab4a324280be4093f7370892a1712c
Author: Chris Dent <email address hidden>
Date: Mon Jun 25 20:25:16 2018 +0100

    [placement] Demonstrate bug in consumer generation handling

    When PUTting allocations to a non-existent consumer, the code
    expects the consumer generation to be 'null'. When there is only
    one resource provider in the set of allocations sent, this works
    well.

    Unfortunately when there is more than one resource provider the
    loop which calls _new_allocations still uses the 'null' generation
    but now the code expects the generation to be 0, because the
    consumer now exists.

    This patch provides a gabbi test that demonstrates the problem
    by creating a shared disk resource provider and associating it
    as an aggregate with the existing RP_UUID. A GET
    /allocation_candidates is made and a single allocation_request is
    returned including both resource providers. A test with a new
    consumer is marked as an xfail.

    In the next patch the problem will be fixed.

    Change-Id: I1238f82dfc46fd9dae126abfd16ce3dd1506e991
    Related-Bug: #1778576

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Chris Dent (<email address hidden>) on branch: master
Review: https://review.openstack.org/577915
Reason: there are other, more complete fixes for this in progress

Eric Fried (efried) wrote :
Changed in nova:
assignee: Chris Dent (cdent) → Jay Pipes (jaypipes)
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/579921
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c64198182646e0f264951dc24ed3ef98a297e888
Submitter: Zuul
Branch: master

commit c64198182646e0f264951dc24ed3ef98a297e888
Author: Jay Pipes <email address hidden>
Date: Tue Jul 3 12:48:34 2018 -0400

    placement: delete auto-created consumers on fail

    When we fail to create allocations for new consumers (either when
    issuing a PUT /allocations/{new_consumer_uuid} or a POST /allocations
    where the payload includes a new consumer UUID), we need to ensure that
    we delete the Consumer object and underlying record in the consumers
    table that gets auto-created before calling AllocationList.create_all().

    This auto-created consumer record is what is used to compare things like
    consumer generation in later calls to PUT|POST /allocations, and this
    phantom consumer record was causing confusion when normal retries (for
    things like 409 Conflict due to concurrent provider or inventory
    updates) would be rejected stating that the expected consumer generation
    was 0 and not null (null being the sentinel that indicates the caller is
    expecting the consumer is a new consumer).

    Change-Id: If37ef318bd5482a2d19928002c6f1fa24932946f
    Closes-bug: #1779725
    Closes-bug: #1778576
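
The cleanup described in this commit message can be sketched roughly as follows. This is a simplified model under stated assumptions, not the merged change: create_allocations, the dict-backed store, and the write_allocations callback are all hypothetical names standing in for the handler, the consumers table, and AllocationList.create_all().

```python
def create_allocations(store, consumer_uuid, write_allocations):
    """Auto-create the consumer, then remove it again if the write fails."""
    created_here = consumer_uuid not in store
    if created_here:
        # Auto-created consumer record with the initial generation.
        store[consumer_uuid] = {"generation": 0}
    try:
        write_allocations()
    except Exception:
        # Only delete what this request created; a consumer that
        # already existed must survive a failed write.
        if created_here:
            del store[consumer_uuid]
        raise


def failing_write():
    # Stands in for a 409 from a concurrent provider or inventory update.
    raise RuntimeError("concurrent update")


store = {}
try:
    create_allocations(store, "new-consumer", failing_write)
except RuntimeError:
    pass
print("new-consumer" in store)  # False
```

Because no phantom record is left behind, a retry that still sends a null consumer generation is treated as a request for a new consumer again.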

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.0.0.0b3

This issue was fixed in the openstack/nova 18.0.0.0b3 development milestone.
