schedule_and_build_instances short-circuits on all instances if one build request is already deleted

Bug #1661024 reported by Matt Riedemann
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
High
Matt Riedemann

Bug Description

The 'return' statement here should be a 'continue':

https://github.com/openstack/nova/blob/d308d1f70e7e840b1b0c8f4307998d89f9a5ddff/nova/conductor/manager.py#L950

That's in a block of code that's cleaning up an instance recently created if the build request was already deleted by the time conductor tried to delete the build request, i.e. the user deleted the instance before it was created (which actually deleted the build request in nova-api).

The return is wrong though since we're in a loop over build_requests, so if we hit that, and there are more build requests to process, those other instances won't get built.

It's easy to miss this context because the method is so large. We should break the build request cleanup code into a separate private helper method.

Tags: conductor
Matt Riedemann (mriedem)
Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/427839

Changed in nova:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/427839
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f09d11269a76b04dacd0f1425ae637490d6ddca9
Submitter: Jenkins
Branch: master

commit f09d11269a76b04dacd0f1425ae637490d6ddca9
Author: Matt Riedemann <email address hidden>
Date: Wed Feb 1 13:02:45 2017 -0500

    Continue processing build requests even if one is gone already

    There was a very subtle yet busted 'return' statement in the
    except block when cleaning up an instance in the case that the
    build request was already deleted. This return statement would
    short-circuit the for loop it's in such that no other build
    requests in the loop would get processed (so those instances
    wouldn't get built).

    This was probably missed because of how large the method is so
    that when you're looking at that cleanup code, it's easy to miss
    that you're in a for loop.

    This change moves the build request cleanup block into a private
    method so it's more self-contained, and fixes the issue with the
    return statement by changing it to a 'continue' statement.

    An existing test that deals with multiple instances already is
    updated to show the bug and the fix (and also cleaned up a bit
    in the process to avoid lots of copy/paste).

    Change-Id: I399023ea705c514c33d07cc3613d79744cbf7a07
    Closes-Bug: #1661024

Changed in nova:
status: In Progress → Fix Released
Matt Riedemann (mriedem)
tags: removed: ocata-rc-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.0.0.0rc1

This issue was fixed in the openstack/nova 15.0.0.0rc1 release candidate.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.