Rescheduling an instance leaves it in a scheduling state and never succeeds

Bug #1628530 reported by Andrew Laski
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
OpenStack Compute (nova) | Fix Released | Critical | Andrew Laski |
Newton | Fix Committed | Undecided | Matt Riedemann |

Bug Description

mamandle reported to me on IRC that commit https://github.com/openstack/nova/commit/f577f650c7ca9d8dd66eaec919e4805c09d16f6d broke reschedules. Looking at it, I see the issue: on a reschedule there is no BuildRequest object in the db, so the logic that checks for one stops the build attempt from progressing any further. That code is in place to handle the case where a delete happens during the build process, but it did not properly account for reschedules.

In the case of a reschedule, instance.launched_on will be set, so we can bypass the BuildRequest lookup. Since the BuildRequest was already deleted during the first scheduling pass, it's okay to do that.
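To make the proposal above concrete, here is a minimal, self-contained sketch of the check. It is not nova code: `FakeInstance`, `should_continue_build`, and the dict standing in for the BuildRequest table are hypothetical names used only for illustration. Only `launched_on` and the BuildRequest concept come from this report.

```python
class FakeInstance:
    """Hypothetical stand-in for an instance record."""

    def __init__(self, uuid, launched_on=None):
        self.uuid = uuid
        # Set once the instance has been sent to a host at least once.
        self.launched_on = launched_on


def should_continue_build(instance, build_requests):
    """Return True if the build attempt should proceed.

    build_requests is a dict keyed by instance uuid that stands in for
    the BuildRequest table (an assumption made for this sketch).
    """
    if instance.launched_on:
        # Reschedule: the BuildRequest was removed on the first pass,
        # so its absence does not mean the instance was deleted.
        return True
    if instance.uuid not in build_requests:
        # First attempt with no BuildRequest: a delete raced the build.
        return False
    # First attempt: consume the BuildRequest and carry on.
    del build_requests[instance.uuid]
    return True
```

With this shape, a first attempt consumes the BuildRequest, a later attempt with `launched_on` set skips the lookup, and a first attempt with no BuildRequest is still treated as a delete.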

Andrew Laski (alaski)
Changed in nova:
importance: Undecided → Critical
tags: added: newton-rc-potential
Changed in nova:
assignee: nobody → Andrew Laski (alaski)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/378636

Changed in nova:
status: New → In Progress
Matt Riedemann (mriedem)
tags: added: cells
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/378951

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/378636
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9b090aeb7e2d4e4adc0b2a80402cbfb09830bd94
Submitter: Jenkins
Branch: master

commit 9b090aeb7e2d4e4adc0b2a80402cbfb09830bd94
Author: Andrew Laski <email address hidden>
Date: Wed Sep 28 09:47:12 2016 -0400

    Ignore BuildRequest during an instance reschedule

    When booting an instance there is logic in the conductor to check if a
    delete has been issued. This is done by looking for a BuildRequest
    object and discontinuing the build if it's not found. However the
    conductor then deletes the BuildRequest so a reschedule attempt will not
    find the BuildRequest object. This incorrectly stops the reschedule.

    The filter_properties dict is updated with the number of scheduling
    attempts for each reschedule so by looking at the value found there we
    know if a reschedule is being attempted. If that's the case then bypass
    the logic that checks for, and deletes, the BuildRequest object.

    Change-Id: Ibf28d1d8f54703b465ccc497281419356cd0136e
    Closes-Bug: 1628530
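
The approach described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged patch: the `retry` and `num_attempts` keys are assumed names for where the attempt count lives in `filter_properties`, and `is_reschedule`, `conductor_build`, and the set standing in for stored BuildRequests are hypothetical.

```python
def is_reschedule(filter_properties):
    """Return True when the attempt counter shows a second or later pass.

    The 'retry' / 'num_attempts' keys are assumptions for this sketch.
    """
    retry = filter_properties.get("retry") or {}
    return retry.get("num_attempts", 1) > 1


def conductor_build(instance_uuid, filter_properties, build_requests):
    """Decide whether to 'build' or 'abort' for one instance.

    build_requests is a set of instance uuids that still have a
    BuildRequest (a stand-in for the database table).
    """
    if not is_reschedule(filter_properties):
        if instance_uuid not in build_requests:
            # No BuildRequest on the first attempt: a delete was issued.
            return "abort"
        # The conductor deletes the BuildRequest once it has checked it.
        build_requests.discard(instance_uuid)
    # On a reschedule the check and the delete are both bypassed.
    return "build"
```

The design point is that the attempt counter is already carried in the request, so no extra lookup is needed to tell a reschedule apart from a delete.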

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/newton)

Reviewed: https://review.openstack.org/378951
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d81fcf80323fca1c83aa4a4fd97f39b66315c935
Submitter: Jenkins
Branch: stable/newton

commit d81fcf80323fca1c83aa4a4fd97f39b66315c935
Author: Andrew Laski <email address hidden>
Date: Wed Sep 28 09:47:12 2016 -0400

    Ignore BuildRequest during an instance reschedule

    When booting an instance there is logic in the conductor to check if a
    delete has been issued. This is done by looking for a BuildRequest
    object and discontinuing the build if it's not found. However the
    conductor then deletes the BuildRequest so a reschedule attempt will not
    find the BuildRequest object. This incorrectly stops the reschedule.

    The filter_properties dict is updated with the number of scheduling
    attempts for each reschedule so by looking at the value found there we
    know if a reschedule is being attempted. If that's the case then bypass
    the logic that checks for, and deletes, the BuildRequest object.

    Change-Id: Ibf28d1d8f54703b465ccc497281419356cd0136e
    Closes-Bug: 1628530
    (cherry picked from commit 9b090aeb7e2d4e4adc0b2a80402cbfb09830bd94)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 14.0.0.0rc2

This issue was fixed in the openstack/nova 14.0.0.0rc2 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.0.0.0b1

This issue was fixed in the openstack/nova 15.0.0.0b1 development milestone.
