Rebuild to the same host with a different image erroneously results in a Claim

Bug #1750618 reported by Chris Friesen on 2018-02-20
This bug affects 4 people
Affects                    Importance   Assigned to
OpenStack Compute (nova)   High         Matt Riedemann
Ocata                      High         Tony Breeds
Pike                       High         Matt Riedemann
Queens                     High         Matt Riedemann

Bug Description

As of stable/pike if we do a rebuild-to-same-node with a new image, it results in ComputeManager.rebuild_instance() being called with "scheduled_node=<hostname>" and "recreate=False". This results in a new Claim, which seems wrong since we're not changing the flavor and that claim could fail if the compute node is already full.

The comments in ComputeManager.rebuild_instance() make it appear that it expects both "recreate" and "scheduled_node" to be None for the rebuild-to-same-host case otherwise it will do a Claim. However, if we rebuild to a different image it ends up going through the scheduler which means that "scheduled_node" is not None.
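The pre-fix decision can be sketched roughly as follows. This is a hypothetical simplification, not the actual nova code; the function name and signature are illustrative. The point is that conductor started passing a non-None scheduled_node for a same-host rebuild with a new image (because the scheduler filters were re-run), which tripped the claim path:

```python
# Hypothetical simplification of the pre-fix decision in
# ComputeManager.rebuild_instance() (illustrative, not nova's code).
# A claim was attempted whenever conductor passed a scheduled_node,
# even though a same-host rebuild never changes the flavor.

def should_claim_pre_fix(recreate, scheduled_node):
    """Pre-fix behavior: claim if evacuating OR if the scheduler ran."""
    return recreate or scheduled_node is not None

# Same-host rebuild with a new image: conductor re-runs the scheduler,
# so scheduled_node is set and a needless claim is attempted.
print(should_claim_pre_fix(recreate=False, scheduled_node="compute-1"))  # True

# Plain same-host rebuild with the same image: no scheduler run, no claim.
print(should_claim_pre_fix(recreate=False, scheduled_node=None))  # False
```

The needless claim is what can fail on an already-full compute node even though no additional resources are actually being consumed.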

Matt Riedemann (mriedem) on 2018-02-20
tags: added: rebuild
Matt Riedemann (mriedem) wrote:

This is a regression introduced with change I11746d1ea996a0f18b7c54b4c9c21df58cc4714b which was backported all the way to stable/newton upstream:

https://review.openstack.org/#/q/I11746d1ea996a0f18b7c54b4c9c21df58cc4714b

Changed in nova:
importance: Undecided → High
status: New → Triaged

Fix proposed to branch: master
Review: https://review.openstack.org/546268

Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: Triaged → In Progress

Does this extra Claim affect only the rebuild action, or can it leak and affect the project quota or the compute capacity after the rebuild is completed?

Chris Friesen (cbf123) wrote:

It looks like it's mostly only affecting the rebuild action. In the compute_nodes table in the nova DB I'm seeing "memory_mb_used" at 1024 when it should be 512, but the CPU/disk usage is where it should be, so I'm not sure what's going on.

Chris Friesen (cbf123) wrote:

Actually, it's showing as consuming 512MB under "memory_mb_used" even when the instance is gone, so I think this might be intentional.

Reviewed: https://review.openstack.org/546268
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=a39029076c7997236a7f999682fb1e998c474204
Submitter: Zuul
Branch: master

commit a39029076c7997236a7f999682fb1e998c474204
Author: Matt Riedemann <email address hidden>
Date: Tue Feb 20 13:48:12 2018 -0500

    Only attempt a rebuild claim for an evacuation to a new host

    Change I11746d1ea996a0f18b7c54b4c9c21df58cc4714b changed the
    behavior of the API and conductor when rebuilding an instance
    with a new image such that the image is run through the scheduler
    filters again to see if it will work on the existing host that
    the instance is running on.

    As a result, conductor started passing 'scheduled_node' to the
    compute which was using it for logic to tell if a claim should be
    attempted. We don't need to do a claim for a rebuild since we're
    on the same host.

    This removes the scheduled_node logic from the claim code, as we
    should only ever attempt a claim if we're evacuating, which we
    can determine based on the 'recreate' parameter.

    Change-Id: I7fde8ce9dea16679e76b0cb2db1427aeeec0c222
    Closes-Bug: #1750618

Changed in nova:
status: In Progress → Fix Released
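The corrected decision from the commit above can be sketched like this. Again a hypothetical simplification rather than the actual nova code: after the fix, scheduled_node no longer factors into whether a claim is attempted; only the 'recreate' (evacuate) flag does.

```python
# Hypothetical simplification of the post-fix decision (illustrative,
# not nova's code). A claim is attempted only for an evacuation to a
# new host, signalled by recreate=True; scheduled_node is ignored for
# the claim decision.

def should_claim_post_fix(recreate, scheduled_node=None):
    """Post-fix behavior: claim only when evacuating."""
    return recreate

# Same-host rebuild with a new image: scheduler ran, but no claim.
print(should_claim_post_fix(recreate=False, scheduled_node="compute-1"))  # False

# Evacuation to a new host: claim is attempted as before.
print(should_claim_post_fix(recreate=True, scheduled_node="compute-2"))  # True
```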

Reviewed: https://review.openstack.org/550545
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3c5e519a8875a9766a0d3a06cb76cceba26634e6
Submitter: Zuul
Branch: stable/queens

commit 3c5e519a8875a9766a0d3a06cb76cceba26634e6
Author: Matt Riedemann <email address hidden>
Date: Tue Feb 20 13:48:12 2018 -0500

    Only attempt a rebuild claim for an evacuation to a new host

    Change I11746d1ea996a0f18b7c54b4c9c21df58cc4714b changed the
    behavior of the API and conductor when rebuilding an instance
    with a new image such that the image is run through the scheduler
    filters again to see if it will work on the existing host that
    the instance is running on.

    As a result, conductor started passing 'scheduled_node' to the
    compute which was using it for logic to tell if a claim should be
    attempted. We don't need to do a claim for a rebuild since we're
    on the same host.

    This removes the scheduled_node logic from the claim code, as we
    should only ever attempt a claim if we're evacuating, which we
    can determine based on the 'recreate' parameter.

    Change-Id: I7fde8ce9dea16679e76b0cb2db1427aeeec0c222
    Closes-Bug: #1750618
    (cherry picked from commit a39029076c7997236a7f999682fb1e998c474204)

Reviewed: https://review.openstack.org/550555
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9890f3f69622489a0fd57cb1df354d7aa60161e0
Submitter: Zuul
Branch: stable/pike

commit 9890f3f69622489a0fd57cb1df354d7aa60161e0
Author: Matt Riedemann <email address hidden>
Date: Tue Feb 20 13:48:12 2018 -0500

    Only attempt a rebuild claim for an evacuation to a new host

    Change I11746d1ea996a0f18b7c54b4c9c21df58cc4714b changed the
    behavior of the API and conductor when rebuilding an instance
    with a new image such that the image is run through the scheduler
    filters again to see if it will work on the existing host that
    the instance is running on.

    As a result, conductor started passing 'scheduled_node' to the
    compute which was using it for logic to tell if a claim should be
    attempted. We don't need to do a claim for a rebuild since we're
    on the same host.

    This removes the scheduled_node logic from the claim code, as we
    should only ever attempt a claim if we're evacuating, which we
    can determine based on the 'recreate' parameter.

    Conflicts:
          nova/compute/manager.py

    NOTE(mriedem): The conflict is due to change
    I0883c2ba1989c5d5a46e23bcbcda53598707bcbc in Queens.

    Change-Id: I7fde8ce9dea16679e76b0cb2db1427aeeec0c222
    Closes-Bug: #1750618
    (cherry picked from commit a39029076c7997236a7f999682fb1e998c474204)
    (cherry picked from commit 3c5e519a8875a9766a0d3a06cb76cceba26634e6)

This issue was fixed in the openstack/nova 17.0.2 release.

This issue was fixed in the openstack/nova 16.1.1 release.

Reviewed: https://review.openstack.org/550560
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=286dd2c23c7c361c6538be65941e3c19e83e6d52
Submitter: Zuul
Branch: stable/ocata

commit 286dd2c23c7c361c6538be65941e3c19e83e6d52
Author: Matt Riedemann <email address hidden>
Date: Tue Feb 20 13:48:12 2018 -0500

    Only attempt a rebuild claim for an evacuation to a new host

    Change I11746d1ea996a0f18b7c54b4c9c21df58cc4714b changed the
    behavior of the API and conductor when rebuilding an instance
    with a new image such that the image is run through the scheduler
    filters again to see if it will work on the existing host that
    the instance is running on.

    As a result, conductor started passing 'scheduled_node' to the
    compute which was using it for logic to tell if a claim should be
    attempted. We don't need to do a claim for a rebuild since we're
    on the same host.

    This removes the scheduled_node logic from the claim code, as we
    should only ever attempt a claim if we're evacuating, which we
    can determine based on the 'recreate' parameter.

    Conflicts:
          nova/tests/functional/test_servers.py

    NOTE(mriedem): test_rebuild_with_new_image does not exist in
    Ocata and does not apply to Ocata since it is primarily
    testing allocations getting created in Placement via the
    FilterScheduler, which was new in Pike. As a result the change
    to that test is not part of this backport, but a similar assertion
    is added to an existing rebuild unit test.

    Change-Id: I7fde8ce9dea16679e76b0cb2db1427aeeec0c222
    Closes-Bug: #1750618
    (cherry picked from commit a39029076c7997236a7f999682fb1e998c474204)
    (cherry picked from commit 3c5e519a8875a9766a0d3a06cb76cceba26634e6)
    (cherry picked from commit 9890f3f69622489a0fd57cb1df354d7aa60161e0)

This issue was fixed in the openstack/nova 18.0.0.0b1 development milestone.

This issue was fixed in the openstack/nova 15.1.1 release.
