Bug #1658070 “Failed SR_IOV evacuation with host” : Series ocata : Bugs : OpenStack Compute (nova)

Roman Podoliaka (rpodolyaka) wrote on 2017-01-20:

#1

Kristina, could you please attach a diagnostic snapshot or at the very least nova-compute logs?

Changed in mos:
status:	New → Incomplete
assignee:	MOS Nova (mos-nova) → Kristina Berezovskaia (kkuznetsova)

Kristina Berezovskaia (kkuznetsova) wrote on 2017-01-20:

#2

nova-compute_logs.tar (2.4 MiB, application/x-tar)

The error message for the failed VM:
Port update failed for port cfc947be-1975-42f3-95bd-f261a2eccec0: Unable to correlate PCI slot 0000:81:11.7

in nova code: https://github.com/openstack/nova/blob/065cd6a8d69c1ec862e5b402a3150131f35b2420/nova/network/neutronv2/api.py#L2411

When evacuating with a target host specified, pci_mapping is empty.
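To illustrate why an empty pci_mapping produces this error, here is a minimal sketch. It is not nova's code: `PortUpdateFailed`, `new_pci_slot` and the target address `0000:82:10.3` are hypothetical, and the sketch only assumes that the mapping goes from the PCI address on the source host to the one claimed on the target host.

```python
class PortUpdateFailed(Exception):
    """Hypothetical stand-in for the error quoted above."""


def new_pci_slot(pci_mapping, old_slot):
    # pci_mapping (assumed shape): source-host PCI address -> target-host address.
    new_slot = pci_mapping.get(old_slot)
    if new_slot is None:
        raise PortUpdateFailed("Unable to correlate PCI slot %s" % old_slot)
    return new_slot


# With a populated mapping the port can be pointed at the new device.
populated = {"0000:81:11.7": "0000:82:10.3"}  # hypothetical target address
print(new_pci_slot(populated, "0000:81:11.7"))

# With an empty mapping there is nothing to correlate, so the update fails.
try:
    new_pci_slot({}, "0000:81:11.7")
except PortUpdateFailed as exc:
    print(exc)
```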

Changed in mos:
assignee:	Kristina Berezovskaia (kkuznetsova) → MOS Nova (mos-nova)
status:	Incomplete → New

Denis Meltsaykin (dmeltsaykin) on 2017-01-20

Changed in mos:
status:	New → Confirmed
importance:	Undecided → Medium
milestone:	9.2 → 9.3

Sergey Nikitin (snikitin) on 2017-01-23

Changed in mos:
assignee:	MOS Nova (mos-nova) → Sergey Nikitin (snikitin)

Matt Riedemann (mriedem) wrote on 2017-05-17:

#3

Which release of nova is this?

Changed in mos:
assignee:	Sergey Nikitin (snikitin) → nobody

Matt Riedemann (mriedem) wrote on 2017-05-17:

#4

I guess this would be Mitaka?

Matt Riedemann (mriedem) wrote on 2017-05-17:

#5

Could this be related to bug https://bugs.launchpad.net/nova/+bug/1512880 which was fixed in nova in newton?

Matt Riedemann (mriedem) wrote on 2017-05-17:

#6

It could be that when you specify a host and bypass the scheduler, the target host isn't actually set up properly for the migration (evacuation to the target host), whereas when you don't specify the host, the scheduler picks a proper host based on the filters.

Matt Riedemann (mriedem) wrote on 2017-05-17:

#7

(12:43:57 PM) gyee_: I am curious why we are using NopClaim if host is specified https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2771
(12:44:22 PM) gyee_: does specifying a host implies force evacuate?
(12:49:08 PM) gyee_: meanwhile if SR-IOV is enabled, we are trying to get the PCI mapping via the migration context, which is only populated at rebuild_claim. https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L2462
(12:51:32 PM) mriedem: https://github.com/openstack/nova/commit/dc0221d7240326a2d1b467e2a367bebb7e764e61 added that code in the compute manager about the nop claim, which implies resources were already claimed, but i'd have to dig into that

The rebuild_instance method in the nova compute manager handles both rebuilds on the same host and evacuates to another host. If you force the evacuate to a specific host, we bypass some code in the nova-api and nova-conductor services and call directly into the compute on the target host and hit this code which makes it so we don't do a resource claim (which sets up the pci mappings on the instance.migration_context as part of the move claim):

https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2767-L2771

That ^ code is really there for rebuild operations where we don't need a resource claim because we've already claimed resources on the original host, but it got confused with this forced evacuate to a target host scenario, and we never end up claiming resources on the target host.

I think all we have to do is modify that conditional to be:

if scheduled_node is not None or recreate:
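As a rough illustration of that condition, here is a simplified, self-contained sketch. It is not the compute manager code: `choose_claim`, `rebuild_claim` and `nop_claim` are hypothetical stand-ins for picking between the resource tracker's rebuild claim and the no-op claim.

```python
def rebuild_claim(instance):
    # Stand-in for a real move claim: it records a migration context,
    # which is where the PCI mapping for the target host would live.
    instance["migration_context"] = {"pci_mapping": {}}
    return instance


def nop_claim(instance):
    # Stand-in for the no-op claim: nothing is claimed, so no
    # migration context is created.
    return instance


def choose_claim(scheduled_node, recreate):
    """Pick the claim function using the condition proposed above."""
    if scheduled_node is not None or recreate:
        return rebuild_claim
    return nop_claim


# Rebuild on the same host: resources are already claimed there.
assert choose_claim(None, recreate=False) is nop_claim
# Evacuate through the scheduler: claim on the selected node.
assert choose_claim("node-1", recreate=True) is rebuild_claim
# Forced evacuate to a host (no scheduled_node): still claim.
assert choose_claim(None, recreate=True) is rebuild_claim
```

The third case is the one this bug is about: without the `or recreate` part, a forced evacuate would fall through to the no-op claim.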

Matt Riedemann (mriedem) wrote on 2017-05-17:

#8

The force evacuate to target host change in the API was added in the 2.29 microversion which goes back to newton:

https://github.com/openstack/nova/commit/86706785ff251b841dff3590dc60f6b4834d7b7e

Changed in nova:
status:	New → Triaged

Matt Riedemann (mriedem) wrote on 2017-05-17:

#9

I'm not totally sure how this happens if the --force option isn't specified with the nova evacuate command, because of this code in the API:

https://github.com/openstack/nova/blob/8d492c76d53f3fcfacdd945a277446bdfe6797b0/nova/compute/api.py#L4115

So if a host is specified but force is not, then the API sets a requested destination to that host for the scheduler and nulls out the host value, which is checked in conductor here:

https://github.com/openstack/nova/blob/8d492c76d53f3fcfacdd945a277446bdfe6797b0/nova/conductor/manager.py#L743

Which would call the scheduler to pick a host and set the node variable here:

https://github.com/openstack/nova/blob/8d492c76d53f3fcfacdd945a277446bdfe6797b0/nova/conductor/manager.py#L770

Which gets passed to the compute as the scheduled_node variable which is used to determine if we should do a claim or not.
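The flow described above can be sketched roughly as follows. This is a simplified model, not the real API or conductor code: `api_evacuate`, `conductor_rebuild` and `fake_scheduler` are hypothetical names, and the exact comparison used for force in nova may differ from the `not force` shown here.

```python
def api_evacuate(host, force):
    """Return (host, requested_destination) as handed on to conductor."""
    if host and not force:
        # Host requested but not forced: null out the host and let the
        # scheduler validate it as a requested destination.
        return None, host
    return host, None


def conductor_rebuild(host, requested_destination, scheduler):
    """Return the scheduled_node that the compute service receives."""
    if host:
        # Forced host: the scheduler is bypassed, so no node is selected.
        return None
    return scheduler(requested_destination)


def fake_scheduler(requested_destination):
    # Hypothetical scheduler that simply accepts the requested host.
    return requested_destination + "-node"


# Host given without force: the scheduler picks the node.
host, dest = api_evacuate("compute-2", force=False)
print(conductor_rebuild(host, dest, fake_scheduler))

# Host given with force: scheduled_node ends up as None.
host, dest = api_evacuate("compute-2", force=True)
print(conductor_rebuild(host, dest, fake_scheduler))
```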

Matt Riedemann (mriedem) wrote on 2017-05-17:

#10

Ah, gyee pointed out in IRC that if you're using a microversion < 2.29, then force is passed to the compute API code as None:

https://github.com/openstack/nova/blob/stable/newton/nova/api/openstack/compute/evacuate.py#L92

And then this fails because force is not False, it's None:

https://github.com/openstack/nova/blob/stable/newton/nova/compute/api.py#L3784
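The difference between None and False can be shown with a small sketch. The helper names are hypothetical and this is not the code at the links above; the second helper is only one possible alternative for comparison, not a statement of how nova handles it.

```python
def uses_scheduler_strict(host, force):
    # Identity comparison: only the literal False matches.
    return bool(host) and force is False


def uses_scheduler_lenient(host, force):
    # Truthiness check: None and False are treated alike.
    return bool(host) and not force


# A request older than 2.29 carries no force flag, so it arrives as None.
assert uses_scheduler_strict("compute-2", None) is False
assert uses_scheduler_lenient("compute-2", None) is True
# An explicit force=False behaves the same under both checks.
assert uses_scheduler_strict("compute-2", False) is True
assert uses_scheduler_lenient("compute-2", False) is True
```

With the strict check, a pre-2.29 request that names a host is treated like a forced evacuate, which is how the scheduler ends up being bypassed.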

Changed in nova:
importance:	Undecided → High

Matt Riedemann (mriedem) wrote on 2017-05-17:

#11

The regression was introduced in Newton:

https://github.com/openstack/nova/commit/86706785ff251b841dff3590dc60f6b4834d7b7e

Jay Pipes (jaypipes) wrote on 2017-05-18:

#12

Eli Qiao (taget-9) is looking into this and will push a fix. Assigning to him.

Changed in nova:
assignee:	nobody → Eli Qiao (taget-9)

OpenStack Infra (hudson-openstack) on 2017-05-18

Changed in nova:
status:	Triaged → In Progress

Guang Yee (guang-yee) wrote on 2017-05-18:

#13

Eli Qiao, will you be pushing a patch anytime soon? Otherwise, I can handle it. Please let me know.

Eli Qiao (taget-9) wrote on 2017-05-19:

#14

@guang-yee, I've submitted the patch already: https://review.openstack.org/#/c/465895/
Not sure why Launchpad doesn't link it here.

Jay Pipes (jaypipes) wrote on 2017-05-19:

#15

The original poster's version of Nova was 13.1.2, which is Mitaka. Therefore all the --force stuff and microversions discussion is irrelevant (since microversion 2.29 was added in Newton).

So, the bug here is the migrate routine is not accounting for a change in PCI address on the destination host and is effectively attempting to launch the workload on the destination host with the source host's PCI addresses, which clearly will not work unless, by luck, the destination host has the exact same PCI addresses and those addresses are available.

The solution is to backport the patch that Matt referred to above that fixes this issue in Newton and beyond:

https://bugs.launchpad.net/nova/+bug/1512880

Marking this as a duplicate of 1512880.

Matt Riedemann (mriedem) wrote on 2017-05-19:

#16

Jay, see comment 7, I think there is also a problem here with evacuate when a host is specified:

https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2767-L2771

OpenStack Infra (hudson-openstack) wrote on 2017-05-19: Change abandoned on nova (master)

#17

Change abandoned by Eli Qiao (<email address hidden>) on branch: master
Review: https://review.openstack.org/465895
Reason: Please see the comments on the bug made by Jay

OpenStack Infra (hudson-openstack) wrote on 2017-05-19: Fix proposed to nova (master)

#18

Fix proposed to branch: master
Review: https://review.openstack.org/466143

Changed in nova:
assignee:	Eli Qiao (taget-9) → Guang Yee (guang-yee)

Guang Yee (guang-yee) wrote on 2017-05-25:

#19

I don't understand why this bug is marked as a duplicate of bug 1512880. Evacuation still fails when SR-IOV is enabled and a host is specified. I tested my patch on nodes with SR-IOV enabled and it seems to work fine.

Matt Riedemann (mriedem) wrote on 2017-05-25:

#20

I agree with Guang Yee; via code inspection in the compute manager's rebuild method there is an obvious issue with not doing a rebuild claim during evacuate in certain cases.

OpenStack Infra (hudson-openstack) on 2017-05-25

Changed in nova:
assignee:	Guang Yee (guang-yee) → Matt Riedemann (mriedem)

OpenStack Infra (hudson-openstack) wrote on 2017-05-26: Fix proposed to nova (stable/ocata)

#21

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/468219

OpenStack Infra (hudson-openstack) wrote on 2017-05-26: Fix proposed to nova (stable/newton)

#22

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/468227

Matt Riedemann (mriedem) on 2017-05-26

Changed in nova:
assignee:	Matt Riedemann (mriedem) → Guang Yee (guang-yee)

OpenStack Infra (hudson-openstack) wrote on 2017-05-26: Fix merged to nova (master)

#23

Reviewed: https://review.openstack.org/466143
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=a2b0824aca5cb4a2ae579f625327c51ed0414d35
Submitter: Jenkins
Branch: master

commit a2b0824aca5cb4a2ae579f625327c51ed0414d35
Author: Guang Yee <email address hidden>
Date: Thu May 18 16:38:16 2017 -0700

make sure to rebuild claim on recreate

    On recreate where the instance is being evacuated to a different node,
    we should be rebuilding the claim so the migration context is available
    when rebuilding the instance.

Change-Id: I53bdcf8edf640e97b4632ef7a093f14a6e3845e4
Closes-Bug: 1658070

Changed in nova:
status:	In Progress → Fix Released

Denis Meltsaykin (dmeltsaykin) on 2017-05-29

Changed in mos:
assignee:	nobody → MOS Maintenance (mos-maintenance)

Alexander Rubtsov (arubtsov) wrote on 2017-05-30:

#24

A customer has hit the same issue in MOS 9.2.
Could you please backport the fix to the Mitaka series?

tags:

added: customer-found sla1

Alexander Rubtsov (arubtsov) wrote on 2017-05-30:

#25

sla1 for 9.0-updates

OpenStack Infra (hudson-openstack) wrote on 2017-06-08: Fix included in openstack/nova 16.0.0.0b2

#26

This issue was fixed in the openstack/nova 16.0.0.0b2 development milestone.

OpenStack Infra (hudson-openstack) wrote on 2017-06-12: Fix merged to nova (stable/ocata)

#27

Reviewed: https://review.openstack.org/468219
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e221784560c71eeab7c5eeeb42e7c6e910d29340
Submitter: Jenkins
Branch: stable/ocata

commit e221784560c71eeab7c5eeeb42e7c6e910d29340
Author: Guang Yee <email address hidden>
Date: Thu May 18 16:38:16 2017 -0700

make sure to rebuild claim on recreate

    On recreate where the instance is being evacuated to a different node,
    we should be rebuilding the claim so the migration context is available
    when rebuilding the instance.

    Change-Id: I53bdcf8edf640e97b4632ef7a093f14a6e3845e4
    Closes-Bug: 1658070
    (cherry picked from commit a2b0824aca5cb4a2ae579f625327c51ed0414d35)

Fuel Devops McRobotson (fuel-devops-robot) wrote on 2017-06-13: Fix proposed to openstack/nova (9.0/mitaka)

#28

Fix proposed to branch: 9.0/mitaka
Change author: Denis V. Meltsaykin <email address hidden>
Review: https://review.fuel-infra.org/35612

Fuel Devops McRobotson (fuel-devops-robot) wrote on 2017-06-15: Fix merged to openstack/nova (9.0/mitaka)

#29

Reviewed: https://review.fuel-infra.org/35612
Submitter: Pkgs Jenkins <email address hidden>
Branch: 9.0/mitaka

Commit: 06bb2fcb21b1531b7f434f7502b50a9751ef93aa
Author: Guang Yee <email address hidden>
Date: Thu Jun 15 11:13:34 2017

make sure to rebuild claim on recreate

On recreate where the instance is being evacuated to a different node,
we should be rebuilding the claim so the migration context is available
when rebuilding the instance.

Conflicts:
      nova/compute/manager.py
      nova/tests/unit/compute/test_compute.py
      nova/tests/unit/compute/test_compute_mgr.py

NOTE(mriedem): There are a few issues here:

1. I5aaa869f2e6155964827e659d18e2bcaad9d866b changed the LOG.info
   method to not pass a context in Ocata.
2. I57233259065d887b38a79850a05177fcbbdfb8c3 changed some tests in
   test_compute_manager in Ocata, but is irrelevant here.
3. The bigger change isn't a merge conflict but in Ocata the compute
   manager code was all refactored so that the _get_resource_tracker
   method no longer needed a nodename passed to it. In Newton, however,
   if we're force evacuating (scenario 3) then we don't have a scheduled_node
   passed to the rebuild_instance method and in this case we need to
   lookup the nodename for the host we're currently on. To resolve this,
   some existing code that handles this case is moved up where it is
   needed to get the resource tracker so we can get the rebuild_claim method.
   We let any ComputeHostNotFound exception raise up in this case rather than
   log it because without the compute node we can't make the rebuild claim and
   we need to fail. Tests are adjusted accordingly for this.
4. The fake instances in Mitaka are still created manually, so node
   field needs to be added explicitly.

Change-Id: I53bdcf8edf640e97b4632ef7a093f14a6e3845e4
Closes-Bug: 1658070
(cherry picked from commit a2b0824aca5cb4a2ae579f625327c51ed0414d35)
(cherry picked from commit ea90c60b07534a46541c55432389f2d50b5b7d0a)
(cherry picked from commit 0f2d87416eff1e96c0fbf0f4b08bf6b6b22246d5)

OpenStack Infra (hudson-openstack) wrote on 2017-06-19: Fix included in openstack/nova 15.0.6

#30

This issue was fixed in the openstack/nova 15.0.6 release.

Alexander Rubtsov (arubtsov) wrote on 2017-06-26:

#31

After applying https://review.fuel-infra.org/#/c/35612/6/ the issue still persists in Mitaka.

The following details are observed:
1) After a VM is deleted or moved to another host, the MAC address associated with a specific VF is not cleaned up.
http://paste.openstack.org/show/ZisERfh0iDMxv5bBX8HW/

2) After performing nova migration or nova evacuation multiple times, the same MAC address gets mapped to multiple VFs on the compute node, which might be due to improper cleanup of the MAC address from the old VFs.
http://paste.openstack.org/show/zJL6RJvlDz7ukQcomy6q/

Alexander Rubtsov (arubtsov) wrote on 2017-06-27:

#32

Please disregard the previous comment - I misunderstood the verification status.
The patch has resolved the issue related to evacuation.
Therefore, I'm setting the status back to "Fix Committed"

Regarding the MAC address issue mentioned above, I've filed a separate bug report:
https://bugs.launchpad.net/mos/+bug/1700702

Ilya Bumarskov (ibumarskov) wrote on 2017-06-27:

#33

I can't reproduce the bug in our test environment due to a lack of appropriate hardware. As I understand it, the fix was verified, so I am moving the bug to "Fix released".

Denis Meltsaykin (dmeltsaykin) on 2017-07-04

Changed in mos:
milestone:	9.x-updates → 10.0
status:	Confirmed → Won't Fix

OpenStack Infra (hudson-openstack) wrote on 2017-07-06: Fix merged to nova (stable/newton)

#34

Reviewed: https://review.openstack.org/468227
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0f2d87416eff1e96c0fbf0f4b08bf6b6b22246d5
Submitter: Jenkins
Branch: stable/newton

commit 0f2d87416eff1e96c0fbf0f4b08bf6b6b22246d5
Author: Guang Yee <email address hidden>
Date: Thu May 18 16:38:16 2017 -0700

make sure to rebuild claim on recreate

    On recreate where the instance is being evacuated to a different node,
    we should be rebuilding the claim so the migration context is available
    when rebuilding the instance.

    Conflicts:
          nova/compute/manager.py
          nova/tests/unit/compute/test_compute_mgr.py

    NOTE(mriedem): There are a few issues here:

    1. I5aaa869f2e6155964827e659d18e2bcaad9d866b changed the LOG.info
       method to not pass a context in Ocata.
    2. I57233259065d887b38a79850a05177fcbbdfb8c3 changed some tests in
       test_compute_manager in Ocata, but is irrelevant here.
    3. The bigger change isn't a merge conflict but in Ocata the compute
       manager code was all refactored so that the _get_resource_tracker
       method no longer needed a nodename passed to it. In Newton, however,
       if we're force evacuating (scenario 3) then we don't have a scheduled_node
       passed to the rebuild_instance method and in this case we need to
       lookup the nodename for the host we're currently on. To resolve this,
       some existing code that handles this case is moved up where it is
       needed to get the resource tracker so we can get the rebuild_claim method.
       We let any ComputeHostNotFound exception raise up in this case rather than
       log it because without the compute node we can't make the rebuild claim and
       we need to fail. Tests are adjusted accordingly for this.

    Change-Id: I53bdcf8edf640e97b4632ef7a093f14a6e3845e4
    Closes-Bug: 1658070
    (cherry picked from commit a2b0824aca5cb4a2ae579f625327c51ed0414d35)
    (cherry picked from commit ea90c60b07534a46541c55432389f2d50b5b7d0a)

OpenStack Infra (hudson-openstack) wrote on 2017-08-28: Fix included in openstack/nova 14.0.8

#35

This issue was fixed in the openstack/nova 14.0.8 release.

Fuel Devops McRobotson (fuel-devops-robot) wrote on 2017-08-30: Fix proposed to openstack/nova (mcp/1.0/mitaka)

#36

Fix proposed to branch: mcp/1.0/mitaka
Change author: Guang Yee <email address hidden>
Review: https://review.fuel-infra.org/36367

Fuel Devops McRobotson (fuel-devops-robot) wrote on 2017-08-30: Change abandoned on openstack/nova (mcp/1.0/mitaka)

#37

Change abandoned by Vladyslav Drok <email address hidden> on branch: mcp/1.0/mitaka
Review: https://review.fuel-infra.org/36367

Fuel Devops McRobotson (fuel-devops-robot) wrote on 2017-08-31: Change restored on openstack/nova (mcp/1.0/mitaka)

#38

Change restored by Vladyslav Drok <email address hidden> on branch: mcp/1.0/mitaka
Review: https://review.fuel-infra.org/36367

Fuel Devops McRobotson (fuel-devops-robot) wrote on 2017-09-01: Fix merged to openstack/nova (mcp/1.0/mitaka)

#39

Reviewed: https://review.fuel-infra.org/36367
Submitter: Pkgs Jenkins <email address hidden>
Branch: mcp/1.0/mitaka

Commit: 72b644f37b7f52d1b80cb1f774ee7b991a903167
Author: Guang Yee <email address hidden>
Date: Thu Aug 31 17:38:34 2017

make sure to rebuild claim on recreate

On recreate where the instance is being evacuated to a different node,
we should be rebuilding the claim so the migration context is available
when rebuilding the instance.

Conflicts:
      nova/compute/manager.py
      nova/tests/unit/compute/test_compute.py
      nova/tests/unit/compute/test_compute_mgr.py

NOTE(mriedem): There are a few issues here:

1. I5aaa869f2e6155964827e659d18e2bcaad9d866b changed the LOG.info
   method to not pass a context in Ocata.
2. I57233259065d887b38a79850a05177fcbbdfb8c3 changed some tests in
   test_compute_manager in Ocata, but is irrelevant here.
3. The bigger change isn't a merge conflict but in Ocata the compute
   manager code was all refactored so that the _get_resource_tracker
   method no longer needed a nodename passed to it. In Newton, however,
   if we're force evacuating (scenario 3) then we don't have a scheduled_node
   passed to the rebuild_instance method and in this case we need to
   lookup the nodename for the host we're currently on. To resolve this,
   some existing code that handles this case is moved up where it is
   needed to get the resource tracker so we can get the rebuild_claim method.
   We let any ComputeHostNotFound exception raise up in this case rather than
   log it because without the compute node we can't make the rebuild claim and
   we need to fail. Tests are adjusted accordingly for this.
4. The fake instances in Mitaka are still created manually, so node
   field needs to be added explicitly.

PROD ticket: https://mirantis.jira.com/browse/PROD-14350

Change-Id: I53bdcf8edf640e97b4632ef7a093f14a6e3845e4
Closes-Bug: 1658070
(cherry picked from commit a2b0824aca5cb4a2ae579f625327c51ed0414d35)
(cherry picked from commit ea90c60b07534a46541c55432389f2d50b5b7d0a)
(cherry picked from commit 0f2d87416eff1e96c0fbf0f4b08bf6b6b22246d5)

Affects                     Status         Importance  Assigned to       Milestone
Mirantis OpenStack          Won't Fix      Medium      MOS Maintenance   Mirantis OpenStack 10.0
  9.x                       Fix Released   High        Denis Meltsaykin  Mirantis OpenStack 9.2-mu-2
OpenStack Compute (nova)    Fix Released   High        Guang Yee
  Newton                    Fix Committed  High        Matt Riedemann
  Ocata                     Fix Released   High        Matt Riedemann