Failed SR-IOV evacuation with host

Bug #1658070 reported by Kristina Berezovskaia
This bug affects 1 person
Affects                   Status         Importance  Assigned to       Milestone
Mirantis OpenStack        Won't Fix      Medium      MOS Maintenance
  9.x                     Fix Released   High        Denis Meltsaykin
OpenStack Compute (nova)  Fix Released   High        Guang Yee
  Newton                  Fix Committed  High        Matt Riedemann
  Ocata                   Fix Released   High        Matt Riedemann

Bug Description

When we try to evacuate an SR-IOV VM to a specific host, the VM ends up in ERROR state.

Steps to reproduce:
1) Download trusty image
2) Create image
3) Create vf port:
neutron port-create <net> --binding:vnic-type direct --device_owner nova-compute --name sriov
4) Boot vm on this port:
nova boot vm --flavor m1.small --image 1ff0759c-ea85-4909-a711-70fd6b71ad23 --nic port-id=cfc947be-1975-42f3-95bd-f261a2eccec0 --key-name vm_key
5) Shut down the node with the vm
6) Evacuate vm:
nova evacuate vm node-5.test.domain.local
Expected result:
 VM evacuates on the 5th node
Actual result:
 VM in error state

Workaround:
We can evacuate without specifying the host, using just: nova evacuate vm

Environment:
#785 snap
2 controllers, 2 computes with SR-IOV

Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

Kristina, could you please attach a diagnostic snapshot or at the very least nova-compute logs?

Changed in mos:
status: New → Incomplete
assignee: MOS Nova (mos-nova) → Kristina Berezovskaia (kkuznetsova)
Revision history for this message
Kristina Berezovskaia (kkuznetsova) wrote :

The error message for the VM:
Port update failed for port cfc947be-1975-42f3-95bd-f261a2eccec0: Unable to correlate PCI slot 0000:81:11.7

in nova code: https://github.com/openstack/nova/blob/065cd6a8d69c1ec862e5b402a3150131f35b2420/nova/network/neutronv2/api.py#L2411

For evacuation with a host specified, pci_mapping is empty.
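
A minimal sketch of why an empty mapping produces that message, assuming the port update looks up the port's old PCI address in a dictionary of old-to-new addresses. `update_port_profile`, `PortUpdateFailed`, the dict shapes, and the destination address `0000:82:10.3` are hypothetical stand-ins for illustration, not nova's actual code:

```python
class PortUpdateFailed(Exception):
    """Hypothetical stand-in for the error raised on a failed port update."""


def update_port_profile(port, pci_mapping):
    """Return a binding profile that points at the destination host's VF.

    pci_mapping maps an old (source host) PCI address to the new
    (destination host) address. It is assumed to be empty when no
    resource claim was made on the destination.
    """
    old_slot = port["binding:profile"]["pci_slot"]
    new_slot = pci_mapping.get(old_slot)
    if new_slot is None:
        raise PortUpdateFailed(
            "Port update failed for port %s: Unable to correlate PCI slot %s"
            % (port["id"], old_slot))
    # Copy the profile and replace only the PCI slot.
    return dict(port["binding:profile"], pci_slot=new_slot)


port = {
    "id": "cfc947be-1975-42f3-95bd-f261a2eccec0",
    "binding:profile": {"pci_slot": "0000:81:11.7"},
}
```

With a populated mapping the profile is rewritten; with `{}` the lookup fails and the error above is raised.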

Changed in mos:
assignee: Kristina Berezovskaia (kkuznetsova) → MOS Nova (mos-nova)
status: Incomplete → New
Changed in mos:
status: New → Confirmed
importance: Undecided → Medium
milestone: 9.2 → 9.3
Changed in mos:
assignee: MOS Nova (mos-nova) → Sergey Nikitin (snikitin)
Revision history for this message
Matt Riedemann (mriedem) wrote :

Which release of nova is this?

Changed in mos:
assignee: Sergey Nikitin (snikitin) → nobody
Revision history for this message
Matt Riedemann (mriedem) wrote :

I guess this would be Mitaka?

Revision history for this message
Matt Riedemann (mriedem) wrote :

Could this be related to bug https://bugs.launchpad.net/nova/+bug/1512880 which was fixed in nova in newton?

Revision history for this message
Matt Riedemann (mriedem) wrote :

It could be that when you specify a host and bypass the scheduler, the target host isn't actually set up properly for the migration (evacuation to the target host), whereas when you don't specify the host, the scheduler picks a proper host based on the filters.

Revision history for this message
Matt Riedemann (mriedem) wrote :

(12:43:57 PM) gyee_: I am curious why we are using NopClaim if host is specified https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2771
(12:44:22 PM) gyee_: does specifying a host implies force evacuate?
(12:49:08 PM) gyee_: meanwhile if SR-IOV is enabled, we are trying to get the PCI mapping via the migration context, which is only populated at rebuild_claim. https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L2462
(12:51:32 PM) mriedem: https://github.com/openstack/nova/commit/dc0221d7240326a2d1b467e2a367bebb7e764e61 added that code in the compute manager about the nop claim, which implies resources were already claimed, but i'd have to dig into that

The rebuild_instance method in the nova compute manager handles both rebuilds on the same host and evacuates to another host. If you force the evacuate to a specific host, we bypass some code in the nova-api and nova-conductor services and call directly into the compute on the target host and hit this code which makes it so we don't do a resource claim (which sets up the pci mappings on the instance.migration_context as part of the move claim):

https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2767-L2771

That ^ code is really there for rebuild operations where we don't need a resource claim because we've already claimed resources on the original host, but it got confused with this forced evacuate to a target host scenario, and we never end up claiming resources on the target host.

I think all we have to do is modify that conditional to be:

if scheduled_node is not None or recreate:
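
To illustrate the effect of that change, here is a simplified sketch of the claim selection, not the actual nova code. It assumes `recreate` is True for an evacuate and False for a same-host rebuild, and that `scheduled_node` is None whenever the scheduler was bypassed. The function names, the string return values, and the node name are placeholders:

```python
def pick_claim_before(scheduled_node, recreate):
    # Behaviour as described above: a real claim is only made when
    # the scheduler picked the node.
    if scheduled_node is not None:
        return "rebuild_claim"
    return "NopClaim"


def pick_claim_after(scheduled_node, recreate):
    # Proposed conditional: also claim on any evacuate (recreate=True),
    # even when the target host was forced and no node was scheduled.
    if scheduled_node is not None or recreate:
        return "rebuild_claim"
    return "NopClaim"


# (scheduled_node, recreate) for the three scenarios discussed here.
scenarios = {
    "rebuild on same host": (None, False),
    "evacuate via scheduler": ("node-5", True),
    "evacuate forced to host": (None, True),
}
for name, args in scenarios.items():
    print(name, pick_claim_before(*args), "->", pick_claim_after(*args))
```

Only the forced evacuate changes: it moves from a no-op claim to a real rebuild claim, while the same-host rebuild keeps the no-op claim.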

Revision history for this message
Matt Riedemann (mriedem) wrote :

The force evacuate to target host change in the API was added in the 2.29 microversion which goes back to newton:

https://github.com/openstack/nova/commit/86706785ff251b841dff3590dc60f6b4834d7b7e

Changed in nova:
status: New → Triaged
Revision history for this message
Matt Riedemann (mriedem) wrote :

I'm not totally sure how this happens if the --force option isn't specified with the nova evacuate command, because of this code in the API:

https://github.com/openstack/nova/blob/8d492c76d53f3fcfacdd945a277446bdfe6797b0/nova/compute/api.py#L4115

So if a host is specified but force is not, then the API sets a requested destination to that host for the scheduler and nulls out the host value, which is checked in conductor here:

https://github.com/openstack/nova/blob/8d492c76d53f3fcfacdd945a277446bdfe6797b0/nova/conductor/manager.py#L743

Which would call the scheduler to pick a host and set the node variable here:

https://github.com/openstack/nova/blob/8d492c76d53f3fcfacdd945a277446bdfe6797b0/nova/conductor/manager.py#L770

Which gets passed to the compute as the scheduled_node variable which is used to determine if we should do a claim or not.
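
The flow described in this comment can be sketched roughly as follows. This is a simplified model, not the nova code: the function names, the `forced` boolean, and `fake_scheduler` are hypothetical, and host and node names are treated as the same string for brevity.

```python
def api_evacuate(host, forced):
    """API step: returns (host, requested_destination)."""
    if host and not forced:
        # Hand the host to the scheduler as a requested destination
        # and null out the host value.
        return None, host
    return host, None


def conductor_evacuate(host, requested_destination, select_destination):
    """Conductor step: returns (host, scheduled_node) passed to compute."""
    if host is None:
        # No host set: ask the scheduler, which also yields the node.
        host = node = select_destination(requested_destination)
    else:
        # Host was forced: the scheduler is skipped, so no node is known.
        node = None
    return host, node


def fake_scheduler(requested_destination):
    # Hypothetical scheduler that simply honours the requested destination.
    return requested_destination or "any-valid-host"
```

In this model, a host given without force still reaches the compute with a non-None scheduled node; only a forced host arrives with `scheduled_node` set to None.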

Revision history for this message
Matt Riedemann (mriedem) wrote :

Ah gyee pointed out in IRC that if you're using microversion<2.29 then force is passed to the compute API code as None:

https://github.com/openstack/nova/blob/stable/newton/nova/api/openstack/compute/evacuate.py#L92

And then this fails because force is not False, it's None:

https://github.com/openstack/nova/blob/stable/newton/nova/compute/api.py#L3784
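
A small sketch of the pitfall, assuming the check is an identity comparison against False as described above. Both function names are hypothetical, and the second variant is only one possible way to treat None like False, not necessarily what was merged:

```python
def goes_through_scheduler(host, force):
    # Identity check: only an explicit False matches, so None
    # (a request made with microversion < 2.29) does not.
    return bool(host) and force is False


def goes_through_scheduler_truthy(host, force):
    # Alternative that treats None the same as False.
    return bool(host) and not force


for force in (False, None, True):
    print(force,
          goes_through_scheduler("node-5", force),
          goes_through_scheduler_truthy("node-5", force))
```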

Changed in nova:
importance: Undecided → High
Revision history for this message
Matt Riedemann (mriedem) wrote :
Revision history for this message
Jay Pipes (jaypipes) wrote :

Eli Qiao (taget-9) is looking into this and will push a fix. Assigning to him.

Changed in nova:
assignee: nobody → Eli Qiao (taget-9)
Changed in nova:
status: Triaged → In Progress
Revision history for this message
Guang Yee (guang-yee) wrote :

Eli Qiao, will you be pushing a patch anytime soon? Otherwise, I can handle it. Please let me know.

Revision history for this message
Eli Qiao (taget-9) wrote :

@guang-yee, I've submitted the patch already: https://review.openstack.org/#/c/465895/
Not sure why Launchpad doesn't link it here.

Revision history for this message
Jay Pipes (jaypipes) wrote :

The original poster's version of Nova was 13.1.2, which is Mitaka. Therefore all the --force stuff and microversions discussion is irrelevant (since microversion 2.29 was added in Newton).

So, the bug here is the migrate routine is not accounting for a change in PCI address on the destination host and is effectively attempting to launch the workload on the destination host with the source host's PCI addresses, which clearly will not work unless, by luck, the destination host has the exact same PCI addresses and those addresses are available.
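
A minimal sketch of the address remapping this implies, assuming old and new devices can be correlated by a shared request ID. The `PciDevice` record, `build_pci_mapping`, and the example addresses are hypothetical and much simpler than nova's real objects:

```python
from collections import namedtuple

# Hypothetical minimal device record; real PCI device objects carry more fields.
PciDevice = namedtuple("PciDevice", ["request_id", "address"])


def build_pci_mapping(old_devices, new_devices):
    """Map each source-host PCI address to the address claimed on the destination.

    Devices are correlated by request_id (an assumption of this sketch).
    Old devices with no matching new device are left out of the mapping.
    """
    new_by_request = {dev.request_id: dev for dev in new_devices}
    return {
        old.address: new_by_request[old.request_id].address
        for old in old_devices
        if old.request_id in new_by_request
    }


old = [PciDevice("req-1", "0000:81:11.7")]
new = [PciDevice("req-1", "0000:82:10.3")]
print(build_pci_mapping(old, new))
```

If nothing was claimed on the destination, `new_devices` is empty and so is the mapping, which leaves the source host's addresses in place.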

The solution is to backport the patch that Matt referred to above that fixes this issue in Newton and beyond:

 https://bugs.launchpad.net/nova/+bug/1512880

Marking this as a duplicate of 1512880.

Revision history for this message
Matt Riedemann (mriedem) wrote :

Jay, see comment 7, I think there is also a problem here with evacuate when a host is specified:

https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2767-L2771

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Eli Qiao (<email address hidden>) on branch: master
Review: https://review.openstack.org/465895
Reason: Please see the comments on the bug made by Jay

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/466143

Changed in nova:
assignee: Eli Qiao (taget-9) → Guang Yee (guang-yee)
Revision history for this message
Guang Yee (guang-yee) wrote :

I don't understand why this bug is marked as a duplicate of bug 1512880. Evacuation still fails when SR-IOV is enabled and a host is specified. I tested my patch on nodes with SR-IOV enabled and it seems to work fine.

Revision history for this message
Matt Riedemann (mriedem) wrote :

I agree with Guang Yee; via code inspection in the compute manager's rebuild method there is an obvious issue with not doing a rebuild claim during evacuate in certain cases.

Changed in nova:
assignee: Guang Yee (guang-yee) → Matt Riedemann (mriedem)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/468219

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/468227

Matt Riedemann (mriedem)
Changed in nova:
assignee: Matt Riedemann (mriedem) → Guang Yee (guang-yee)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/466143
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=a2b0824aca5cb4a2ae579f625327c51ed0414d35
Submitter: Jenkins
Branch: master

commit a2b0824aca5cb4a2ae579f625327c51ed0414d35
Author: Guang Yee <email address hidden>
Date: Thu May 18 16:38:16 2017 -0700

    make sure to rebuild claim on recreate

    On recreate where the instance is being evacuated to a different node,
    we should be rebuilding the claim so the migration context is available
    when rebuilding the instance.

    Change-Id: I53bdcf8edf640e97b4632ef7a093f14a6e3845e4
    Closes-Bug: 1658070

Changed in nova:
status: In Progress → Fix Released
Changed in mos:
assignee: nobody → MOS Maintenance (mos-maintenance)
Revision history for this message
Alexander Rubtsov (arubtsov) wrote :

A customer has hit the same issue in MOS 9.2
Could you please backport the fix for Mitaka series?

tags: added: customer-found sla1
Revision history for this message
Alexander Rubtsov (arubtsov) wrote :

sla1 for 9.0-updates

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.0.0.0b2

This issue was fixed in the openstack/nova 16.0.0.0b2 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/ocata)

Reviewed: https://review.openstack.org/468219
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e221784560c71eeab7c5eeeb42e7c6e910d29340
Submitter: Jenkins
Branch: stable/ocata

commit e221784560c71eeab7c5eeeb42e7c6e910d29340
Author: Guang Yee <email address hidden>
Date: Thu May 18 16:38:16 2017 -0700

    make sure to rebuild claim on recreate

    On recreate where the instance is being evacuated to a different node,
    we should be rebuilding the claim so the migration context is available
    when rebuilding the instance.

    Change-Id: I53bdcf8edf640e97b4632ef7a093f14a6e3845e4
    Closes-Bug: 1658070
    (cherry picked from commit a2b0824aca5cb4a2ae579f625327c51ed0414d35)

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/nova (9.0/mitaka)

Fix proposed to branch: 9.0/mitaka
Change author: Denis V. Meltsaykin <email address hidden>
Review: https://review.fuel-infra.org/35612

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/nova (9.0/mitaka)

Reviewed: https://review.fuel-infra.org/35612
Submitter: Pkgs Jenkins <email address hidden>
Branch: 9.0/mitaka

Commit: 06bb2fcb21b1531b7f434f7502b50a9751ef93aa
Author: Guang Yee <email address hidden>
Date: Thu Jun 15 11:13:34 2017

make sure to rebuild claim on recreate

On recreate where the instance is being evacuated to a different node,
we should be rebuilding the claim so the migration context is available
when rebuilding the instance.

Conflicts:
      nova/compute/manager.py
      nova/tests/unit/compute/test_compute.py
      nova/tests/unit/compute/test_compute_mgr.py

NOTE(mriedem): There are a few issues here:

1. I5aaa869f2e6155964827e659d18e2bcaad9d866b changed the LOG.info
   method to not pass a context in Ocata.
2. I57233259065d887b38a79850a05177fcbbdfb8c3 changed some tests in
   test_compute_manager in Ocata, but is irrelevant here.
3. The bigger change isn't a merge conflict but in Ocata the compute
   manager code was all refactored so that the _get_resource_tracker
   method no longer needed a nodename passed to it. In Newton, however,
   if we're force evacuating (scenario 3) then we don't have a scheduled_node
   passed to the rebuild_instance method and in this case we need to
   lookup the nodename for the host we're currently on. To resolve this,
   some existing code that handles this case is moved up where it is
   needed to get the resource tracker so we can get the rebuild_claim method.
   We let any ComputeHostNotFound exception raise up in this case rather than
   log it because without the compute node we can't make the rebuild claim and
   we need to fail. Tests are adjusted accordingly for this.
4. The fake instances in Mitaka are still created manually, so node
   field needs to be added explicitly.

Change-Id: I53bdcf8edf640e97b4632ef7a093f14a6e3845e4
Closes-Bug: 1658070
(cherry picked from commit a2b0824aca5cb4a2ae579f625327c51ed0414d35)
(cherry picked from commit ea90c60b07534a46541c55432389f2d50b5b7d0a)
(cherry picked from commit 0f2d87416eff1e96c0fbf0f4b08bf6b6b22246d5)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.0.6

This issue was fixed in the openstack/nova 15.0.6 release.

Revision history for this message
Alexander Rubtsov (arubtsov) wrote :

After applying https://review.fuel-infra.org/#/c/35612/6/ the issue still persists in Mitaka.

The following details are observed:
1) After a VM is deleted or moved to another host, the MAC address associated with a specific VF is not cleaned up.
http://paste.openstack.org/show/ZisERfh0iDMxv5bBX8HW/

2) When nova migration or nova evacuation is performed multiple times, the same MAC address gets mapped to multiple VFs on the compute node, which might be due to improper cleanup of the MAC address from the old VFs.
http://paste.openstack.org/show/zJL6RJvlDz7ukQcomy6q/

Revision history for this message
Alexander Rubtsov (arubtsov) wrote :

Please disregard the previous comment - I misunderstood the verification status.
The patch has resolved the issue related to evacuation.
Therefore, I'm setting the status back to "Fix Committed"

Regarding the mentioned MAC-addresses issue, I've filed a separate bug report:
https://bugs.launchpad.net/mos/+bug/1700702

Revision history for this message
Ilya Bumarskov (ibumarskov) wrote :

Can't reproduce the bug on our test environment due to a lack of appropriate HW. As I understand it, the fix was verified, so I am moving the bug to "Fix Released".

Changed in mos:
milestone: 9.x-updates → 10.0
status: Confirmed → Won't Fix
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/newton)

Reviewed: https://review.openstack.org/468227
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0f2d87416eff1e96c0fbf0f4b08bf6b6b22246d5
Submitter: Jenkins
Branch: stable/newton

commit 0f2d87416eff1e96c0fbf0f4b08bf6b6b22246d5
Author: Guang Yee <email address hidden>
Date: Thu May 18 16:38:16 2017 -0700

    make sure to rebuild claim on recreate

    On recreate where the instance is being evacuated to a different node,
    we should be rebuilding the claim so the migration context is available
    when rebuilding the instance.

    Conflicts:
          nova/compute/manager.py
          nova/tests/unit/compute/test_compute_mgr.py

    NOTE(mriedem): There are a few issues here:

    1. I5aaa869f2e6155964827e659d18e2bcaad9d866b changed the LOG.info
       method to not pass a context in Ocata.
    2. I57233259065d887b38a79850a05177fcbbdfb8c3 changed some tests in
       test_compute_manager in Ocata, but is irrelevant here.
    3. The bigger change isn't a merge conflict but in Ocata the compute
       manager code was all refactored so that the _get_resource_tracker
       method no longer needed a nodename passed to it. In Newton, however,
       if we're force evacuating (scenario 3) then we don't have a scheduled_node
       passed to the rebuild_instance method and in this case we need to
       lookup the nodename for the host we're currently on. To resolve this,
       some existing code that handles this case is moved up where it is
       needed to get the resource tracker so we can get the rebuild_claim method.
       We let any ComputeHostNotFound exception raise up in this case rather than
       log it because without the compute node we can't make the rebuild claim and
       we need to fail. Tests are adjusted accordingly for this.

    Change-Id: I53bdcf8edf640e97b4632ef7a093f14a6e3845e4
    Closes-Bug: 1658070
    (cherry picked from commit a2b0824aca5cb4a2ae579f625327c51ed0414d35)
    (cherry picked from commit ea90c60b07534a46541c55432389f2d50b5b7d0a)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 14.0.8

This issue was fixed in the openstack/nova 14.0.8 release.

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/nova (mcp/1.0/mitaka)

Fix proposed to branch: mcp/1.0/mitaka
Change author: Guang Yee <email address hidden>
Review: https://review.fuel-infra.org/36367

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Change abandoned on openstack/nova (mcp/1.0/mitaka)

Change abandoned by Vladyslav Drok <email address hidden> on branch: mcp/1.0/mitaka
Review: https://review.fuel-infra.org/36367

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Change restored on openstack/nova (mcp/1.0/mitaka)

Change restored by Vladyslav Drok <email address hidden> on branch: mcp/1.0/mitaka
Review: https://review.fuel-infra.org/36367

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/nova (mcp/1.0/mitaka)

Reviewed: https://review.fuel-infra.org/36367
Submitter: Pkgs Jenkins <email address hidden>
Branch: mcp/1.0/mitaka

Commit: 72b644f37b7f52d1b80cb1f774ee7b991a903167
Author: Guang Yee <email address hidden>
Date: Thu Aug 31 17:38:34 2017

make sure to rebuild claim on recreate

On recreate where the instance is being evacuated to a different node,
we should be rebuilding the claim so the migration context is available
when rebuilding the instance.

Conflicts:
      nova/compute/manager.py
      nova/tests/unit/compute/test_compute.py
      nova/tests/unit/compute/test_compute_mgr.py

NOTE(mriedem): There are a few issues here:

1. I5aaa869f2e6155964827e659d18e2bcaad9d866b changed the LOG.info
   method to not pass a context in Ocata.
2. I57233259065d887b38a79850a05177fcbbdfb8c3 changed some tests in
   test_compute_manager in Ocata, but is irrelevant here.
3. The bigger change isn't a merge conflict but in Ocata the compute
   manager code was all refactored so that the _get_resource_tracker
   method no longer needed a nodename passed to it. In Newton, however,
   if we're force evacuating (scenario 3) then we don't have a scheduled_node
   passed to the rebuild_instance method and in this case we need to
   lookup the nodename for the host we're currently on. To resolve this,
   some existing code that handles this case is moved up where it is
   needed to get the resource tracker so we can get the rebuild_claim method.
   We let any ComputeHostNotFound exception raise up in this case rather than
   log it because without the compute node we can't make the rebuild claim and
   we need to fail. Tests are adjusted accordingly for this.
4. The fake instances in Mitaka are still created manually, so node
   field needs to be added explicitly.

PROD ticket: https://mirantis.jira.com/browse/PROD-14350

Change-Id: I53bdcf8edf640e97b4632ef7a093f14a6e3845e4
Closes-Bug: 1658070
(cherry picked from commit a2b0824aca5cb4a2ae579f625327c51ed0414d35)
(cherry picked from commit ea90c60b07534a46541c55432389f2d50b5b7d0a)
(cherry picked from commit 0f2d87416eff1e96c0fbf0f4b08bf6b6b22246d5)
