Evacuation fails for instances with PCI devices due to missing migration

Bug #1703629 reported by Steven Webster
This bug affects 2 people
Affects                   Status         Importance  Assigned to     Milestone
OpenStack Compute (nova)  Fix Released   Medium      Steven Webster
Ocata                     Fix Committed  Medium      Matt Riedemann
Pike                      Fix Committed  Medium      Matt Riedemann

Bug Description

Description
===========

The fix for bug https://bugs.launchpad.net/nova/+bug/1677621 enforced a requirement for a migration object to be present in the call to update_port_binding_for_instance() in order to do any mapping from old PCI devices to new PCI devices when an instance is migrated/resized/evacuated.

During an evacuation, a migration is created, but never passed down to update_port_binding_for_instance().

This can cause an instance to be spawned on the new host with an incorrect (PCI) port binding.

This can happen even with the proposed fix to related bug #1630698.
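
To illustrate the failure mode described above, here is a minimal sketch of a port update that only remaps PCI devices when a migration is supplied. It is not nova's code: the `Migration` dataclass, the `update_port_binding` function, the `pci_slot` key, and the PCI addresses are hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Migration:
    # Hypothetical stand-in for a migration record.
    source_compute: str
    dest_compute: str


def update_port_binding(
    port_profile: Dict[str, str],
    dest_host: str,
    pci_mapping: Dict[str, str],
    migration: Optional[Migration] = None,
) -> Tuple[str, Dict[str, str]]:
    """Return the new host and binding profile for a port."""
    profile = dict(port_profile)  # leave the caller's dict untouched
    # PCI remapping is gated on the migration being present, as in the report.
    if migration is not None:
        old_slot = profile.get("pci_slot")
        if old_slot in pci_mapping:
            profile["pci_slot"] = pci_mapping[old_slot]
    return dest_host, profile
```

Called without a migration, the port is rebound to the new host but keeps the PCI address that was allocated on the old host, which matches the actual result reported below.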

Steps to reproduce
==================

Two-node setup:
- Launch an instance with PCI-PT or SR-IOV port bindings
- Stop nova-compute on the host running the instance (the source host)
- nova evacuate <instance>

Expected result
===============

The instance should migrate to a new host (provided resources are available) with an updated port binding using PCI device(s) on the new host.

Actual result
=============

The instance is launched on the new host using the port bindings from the old host.

Environment
===========

2. Which hypervisor did you use?
   libvirt

3. Which networking type did you use?
   - Affects neutron with openvswitch

Changed in nova:
assignee: nobody → Steven Webster (swebster-wr)
Matt Riedemann (mriedem) wrote :

Which release is this tested against? Or is it master code (currently Pike)?

Steven Webster (swebster-wr) wrote :

This is tested against Pike, but should be present in Newton and Ocata as well.

Note that successful evacuation of an instance with PCI devices will also depend on the fix for https://bugs.launchpad.net/nova/+bug/1630698, which is currently under review at https://review.openstack.org/#/c/382853/.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/484381

Changed in nova:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/484381
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b930336854bffec1bb81b6d67079a4df59e0af19
Submitter: Zuul
Branch: master

commit b930336854bffec1bb81b6d67079a4df59e0af19
Author: Steven Webster <email address hidden>
Date: Mon Jun 12 17:10:03 2017 -0400

    Fix instance evacuation with PCI devices

    update_port_binding_for_instance() now checks that a valid migration
    object exists as a parameter before any mapping between old/new PCI
    devices can occur. A migration should be present in the case of a
    cold migration, resize, or evacuation.

    An evacuation (being a special case of a rebuild) however, will not
    pass a migration to update_port_binding_for_instance, as it
    is called directly from setup_instance_network(). This calling function
    does not currently take a migration parameter, even though one will
    certainly exist for an evacuation.

    This commit adds an optional migration parameter to
    setup_instance_network_on_host() and passes any migration object to
    the port update routine.

    Closes-Bug: #1703629
    Related-Bug: #1677621
    Related-Bug: #1630698

    Change-Id: I4e394c8d275995eac4b049a7b1329ea90f2394be
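
The change described in the commit message can be sketched as follows. This is a simplified illustration under stated assumptions, not the merged patch: `FakeNetworkAPI` and `evacuate` are hypothetical names, the arguments are plain values rather than nova objects, and only the two method names and the optional `migration` parameter come from the commit message.

```python
from typing import Any, Dict, List, Optional


class FakeNetworkAPI:
    """Toy stand-in for a network API (hypothetical, not nova's class)."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def update_port_binding_for_instance(
        self, instance: str, host: str, migration: Optional[Any] = None
    ) -> None:
        # Record what actually reached the port update routine.
        self.calls.append(
            {"instance": instance, "host": host, "migration": migration}
        )

    def setup_instance_network_on_host(
        self, instance: str, host: str, migration: Optional[Any] = None
    ) -> None:
        # The optional migration is forwarded instead of being dropped.
        self.update_port_binding_for_instance(
            instance, host, migration=migration
        )


def evacuate(api: FakeNetworkAPI, instance: str, dest_host: str,
             migration: Any) -> None:
    # An evacuation has a migration record, so it is handed down.
    api.setup_instance_network_on_host(
        instance, dest_host, migration=migration
    )
```

Keeping the parameter optional means callers that have no migration, such as a plain rebuild on the same host, can keep calling the method unchanged.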

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.0.0b3

This issue was fixed in the openstack/nova 17.0.0.0b3 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/590059

Matt Riedemann (mriedem)
Changed in nova:
importance: Undecided → Medium
no longer affects: nova/queens
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/605881

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/pike)

Reviewed: https://review.openstack.org/590059
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b9c1a58dd033fa8feeb1175956d57dc90aa55acd
Submitter: Zuul
Branch: stable/pike

commit b9c1a58dd033fa8feeb1175956d57dc90aa55acd
Author: Steven Webster <email address hidden>
Date: Mon Jun 12 17:10:03 2017 -0400

    Fix instance evacuation with PCI devices

    update_port_binding_for_instance() now checks that a valid migration
    object exists as a parameter before any mapping between old/new PCI
    devices can occur. A migration should be present in the case of a
    cold migration, resize, or evacuation.

    An evacuation (being a special case of a rebuild) however, will not
    pass a migration to update_port_binding_for_instance, as it
    is called directly from setup_instance_network(). This calling function
    does not currently take a migration parameter, even though one will
    certainly exist for an evacuation.

    This commit adds an optional migration parameter to
    setup_instance_network_on_host() and passes any migration object to
    the port update routine.

    Closes-Bug: #1703629
    Related-Bug: #1677621
    Related-Bug: #1630698

    Change-Id: I4e394c8d275995eac4b049a7b1329ea90f2394be
    (cherry picked from commit b930336854bffec1bb81b6d67079a4df59e0af19)

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/ocata)

Reviewed: https://review.openstack.org/605881
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8b3b8e5657ffe0f12e1ccfbe6d978e62a2bfdc89
Submitter: Zuul
Branch: stable/ocata

commit 8b3b8e5657ffe0f12e1ccfbe6d978e62a2bfdc89
Author: Steven Webster <email address hidden>
Date: Mon Jun 12 17:10:03 2017 -0400

    Fix instance evacuation with PCI devices

    update_port_binding_for_instance() now checks that a valid migration
    object exists as a parameter before any mapping between old/new PCI
    devices can occur. A migration should be present in the case of a
    cold migration, resize, or evacuation.

    An evacuation (being a special case of a rebuild) however, will not
    pass a migration to update_port_binding_for_instance, as it
    is called directly from setup_instance_network(). This calling function
    does not currently take a migration parameter, even though one will
    certainly exist for an evacuation.

    This commit adds an optional migration parameter to
    setup_instance_network_on_host() and passes any migration object to
    the port update routine.

    Closes-Bug: #1703629
    Related-Bug: #1677621
    Related-Bug: #1630698

    Change-Id: I4e394c8d275995eac4b049a7b1329ea90f2394be
    (cherry picked from commit b930336854bffec1bb81b6d67079a4df59e0af19)
    (cherry picked from commit b9c1a58dd033fa8feeb1175956d57dc90aa55acd)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.1.6

This issue was fixed in the openstack/nova 16.1.6 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.1.5

This issue was fixed in the openstack/nova 15.1.5 release.
