https://review.openstack.org/#/c/91722 breaks icehouse->juno live migration

Bug #1402813 reported by Lars Kellogg-Stedman
Affects                    Status         Importance   Assigned to      Milestone
OpenStack Compute (nova)   Fix Released   Critical     Dan Smith
Juno                       Fix Released   Critical     Matt Riedemann

Bug Description

The changes introduced in https://review.openstack.org/#/c/91722 break live migration from Icehouse compute nodes to Juno compute nodes. This complicates the upgrade process because origin compute nodes must be upgraded at the same time as target compute nodes. In some situations the upgrade involves kernel upgrades (or hardware upgrades!) for which people would rather evacuate the system in advance.

The current error reporting for this failure is also undesirable: the client requesting the migration does not receive an error indicating why the live migration failed.

Pádraig Brady (p-draigbrady) wrote :
Lars Kellogg-Stedman (larsks) wrote :

It's not clear to me if that fixes the problem, or simply provides better error reporting in the event of a failure.

Lars Kellogg-Stedman (larsks) wrote :

Oh, wait, that's what you said. Sorry, apparently I need more coffee before reading bugs this morning.

Matt Riedemann (mriedem)
Changed in nova:
importance: Undecided → Critical
milestone: none → kilo-2
tags: added: migrate
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/145292

Changed in nova:
assignee: nobody → Dan Smith (danms)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/145292
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5477faab6740f1d8a4fcb4c28779dfc4fd316afe
Submitter: Jenkins
Branch: master

commit 5477faab6740f1d8a4fcb4c28779dfc4fd316afe
Author: Dan Smith <email address hidden>
Date: Tue Jan 6 10:41:41 2015 -0800

    Fix live migration RPC compatibility with older versions

    In commit bc45c56f102cdef58840e02b609a89f5278e8cce, the live migration
    RPC APIs were changed such that they intentionally wouldn't communicate
    with older versions that don't provide the extra parameters that were
    added. This breaks people using live migration to move workloads between
    icehouse and juno compute nodes during an upgrade. It also generally
    runs counter to our policies regarding RPC API compatibility.

    The original bug only affected shared block storage users, which means
    a large portion of users aren't even affected. Thus, this patch restores
    compatibility with the older versions in all cases, but logs weighty
    warning messages for the operators when a migration is performed that
    looks to be affected by the bug. If we have enough information to
    determine that the migration is not affected, we avoid the warning, but
    otherwise err on the side of caution. If an operator is not actually
    affected by the bug, they will see the warnings while the RPC API
    version cap is in place (i.e. during the upgrade window) and then
    the warnings will stop once it is removed.

    UpgradeImpact: This will resolve upgrade issues from Icehouse->Juno
    when using live migration.

    DocImpact: Documenting the potential for data loss when migrating from
    Icehouse to Juno when using live migration is something operators should
    be aware of.

    Change-Id: I5651fb7ba95f38e2e2f8a48a98ff04072c6bb885
    Closes-Bug: #1402813

Changed in nova:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/juno)

Fix proposed to branch: stable/juno
Review: https://review.openstack.org/151775

Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/juno)

Reviewed: https://review.openstack.org/151775
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e7828d91aa3ec5a781a2d8dbd411e5d4210aa5c0
Submitter: Jenkins
Branch: stable/juno

commit e7828d91aa3ec5a781a2d8dbd411e5d4210aa5c0
Author: Dan Smith <email address hidden>
Date: Tue Jan 6 10:41:41 2015 -0800

    Fix live migration RPC compatibility with older versions

    In commit bc45c56f102cdef58840e02b609a89f5278e8cce, the live migration
    RPC APIs were changed such that they intentionally wouldn't communicate
    with older versions that don't provide the extra parameters that were
    added. This breaks people using live migration to move workloads between
    icehouse and juno compute nodes during an upgrade. It also generally
    runs counter to our policies regarding RPC API compatibility.

    The original bug only affected shared block storage users, which means
    a large portion of users aren't even affected. Thus, this patch restores
    compatibility with the older versions in all cases, but logs weighty
    warning messages for the operators when a migration is performed that
    looks to be affected by the bug. If we have enough information to
    determine that the migration is not affected, we avoid the warning, but
    otherwise err on the side of caution. If an operator is not actually
    affected by the bug, they will see the warnings while the RPC API
    version cap is in place (i.e. during the upgrade window) and then
    the warnings will stop once it is removed.

    UpgradeImpact: This will resolve upgrade issues from Icehouse->Juno
    when using live migration.

    DocImpact: Documenting the potential for data loss when migrating from
    Icehouse to Juno when using live migration is something operators should
    be aware of.

    Conflicts:
            nova/compute/rpcapi.py
            nova/tests/unit/compute/test_rpcapi.py
            nova/tests/unit/virt/libvirt/test_driver.py

    NOTE(mriedem): The rpcapi conflict was due to jsonutils not being
    on master. The test conflicts were due to the modules being moved
    on master.

    Change-Id: I5651fb7ba95f38e2e2f8a48a98ff04072c6bb885
    Closes-Bug: #1402813
    (cherry picked from commit 5477faab6740f1d8a4fcb4c28779dfc4fd316afe)

tags: added: in-stable-juno
Thierry Carrez (ttx)
Changed in nova:
milestone: kilo-2 → 2015.1.0