live migration failed due to no shared storage when using rbd imagebackend

Bug #1250751 reported by Jiajun Liu
This bug affects 13 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Dmitry Borodaenko
Nominated for Icehouse by Yaguang Tang

Bug Description

I am using the rbd image backend for my instances. When I perform a live migration, I get the following error:
======================================================
2013-11-13 16:09:01.084 2610 ERROR nova.openstack.common.rpc.amqp [req-01304e20-7986-4982-925b-56e62dd07499 a043bf5308314731b6bc60a523b5c803 03b91ca8deb044ec9b32d4aff61df973] Exception during message handling
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 426, in _process_data
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp **args)
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/dispatcher.py", line 172, in dispatch
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp result = getattr(proxyobj, method)(ctxt, **kwargs)
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/exception.py", line 99, in wrapped
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp temp_level, payload)
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp self.gen.next()
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/exception.py", line 76, in wrapped
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp return f(self, context, *args, **kw)
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 3609, in check_can_live_migrate_destination
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp dest_check_data)
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/compute/rpcapi.py", line 274, in check_can_live_migrate_source
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp instance))
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/proxy.py", line 126, in call
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp result = rpc.call(context, real_topic, msg, timeout)
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/__init__.py", line 140, in call
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp return _get_impl().call(CONF, context, topic, msg, timeout)
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/impl_kombu.py", line 824, in call
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp rpc_amqp.get_connection_pool(conf, Connection))
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 539, in call
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp rv = list(rv)
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp File "/usr/lib/python2.6/site-packages/nova/openstack/common/rpc/amqp.py", line 504, in __iter__
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp raise result
2013-11-13 16:09:01.084 2610 TRACE nova.openstack.common.rpc.amqp InvalidSharedStorage_Remote: ymy-r1-7 is not on shared storage: Live migration can not be used without shared storage.
======================================================

I use only one rbd storage pool for my instances, so the storage is actually shared. But the code in nova that checks for shared storage does not make sense for the rbd image backend.
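
To illustrate why a filesystem-based check can misreport RBD-backed instances, here is a minimal Python sketch of a marker-file probe: the destination creates a temporary file in its instance directory and the source looks for the same file. This is a simplified illustration, not nova's actual code; the function names (`create_probe_file`, `probe_file_visible`, `instance_dir_is_shared`) and the use of local temporary directories to stand in for compute hosts are assumptions made for the example.

```python
import os
import tempfile


def create_probe_file(instance_dir):
    """Destination side: create a uniquely named marker file in instance_dir."""
    fd, path = tempfile.mkstemp(dir=instance_dir)
    os.close(fd)
    return os.path.basename(path)


def probe_file_visible(instance_dir, filename):
    """Source side: the directory is shared only if the marker is visible here."""
    return os.path.exists(os.path.join(instance_dir, filename))


def instance_dir_is_shared(source_dir, dest_dir):
    """Run the probe and clean up the marker file afterwards."""
    name = create_probe_file(dest_dir)
    try:
        return probe_file_visible(source_dir, name)
    finally:
        os.remove(os.path.join(dest_dir, name))


# Case 1: both hosts mount the same directory (e.g. NFS); simulated here
# by passing the same local directory for source and destination.
with tempfile.TemporaryDirectory() as nfs_mount:
    nfs_result = instance_dir_is_shared(nfs_mount, nfs_mount)

# Case 2: each host has its own local instance directory, as is typical
# when the disks themselves live in a Ceph RBD pool.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    rbd_result = instance_dir_is_shared(src, dst)

print("shared directory:", nfs_result)
print("separate directories:", rbd_result)
```

In the second case the probe reports "not shared" even though the instance disks would be reachable from both hosts through the RBD pool, which matches the error in the traceback above.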

Jiajun Liu (ljjjustin)
Changed in nova:
assignee: nobody → Jiajun Liu (ljjjustin)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/56527

Changed in nova:
status: New → In Progress
Josh Durgin (jdurgin)
tags: added: ceph rbd
Yaguang Tang (heut2008)
Changed in nova:
assignee: Jiajun Liu (ljjjustin) → Yaguang Tang (heut2008)
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/84916

Dmitry Borodaenko (angdraug) wrote :

A more generic fix is proposed in this review:
https://review.openstack.org/91722

Dmitry Borodaenko (angdraug) wrote :
Yaguang Tang (heut2008)
Changed in nova:
importance: Undecided → Medium
Taras (shapovalovts) wrote :

Is the fix going to be merged to icehouse?

Shuquan Huang (shuquan) wrote :

When will the patch be fixed?

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Yaguang Tang (<email address hidden>) on branch: master
Review: https://review.openstack.org/84916

ling-yun (zengyunling) wrote :

Hi Yaguang Tang, are you still working on this patch?
If not, could you assign it to me? I am interested in Ceph RBD problems.
Looking forward to your reply.

Yaguang Tang (heut2008) wrote :

Another patch fixes this issue: https://review.openstack.org/#/c/91722/

Changed in nova:
assignee: Yaguang Tang (heut2008) → nobody
Tracy Jones (tjones-i)
Changed in nova:
status: In Progress → Triaged
Changed in nova:
assignee: nobody → Dmitry Borodaenko (dborodaenko)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/91722
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=bc45c56f102cdef58840e02b609a89f5278e8cce
Submitter: Jenkins
Branch: master

commit bc45c56f102cdef58840e02b609a89f5278e8cce
Author: Dmitry Borodaenko <email address hidden>
Date: Thu Nov 21 16:05:19 2013 -0800

    Improve shared storage checks for live migration

    Due to an assumption that libvirt live migrations work only when both
    instance path and disk data is shared between source and destination
    hosts (e.g. libvirt instances directory is on NFS), instance disks are
    removed from shared storage when instance path is not shared (e.g. Ceph
    RBD backend is enabled).

    Distinguish cases that require shared instance drive and shared libvirt
    instance directory. Reflect the fact that RBD backed instances have
    shared instance drive (and no shared libvirt instance directory) in the
    relevant conditionals.

    UpgradeImpact: Live migrations from or to a compute host running a
    version of Nova pre-dating this commit are disabled in order to
    eliminate possibility of data loss. Upgrade Nova on both the source and
    the target node before attempting a live migration.

    Closes-bug: 1250751
    Closes-bug: 1314526
    Co-authored-by: Ryan Moe <email address hidden>
    Co-authored-by: Yaguang Tang <email address hidden>
    Signed-off-by: Dmitry Borodaenko <email address hidden>
    Change-Id: I2755c59b4db736151000dae351fd776d3c15ca39
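
The distinction drawn in the commit message can be sketched as a small decision function. This is an illustrative sketch only, not the code from the commit: the function name `validate_live_migration`, the exception class, the parameter names, and the returned strings are all hypothetical. It also assumes that a shared instance directory implies file-backed disks stored inside it.

```python
class MigrationCheckError(Exception):
    """Raised when the requested migration type does not fit the storage layout."""


def validate_live_migration(block_migration, shared_instance_path,
                            shared_block_storage):
    """Decide how to migrate from two separate facts about storage.

    shared_instance_path: the libvirt instance directory is visible on both
        hosts (e.g. NFS).
    shared_block_storage: the instance disks are reachable from both hosts
        without a shared directory (e.g. Ceph RBD).
    """
    disks_shared = shared_instance_path or shared_block_storage

    if block_migration:
        if disks_shared:
            raise MigrationCheckError(
                "block migration cannot be used with shared storage")
        return "copy disks to destination"

    if not disks_shared:
        raise MigrationCheckError(
            "live migration cannot be used without shared storage")
    return "migrate without copying disks"


# RBD-backed instance: no shared directory, but shared disks.
print(validate_live_migration(False, False, True))
```

With a single "is the instance directory shared?" flag, the RBD case in the last line would have been rejected; keeping the two facts separate lets it pass.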

Changed in nova:
status: In Progress → Fix Committed
Changed in nova:
milestone: none → juno-2
status: Fix Committed → Fix Released
Yaguang Tang (heut2008)
tags: added: icehouse-backport-potential
Jon Proulx (jproulx) wrote :

I'm running Icehouse, and this bug affects my ability to upgrade to Juno (my nova nodes are running Ubuntu 12.04, so I need to vacate the nodes and reinstall them as part of the upgrade path).

I'd really love it if someone could do the backport. My naive attempt to simply cherry-pick the commit into stable/icehouse didn't go so well. I'm unfamiliar with the code and likely made some poor choices resolving conflicts, but it doesn't look like it should be much work for someone with slightly more clue.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/icehouse)

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/124161

Jon Proulx (jproulx) wrote :

I tell a lie, I had a config issue in my test env. The cherry-pick did seem to solve the issue on Icehouse for me.

Ante Karamatić (ivoks)
tags: added: cts unified-objects
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-2 → 2014.2
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/icehouse)

Change abandoned by Sean Dague (<email address hidden>) on branch: stable/icehouse
Review: https://review.openstack.org/124161
Reason: This review is > 4 weeks without comment and currently blocked by a core reviewer with a -2. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and contacting the reviewer with the -2 on this review to ensure you address their concerns.
