Comment 22 for bug 1452641

Revision history for this message
Tyler Stachecki (tstachecki) wrote :

We have also been bitten by this. Apologies if this does not help solve the bug, but this issue has been floating for quite awhile and the following may help future cloud operators...

In our case, we trying to re-IP ALL of our Ceph Mons. As Corey mentioned, this bug report is for *Cinder volumes*... but note that all of our instances were observed to make use of RBD-backed configuration drives which suffered the same problem as the images... so you may suffer from both problems even if you exclusively boot all instances from volume!

* RBD config drives AND Glance/image-based RBD volumes DID NOT have their Ceph Mon addresses updated as part of a live-migration, even with the patch in #9. The Ceph Mon addresses for these types in volumes IN PARTICULAR are NOT stored anywhere in a database and rather seem to be derived as needed when certain actions occur and otherwise carted around from hyp to hyp by way of the libvirt domain XML. Again, see the other LP bug for this.

* Trying to 'fix up' the Ceph Mon addresses via 'virsh edit' or comparable and then trying to live-migrate an instance to have those changes reflected is futile, because the Ceph Mon address changes are not reflected until a hard bounce of the VMM for that instance AND nova-compute uses the running copy of libvirt domain XML when shipping a copy to a destination hypervisor, NOT the copy on disk.

What we may end up doing (that worked in a lab environment) is to respin a patch off #9 that is applied to all worknode. It searches for all instances of './devices/disk/source' in the XML document which have an 'rbd' protocol. For each entry, we replace the current host subelements with our new Ceph Mon addresses. Then live-migrate every VM exactly once.

This works for all kinds of RBD volumes and, unlike 'virsh edit', works because the in-memory libvirt domain XML is rewritten prior to the VMM starting up on the destination host. Note that while you are doing the LMs and updating the domain XMLs, you must keep at least one of the old and new Ceph Mons accessible at all times.