Comment 25 for bug 1671422

Revision history for this message
David Ames (thedac) wrote :

Updating this bug. We may decide to move this elsewhere it at some point.

We have a deployment that was upgraded through to pike at which point it was noticed that nova instances with ceph backed volumes would not start.

The cinder key was manually added to the nova-compute nodes in /etc/ceph and with:
sudo virsh secret-define --file /tmp/cinder.secret

However, this did not resolve the problem. It appeared libvirt was trying to use a mixed pair of usernames and keys. It was using the cinder username but the nova-compute key.

Looking at nova's code it falls back to nova.conf when it does not have a secret_uuid from cinder but it was not setting the username correctly.
https://github.com/openstack/nova/blob/stable/pike/nova/virt/libvirt/volume/net.py#L74

The following seems to mitigate this as a temporary fix on nova-compute until we can come up with a complete plan:

https://pastebin.ubuntu.com/p/tGm7C7fpXT/

diff --git a/nova/virt/libvirt/volume/net.py b/nova/virt/libvirt/volume/net.py
index cec43ce93b..8b0148df0b 100644
--- a/nova/virt/libvirt/volume/net.py
+++ b/nova/virt/libvirt/volume/net.py
@@ -71,6 +71,7 @@ class LibvirtNetVolumeDriver(libvirt_volume.LibvirtBaseVolumeDriver):
             else:
                 LOG.debug('Falling back to Nova configuration for RBD auth '
                           'secret_uuid value.')
+ conf.auth_username = CONF.libvirt.rbd_user
                 conf.auth_secret_uuid = CONF.libvirt.rbd_secret_uuid
             # secret_type is always hard-coded to 'ceph' in cinder
             conf.auth_secret_type = netdisk_properties['secret_type']

Apply to /usr/lib/python2.7/dist-packages/nova/virt/libvirt/volume/net.py

We still need a migration plan to get from the topology with nova-compute directly related to ceph to the topology with cinder-ceph related to nova-compute using ceph-access which would populate cinder's secret_uuid.
It is possible we will need to carry the patch for existing instances. It may be worth getting that upstream as master has the same problem.