charms: nova/cinder/ceph rbd integration broken on Ocata
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| OpenStack Charm Guide | Fix Released | High | James Page | 17.05 |
| OpenStack cinder charm | Invalid | Critical | Liam Young | 19.04 |
| OpenStack cinder-ceph charm | Fix Released | Critical | James Page | 17.05 |
| OpenStack nova-compute charm | Fix Released | Critical | James Page | 17.05 |
Bug Description
https:/
As a result, it's not possible to attach Ceph block devices to instances in a charm-deployed Ocata cloud: the secret_uuid configuration is not populated in the cinder configuration file, and in any case the username on the compute units won't match the Ceph username used on the cinder units (compute and cinder units get different keys created), so I don't think the key created on the compute units will actually work with the username provided from cinder.
I'm not 100% convinced this is a great change in behaviour; the cinder and nova keys have much the same permissions for correct operation (rwx on the images, volumes and vms pools), but it does mean that the nova-compute units have to have the same keys as the cinder units. A key disclosure/
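To illustrate the mismatch being described: on Ocata the per-backend section of cinder.conf and the libvirt secret on every compute node must agree on both the Ceph username and the secret UUID. A hypothetical sketch of the two values that have to line up (the backend name and UUID here are illustrative, taken from examples later in this bug, not a definitive configuration):

```ini
# cinder.conf on the cinder units -- per-backend section (illustrative)
[cinder-ceph]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = cinder-ceph
rbd_user = cinder-ceph
# Must be populated, and a libvirt secret with this UUID holding the
# *same* cephx key must exist on every nova-compute node:
rbd_secret_uuid = 514c9fca-8cbe-11e2-9c52-3bc8c7819472
```

The bug is precisely that the charms left rbd_secret_uuid unset and gave the compute units a different key/username than the cinder units.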
Changed in charm-cinder-ceph: | |
importance: | Undecided → Critical |
Changed in charm-nova-compute: | |
importance: | Undecided → Critical |
Changed in charm-cinder-ceph: | |
status: | New → Triaged |
Changed in charm-nova-compute: | |
status: | New → Triaged |
summary: |
- nova/cinder/ceph rbd integration broken on Ocata + charms: nova/cinder/ceph rbd integration broken on Ocata |
description: | updated |
James Page (james-page) wrote : | #1 |
James Page (james-page) wrote : | #2 |
Error from a boot from volume check:
Details: {'message': 'internal error: process exited while connecting to monitor: 2017-03-
James Page (james-page) wrote : | #3 |
(note this is from a transient test environment so keys not sensitive)
James Page (james-page) wrote : | #4 |
cross referencing with bug 1635008
James Page (james-page) wrote : | #5 |
Resolution of this in the charms might look something like this:
1) addition of new relation between cinder-ceph and nova-compute
cinder-ceph will need to provide its cephx key + the UUID that it will use in its configuration files for this purpose; this cannot be a fixed value (as it is in nova-compute) since multiple backends may be in use, so the UUID must be specific to the backend (with some complexity in HA deployments around which unit generates the UUID and how the other units observe and consume it - via leader storage).
2) updates to nova-compute
consumption of the new interface, storage of secret in libvirt using the UUID provided from the cinder-ceph charm.
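The leader-storage UUID handling described in (1) could be sketched as follows. This is a hypothetical helper, not charm code: `ensure_backend_uuid` and the dict-like `leader_storage` interface are invented for illustration of the "lead unit generates once, everyone reads the same value" pattern.

```python
import uuid


def ensure_backend_uuid(leader_storage, backend):
    """Return a stable secret UUID for a cinder-ceph backend.

    Hypothetical sketch of the pattern described above: the lead unit
    generates a UUID exactly once and stores it in leader storage; all
    peer units (and, via the relation, the nova-compute units) then
    observe and reuse that same value, keeping cinder.conf's
    rbd_secret_uuid and the libvirt secrets in sync.
    """
    key = 'secret-uuid-{}'.format(backend)
    if key not in leader_storage:
        leader_storage[key] = str(uuid.uuid4())
    return leader_storage[key]
```

Because the UUID is keyed by backend name, multiple cinder-ceph backends in one deployment each get their own stable UUID.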
I'm still not hugely keen on having the compute units share keys with the cinder-ceph units.
James Page (james-page) wrote : | #6 |
(and to confirm - updating the secret on compute nodes to use the cinder-ceph key results in a functioning cloud - but that's not a fix - just to confirm username and key must match).
description: | updated |
Changed in charm-cinder-ceph: | |
milestone: | none → 17.05 |
Changed in charm-nova-compute: | |
milestone: | none → 17.05 |
Changed in charm-cinder-ceph: | |
assignee: | nobody → James Page (james-page) |
Changed in charm-nova-compute: | |
assignee: | nobody → James Page (james-page) |
Changed in charm-cinder-ceph: | |
status: | Triaged → In Progress |
Changed in charm-nova-compute: | |
status: | Triaged → In Progress |
Fix proposed to branch: master
Review: https:/
Fix proposed to branch: master
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit 1467cbb1b3883cf
Author: James Page <email address hidden>
Date: Thu Mar 9 12:51:25 2017 +0000
Fix support for cinder ceph rbd in Ocata
As of Ocata, the ceph key used to access a specific Cinder
Ceph backend must match the name of the key used by cinder,
with an appropriate secret configured for libvirt use with
the cephx key used by the cinder-ceph charm.
Add support for the new ceph-access relation to allow
nova-compute units to communicate with multiple ceph
backends using different cephx keys and user names.
The side effect of this change is that nova-compute will
have a key for use with its own ephemeral backend ceph
access, and a key for each cinder ceph backend configured
in the deployment.
Change-Id: I638473fc46c99a
Closes-Bug: 1671422
Changed in charm-nova-compute: | |
status: | In Progress → Fix Committed |
Fix proposed to branch: stable/17.02
Review: https:/
Changed in charm-guide: | |
status: | New → In Progress |
importance: | Undecided → High |
assignee: | nobody → James Page (james-page) |
milestone: | none → 17.05 |
Fix proposed to branch: master
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit 62613456e7a04ac
Author: James Page <email address hidden>
Date: Thu Mar 9 12:59:06 2017 +0000
Fix support for cinder ceph rbd on Ocata
As of Ocata, the ceph key used to access a specific Cinder
Ceph backend must match the name of the key used by cinder,
with an appropriate secret configured for libvirt use with
the cephx key used by the cinder-ceph charm.
Add support for the new ceph-access relation to allow
nova-compute units to communicate with multiple ceph
backends using different cephx keys and user names.
The lead cinder-ceph unit will generate a UUID for use in
the cinder configuration file, and for use by the remote
nova-compute units when configuring libvirt secrets,
ensuring that both ends of the integration match up.
The side effect of this change is that nova-compute will
have a key for use with its own ephemeral backend ceph
access, and a key for each cinder ceph backend configured
in the deployment.
Change-Id: I974ecb39132fed
Closes-Bug: 1671422
Changed in charm-cinder-ceph: | |
status: | In Progress → Fix Committed |
Fix proposed to branch: stable/17.02
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit bb9cac0a1c25292
Author: James Page <email address hidden>
Date: Tue Mar 14 08:29:44 2017 +0000
Add additional release note for cinder-ceph storage
A new relation is required to support key sharing between
the cinder-ceph and nova-compute charms, providing better
support for use of multiple storage backends.
Add a release note to this effect.
Change-Id: Idc32c75593c0ac
Closes-Bug: 1671422
Changed in charm-guide: | |
status: | In Progress → Fix Released |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/17.02
commit fcd1afbe8b8d0c9
Author: James Page <email address hidden>
Date: Thu Mar 9 12:59:06 2017 +0000
Fix support for cinder ceph rbd on Ocata
As of Ocata, the ceph key used to access a specific Cinder
Ceph backend must match the name of the key used by cinder,
with an appropriate secret configured for libvirt use with
the cephx key used by the cinder-ceph charm.
Add support for the new ceph-access relation to allow
nova-compute units to communicate with multiple ceph
backends using different cephx keys and user names.
The lead cinder-ceph unit will generate a UUID for use in
the cinder configuration file, and for use by the remote
nova-compute units when configuring libvirt secrets,
ensuring that both ends of the integration match up.
The side effect of this change is that nova-compute will
have a key for use with its own ephemeral backend ceph
access, and a key for each cinder ceph backend configured
in the deployment.
Change-Id: I974ecb39132fed
Closes-Bug: 1671422
(cherry picked from commit 62613456e7a04ac
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/17.02
commit e0c187cb7aa87fa
Author: James Page <email address hidden>
Date: Thu Mar 9 12:51:25 2017 +0000
Fix support for cinder ceph rbd in Ocata
As of Ocata, the ceph key used to access a specific Cinder
Ceph backend must match the name of the key used by cinder,
with an appropriate secret configured for libvirt use with
the cephx key used by the cinder-ceph charm.
Add support for the new ceph-access relation to allow
nova-compute units to communicate with multiple ceph
backends using different cephx keys and user names.
The side effect of this change is that nova-compute will
have a key for use with its own ephemeral backend ceph
access, and a key for each cinder ceph backend configured
in the deployment.
Change-Id: I638473fc46c99a
Closes-Bug: 1671422
(cherry picked from commit 1467cbb1b3883cf
Changed in charm-cinder-ceph: | |
status: | Fix Committed → Fix Released |
Changed in charm-nova-compute: | |
status: | Fix Committed → Fix Released |
no longer affects: | nova |
Ryan Beisner (1chb1n) wrote : | #17 |
Need to revisit this for the scenario where the cinder-ceph subordinate is not in use, and the cinder charm is used with the ceph* charms directly.
Darin Arrick (darinavbt) wrote : | #18 |
I think I've also run into this. I deployed a MAAS+Autopilot cloud a couple of weeks ago. Everything seems to work except attaching volumes. I spoke with David B. from Canonical, and he suggested I post here as well.
Environment: new deployment, based on https:/
"juju status" on controller: https:/
nova-compute.log from the compute node in question: https:/
Two things:
1) How do I prove that my issue is this bug? The lack of rbd_secret_uuid somewhere?
2) What's the workaround/fix? My deployment is new and strictly for testing at this point, so I can do whatever is needed.
Nobuto Murata (nobuto) wrote : | #19 |
It would be nice if the openstack-base bundle in the charm store had the "ceph-access" relation added in this bug, as a reference for everyone.
The current revision targets Newton, not Ocata:
https:/
However, development one with Ocata does not have the newly added relation yet:
https:/
Frode Nordahl (fnordahl) wrote : | #20 |
PR already up here: https:/
This will currently only work with the next charms for cinder-ceph and nova-compute. The necessary commits are subject for release in the upcoming charm release.
Nobuto Murata (nobuto) wrote : | #21 |
> PR already up here: https:/
Nice!
> This will currently only work with the next charms for cinder-ceph and nova-compute. The necessary commits are subject for release in the upcoming charm release.
Hmm, don't the stable charms have the "ceph-access" relation already? It looks like the fix has been backported to the 17.02 branch:
https:/
https:/
Frode Nordahl (fnordahl) wrote : | #22 |
The relation is there, but last I checked it did not contain the required data for it to work. If it was intended to be backported I'll check again and try to track down what's missing if it does not work.
Frode Nordahl (fnordahl) wrote : | #23 |
This is indeed backported to stable but in some circumstances the ceph-access relation never completes both in stable and in master.
I have filed bug 1711642.
Changed in charm-cinder: | |
status: | New → Won't Fix |
Chris Sanders (chris.sanders) wrote : | #24 |
Subscribed field-critical.
The 'Won't Fix' for the cinder charm seems in conflict with https:/
A cloud running the cinder charm, then upgraded to Ocata, does not currently appear to have a way to use its cinder volumes. No known workaround currently.
Changed in charm-cinder: | |
status: | Won't Fix → Confirmed |
importance: | Undecided → Critical |
assignee: | nobody → David Ames (thedac) |
milestone: | none → 19.04 |
David Ames (thedac) wrote : | #25 |
Updating this bug. We may decide to move this elsewhere at some point.
We have a deployment that was upgraded through to Pike, at which point it was noticed that nova instances with Ceph-backed volumes would not start.
The cinder key was manually added to the nova-compute nodes in /etc/ceph and with:
sudo virsh secret-define --file /tmp/cinder.secret
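For reference, the file passed to virsh secret-define above would be a libvirt secret definition along these lines (a hedged sketch: the UUID and usage name are illustrative, not values taken from the affected deployment):

```xml
<!-- /tmp/cinder.secret: libvirt secret definition (illustrative values) -->
<secret ephemeral='no' private='no'>
  <uuid>514c9fca-8cbe-11e2-9c52-3bc8c7819472</uuid>
  <usage type='ceph'>
    <name>client.cinder secret</name>
  </usage>
</secret>
```

After defining the secret, the cephx key itself would typically be attached with `virsh secret-set-value --secret <uuid> --base64 <key>`.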
However, this did not resolve the problem. It appeared libvirt was trying to use a mismatched pair of username and key: the cinder username with the nova-compute key.
Looking at nova's code, it falls back to nova.conf when it does not have a secret_uuid from cinder, but it was not setting the username correctly.
https:/
The following seems to mitigate this as a temporary fix on nova-compute until we can come up with a complete plan:
https:/
diff --git a/nova/
index cec43ce93b.
--- a/nova/
+++ b/nova/
@@ -71,6 +71,7 @@ class LibvirtNetVolum
else:
+ conf.auth_username = CONF.libvirt.
# secret_type is always hard-coded to 'ceph' in cinder
Apply to /usr/lib/
We still need a migration plan to get from the topology with nova-compute directly related to ceph to the topology with cinder-ceph related to nova-compute using ceph-access which would populate cinder's secret_uuid.
It is possible we will need to carry the patch for existing instances. It may be worth getting that upstream as master has the same problem.
Frode Nordahl (fnordahl) wrote : | #26 |
Referencing some other issues that makes the cinder -> cinder-ceph migration more complicated:
- bug 1727184
- bug 1768922
- bug 1773800
Corey Bryant (corey.bryant) wrote : | #27 |
@thedac, I've created bug 1809454 to track your fix from comment 25.
Frode Nordahl (fnordahl) wrote : | #28 |
As for the charm migration path, we need to provide the means for an administrator to morph an existing model without the `cinder-ceph` subordinate into a model with the `cinder-ceph` subordinate.
What I would propose we do is to document the existence and use of the `rename-
In addition to that we would need to either have the proposed Nova fallback fix landed or find another way to have Nova update the block_device_
Frode Nordahl (fnordahl) wrote : | #29 |
FWIW; here is a bundle useful for testing the scenario: https:/
David Ames (thedac) wrote : | #30 |
It would seem the upgrade to Ocata changes the auth_username to cinder in the database and leaves secret_uuid Null. This may be because cinder did not already have a rbd_secret_uuid set during the upgrade. Adding cinder-ceph to the equation adds this but does not on its own update the nova DB. (more testing needed)
The patch [0] and the package updates [1] will be required for the fall back to nova's rbd_username and rbd_secret_uuid for existing volume backed instances.
The path forward:
When [1] packages are available update packages on nodes. This will handle all existing instances.
Add cinder ceph to the model. New instances will use the cinder-ceph credentials.
Needs further testing:
Remove relation between cinder and ceph-mon
Test non nova cinder volumes after the topology change
To future proof against the fallback being removed:
Either update the DB similar to [2]
Or create an action similar to [3] that does this for us.
[0] https:/
[1] https:/
[2] https:/
[3] https:/
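The "update the DB" option amounts to rewriting each block_device_mapping's connection_info blob so that rbd volumes point at the cinder-ceph credentials. A minimal Python sketch of the idea (illustrative only: `retarget_connection_info` is an invented helper, assuming the standard rbd connection_info layout with `driver_volume_type` and a `data` dict; it is not the referenced migration code):

```python
import json


def retarget_connection_info(ci_json, new_user, new_uuid):
    """Rewrite a block_device_mapping connection_info JSON blob so an
    rbd-backed volume uses the cinder-ceph username and secret UUID.

    Sketch of the approach behind references [2]/[3] above; all other
    fields in the blob are left untouched.
    """
    ci = json.loads(ci_json)
    if ci.get('driver_volume_type') == 'rbd':
        data = ci.setdefault('data', {})
        data['auth_username'] = new_user
        data['secret_uuid'] = new_uuid
    return json.dumps(ci)
```

An action (the [3] route) would apply this per instance row instead of patching nova to fall back at attach time.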
Changed in charm-cinder: | |
assignee: | David Ames (thedac) → nobody |
tags: | added: canonical-bootstack |
David Ames (thedac) wrote : | #31 |
Packages from https:/
This should enable the upgrade from Pike (with the cowboy patch) to Queens-proposed with the fix in the packages.
juju config nova-cloud-
juju config nova-compute cloud:xenial-
Then run the openstack-upgrade action. Note: the rest of the cloud can use cloud:xenial-queens and will need to be upgraded as well.
I have run through a quick and dirty upgrade test (nova only) from newton to queens. Confirmed the problem in ocata and pike and the fix in queens proposed.
The fix was introduced in Stein and will therefore be available for the foreseeable future. This means the fallback to the nova-configured ceph authentication will be available until we can confirm a complete migration path from nova-compute<
David Ames (thedac) wrote : | #32 |
WARNING
Live migrations and snapshots stop working, potentially as soon as the upgrade from Newton to Ocata occurs; confirmed after the below topology change.
Existing instances can be stopped and started, but not snapshotted or migrated.
New instances can do all of the above.
Topology migration path
juju deploy cs:cinder-ceph
juju add-relation cinder cinder-ceph
juju add-relation cinder-ceph ceph-mon
juju remove-relation cinder ceph-mon
juju add-relation cinder-ceph nova-compute
# If [CEPH] block still in /etc/cinder/
juju config cinder debug=True
New instances will have auth_username cinder-ceph and secret-uuid populated.
David Ames (thedac) wrote : | #33 |
The problem with live migration et al. seems to occur at the time of the upgrade from Newton to Ocata, not necessarily with the cinder-ceph topology change.
For example, an instance is created at Newton, then the cloud is upgraded to Ocata. Attempting to live migrate:
Source host:
2019-01-14 19:18:01.258 4742 ERROR nova.compute.
2019-01-14 19:18:01.775 4742 ERROR root [req-5e9c7fd1-
', ' File "/usr/lib/
block_
', ' File "/usr/lib/
disk=disk, migrate_
', ' File "/usr/lib/
retry=
', ' File "/usr/lib/
timeout=
', ' File "/usr/lib/
retry=retry)
', ' File "/usr/lib/
raise result
', 'RemoteError: Remote error: ClientException Internal Server Error (HTTP 500)
[u\'Traceback (most recent call last):\
\', u\' File "/usr/lib/
res = self.dispatcher
\', u\' File "/usr/lib/
return self._do_
\', u\' File "/usr/lib/
result = func(ctxt, **new_args)\
\', u\' File "/usr/lib/
function_name, call_dict, binary)\
\', u\' File "/usr/lib/
self.
\', u\' File "/usr/lib/
six.
\', u\' File "/usr/lib/
return f(self, context, *args, **kw)\
\', u\' File "/usr/lib/
return function(self, context, *args, **kwargs)\
\', u\' File "/usr/lib/
kwargs[
\', u\' File "/usr/lib/
self.
\', u\' File "/usr/lib/
six.
\', u\' File "/usr/lib/
Changed in charm-cinder: | |
assignee: | nobody → Liam Young (gnuoy) |
Liam Young (gnuoy) wrote : | #34 |
The charm guide now contains instructions for migrating to the cinder-ceph charm (https:/
Changed in charm-cinder: | |
status: | Confirmed → Invalid |
Liam Young (gnuoy) wrote : | #35 |
Spoke with xavpaice and chris.sanders and they agreed that the field-crit tag can be removed.
As a quick fix I've tried adding the uuid for the nova-compute created secret to cinder (this is a global constant for the charms):
<disk type="network" device="disk">
  <driver name="qemu" type="raw" cache="none"/>
  <source protocol="rbd" name="cinder-ceph/volume-bdff2036-c0da-438d-aa95-d882d408df92">
    <host name="10.5.25.226" port="6789"/>
    <host name="10.5.25.227" port="6789"/>
    <host name="10.5.25.229" port="6789"/>
  </source>
  <auth username="cinder-ceph">
    <secret type="ceph" uuid="514c9fca-8cbe-11e2-9c52-3bc8c7819472"/>
  </auth>
  <target bus="virtio" dev="vdb"/>
  <serial>bdff2036-c0da-438d-aa95-d882d408df92</serial>
</disk>
results in the correct XML, however the username mismatches with the keys so the attach fails.