On ceph-relation-changed the nova-compute (n-c) charm sets a Ceph key as a secret for libvirt. If the unit gets destroyed, this secret is left in place. If a unit gets reinstalled, the charm detects that a secret with the same UUID has already been set, but it tries to run `secret-define` anyway, which then fails.
IMHO it might be desirable to clean up secrets on unit destruction, from a sanitation POV. Failing that, the charm should not try to re-define an existing secret; just setting it to a new value should be enough.
2017-04-03 09:36:50 INFO ceph-relation-changed Traceback (most recent call last):
2017-04-03 09:36:50 INFO ceph-relation-changed File "/var/lib/juju/agents/unit-compute-only-17/charm/hooks/ceph-relation-changed", line 501, in <module>
2017-04-03 09:36:50 INFO ceph-relation-changed main()
2017-04-03 09:36:50 INFO ceph-relation-changed File "/var/lib/juju/agents/unit-compute-only-17/charm/hooks/ceph-relation-changed", line 494, in main
2017-04-03 09:36:50 INFO ceph-relation-changed hooks.execute(sys.argv)
2017-04-03 09:36:50 INFO ceph-relation-changed File "/var/lib/juju/agents/unit-compute-only-17/charm/hooks/charmhelpers/core/hookenv.py", line 715, in execute
2017-04-03 09:36:50 INFO ceph-relation-changed self._hooks[hook_name]()
2017-04-03 09:36:50 INFO ceph-relation-changed File "/var/lib/juju/agents/unit-compute-only-17/charm/hooks/charmhelpers/contrib/openstack/utils.py", line 1817, in wrapped_f
2017-04-03 09:36:50 INFO ceph-relation-changed restart_functions)
2017-04-03 09:36:50 INFO ceph-relation-changed File "/var/lib/juju/agents/unit-compute-only-17/charm/hooks/charmhelpers/core/host.py", line 524, in restart_on_change_helper
2017-04-03 09:36:50 INFO ceph-relation-changed r = lambda_f()
2017-04-03 09:36:50 INFO ceph-relation-changed File "/var/lib/juju/agents/unit-compute-only-17/charm/hooks/charmhelpers/contrib/openstack/utils.py", line 1816, in <lambda>
2017-04-03 09:36:50 INFO ceph-relation-changed (lambda: f(*args, **kwargs)), restart_map, stopstart,
2017-04-03 09:36:50 INFO ceph-relation-changed File "/var/lib/juju/agents/unit-compute-only-17/charm/hooks/ceph-relation-changed", line 364, in ceph_changed
2017-04-03 09:36:50 INFO ceph-relation-changed secret_uuid=CEPH_SECRET_UUID, key=key)
2017-04-03 09:36:50 INFO ceph-relation-changed File "/var/lib/juju/agents/unit-compute-only-17/charm/hooks/nova_compute_utils.py", line 657, in create_libvirt_secret
2017-04-03 09:36:50 INFO ceph-relation-changed check_call(cmd)
2017-04-03 09:36:50 INFO ceph-relation-changed File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
2017-04-03 09:36:50 INFO ceph-relation-changed raise CalledProcessError(retcode, cmd)
2017-04-03 09:36:50 INFO ceph-relation-changed subprocess.CalledProcessError: Command '['virsh', '-c', 'qemu:///system', 'secret-define', '--file', '/etc/ceph/secret.xml']' returned non-zero exit status 1
2017-04-03 09:36:50 ERROR juju.worker.uniter.operation runhook.go:107 hook "ceph-relation-changed" failed: exit status 1
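One way to fix this would be to make the secret handling idempotent: only call `secret-define` when the UUID is not already known to libvirt, and always (re)set the value afterwards with `secret-set-value`. The sketch below is illustrative, not the charm's actual code; the helper names and the `list_output` parameter (which exists only to make the parsing testable) are assumptions:

```python
import subprocess

def secret_exists(secret_uuid, list_output=None):
    """Return True if a libvirt secret with this UUID is already defined.

    If list_output is None, query virsh; otherwise parse the given text
    (handy for testing). This assumes the standard two-column
    'UUID  Usage' table printed by `virsh secret-list`.
    """
    if list_output is None:
        list_output = subprocess.check_output(
            ['virsh', '-c', 'qemu:///system', 'secret-list'],
            universal_newlines=True)
    # First whitespace-separated token on each row is the UUID.
    return any(line.split()[:1] == [secret_uuid]
               for line in list_output.splitlines())

def create_libvirt_secret(secret_file, secret_uuid, key):
    """Define the secret only when it is missing, then (re)set its value."""
    if not secret_exists(secret_uuid):
        subprocess.check_call(['virsh', '-c', 'qemu:///system',
                               'secret-define', '--file', secret_file])
    subprocess.check_call(['virsh', '-c', 'qemu:///system',
                           'secret-set-value', '--secret', secret_uuid,
                           '--base64', key])
```

With this shape, a reinstall that finds a leftover secret just refreshes its value instead of crashing the hook.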
The problem for us is that we have multiple clouds (from different customers) where we have to change the 'flavor' of machines on a more or less regular basis.
'Flavor' here refers to whether the physical host is set up to use VNF / SR-IOV / hugepages or not (or even a mix of these settings).
Since the name used to identify the secret is derived from the Juju application name, we get hit by this every time we remove one Juju application flavor and add the other, even though the Ceph cluster is the same, and thus the actual secret does not change, just the label.
Added field-medium to this ticket as the only workaround involves manually going to the machine and issuing a command to remove the secret, with all the risks involved.
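For reference, the manual workaround amounts to finding and undefining the stale secret on the affected machine (the UUID below is only an example; use the one reported by `secret-list`):

```shell
# List the secrets libvirt currently knows about, to find the stale one
virsh -c qemu:///system secret-list

# Remove the stale secret so the charm's next secret-define succeeds
virsh -c qemu:///system secret-undefine 514c9fca-8cbe-11e2-9c26-50e5492ad6ae
```

This has to be run as root on each affected compute host, which is exactly the risky manual step we would like to avoid.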