hook failed: "ceph-access-relation-changed" after removing and re-adding cinder-ceph application

Bug #1859869 reported by Vladimir Grevtsev
This bug affects 5 people
Affects: OpenStack Nova Compute Charm
Status: Triaged
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

The nova-compute charm does not properly handle the situation where the cinder-ceph application is removed, redeployed, and re-related to nova-compute: the ceph-access-relation-changed hook fails, complaining that the libvirt (virsh) secret already exists.

Steps to reproduce:

ubuntu@OrangeBox84 ~ » juju remove-application cinder-ceph
removing application cinder-ceph

...

ubuntu@OrangeBox84 ~ » juju deploy cs:cinder-ceph
Located charm "cs:cinder-ceph-251".
Deploying charm "cs:cinder-ceph-251".
ubuntu@OrangeBox84 ~ » juju add-relation cinder-ceph cinder
ubuntu@OrangeBox84 ~ » juju add-relation cinder-ceph ceph-mon-cinder
ubuntu@OrangeBox84 ~ » juju add-relation cinder-ceph nova-compute

...

nova-compute/0 error idle 1 172.27.84.202 hook failed: "ceph-access-relation-changed"

All of the units are showing the following:

unit-nova-compute-0: 19:44:35 INFO unit.nova-compute/0.juju-log ceph-access:68: Defining new libvirt secret for uuid b8a3167f-6b26-4fef-aba9-4b34e8827909.
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed error: Failed to set attributes from /etc/ceph/secret-cinder-ceph.xml
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed error: internal error: a secret with UUID 919fa72e-68bd-4150-aaf4-65b8568eb8ea already defined for use with client.cinder-ceph secret
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed Traceback (most recent call last):
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/ceph-access-relation-changed", line 761, in <module>
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed main()
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/ceph-access-relation-changed", line 754, in main
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed hooks.execute(sys.argv)
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/charmhelpers/core/hookenv.py", line 914, in execute
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed self._hooks[hook_name]()
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/ceph-access-relation-changed", line 675, in ceph_access
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed key=key)
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/nova_compute_utils.py", line 707, in create_libvirt_secret
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed check_call(cmd)
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed File "/usr/lib/python3.5/subprocess.py", line 581, in check_call
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed raise CalledProcessError(retcode, cmd)
unit-nova-compute-0: 19:44:35 DEBUG unit.nova-compute/0.ceph-access-relation-changed subprocess.CalledProcessError: Command '['virsh', '-c', 'qemu:///system', 'secret-define', '--file', '/etc/ceph/secret-cinder-ceph.xml']' returned non-zero exit status 1
unit-nova-compute-0: 19:44:35 ERROR juju.worker.uniter.operation hook "ceph-access-relation-changed" failed: exit status 1
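
The traceback shows create_libvirt_secret() in nova_compute_utils.py calling virsh secret-define while a secret for client.cinder-ceph is still defined from the previous cinder-ceph deployment. A minimal sketch of an idempotent variant is shown below; the function signature, the usage_name lookup via virsh secret-list, and the secret-set-value step are illustrative assumptions, not the charm's actual code.

```python
# Illustrative sketch only (not the charm's actual implementation):
# undefine any libvirt secret already registered for the same ceph client
# usage before defining the new one, so re-relating cinder-ceph does not
# trip over a stale secret.
import subprocess

VIRSH = ['virsh', '-c', 'qemu:///system']


def _undefine_stale_secret(usage_name):
    # 'virsh secret-list' prints a table of "UUID  Usage" rows; the ceph
    # usage column looks like "ceph client.cinder-ceph secret".
    out = subprocess.check_output(VIRSH + ['secret-list'],
                                  universal_newlines=True)
    for line in out.splitlines():
        if usage_name in line:
            uuid = line.split()[0]
            subprocess.check_call(VIRSH + ['secret-undefine', uuid])


def create_libvirt_secret(secret_file, secret_uuid, key, usage_name):
    # Drop a leftover secret from a removed cinder-ceph application
    # (assumption: it is safe to discard it), then define and set the new one.
    _undefine_stale_secret(usage_name)
    subprocess.check_call(VIRSH + ['secret-define', '--file', secret_file])
    subprocess.check_call(VIRSH + ['secret-set-value', '--secret', secret_uuid,
                                   '--base64', key])
```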

The charm was able to proceed and moved to the "active/idle" state after I removed the secret from virsh on each nova-compute host:

ubuntu@OrangeBox84 ~/fce-demo (fce-on-orange-box *%) » juju run 'virsh secret-undefine 919fa72e-68bd-4150-aaf4-65b8568eb8ea' --application 'nova-compute'
- Stdout: |+
    Secret 919fa72e-68bd-4150-aaf4-65b8568eb8ea deleted

  UnitId: nova-compute/0
- Stdout: |+
    Secret 919fa72e-68bd-4150-aaf4-65b8568eb8ea deleted

  UnitId: nova-compute/1
- Stdout: |+
    Secret 919fa72e-68bd-4150-aaf4-65b8568eb8ea deleted

  UnitId: nova-compute/2

ubuntu@OrangeBox84 ~/fce-demo (fce-on-orange-box *%) » juju resolve nova-compute/0
ubuntu@OrangeBox84 ~/fce-demo (fce-on-orange-box *%) » juju resolve nova-compute/1
ubuntu@OrangeBox84 ~/fce-demo (fce-on-orange-box *%) » juju resolve nova-compute/2

ubuntu@OrangeBox84 ~/fce-demo (fce-on-orange-box *%) » juju status nova-compute
Model Controller Cloud/Region Version SLA Timestamp
openstack foundations-maas maas_cloud 2.7.0 unsupported 19:55:40Z

App Version Status Scale Charm Store Rev OS Notes
ceph-osd waiting 0 ceph-osd jujucharms 294 ubuntu
neutron-openvswitch 12.1.0 active 3 neutron-openvswitch jujucharms 269 ubuntu
nova-compute 17.0.12 active 3 nova-compute jujucharms 309 ubuntu
ntp 4.2.8p4+dfsg active 3 ntp jujucharms 35 ubuntu

Unit Workload Agent Machine Public address Ports Message
nova-compute/0 active idle 1 172.27.84.202 Unit is ready
  neutron-openvswitch/1 active idle 172.27.84.202 Unit is ready
  ntp/2 active idle 172.27.84.202 123/udp ntp: Ready
nova-compute/1* active idle 2 172.27.84.203 Unit is ready
  neutron-openvswitch/2 active idle 172.27.84.203 Unit is ready
  ntp/3 active idle 172.27.84.203 123/udp ntp: Ready
nova-compute/2 active idle 3 172.27.84.204 Unit is ready
  neutron-openvswitch/0* active idle 172.27.84.204 Unit is ready
  ntp/0* active idle 172.27.84.204 123/udp ntp: Ready

full juju status: https://pastebin.canonical.com/p/MkqbBCMqVw/

Tags: scaleback
Changed in charm-nova-compute:
status: New → Triaged
importance: Undecided → Medium
tags: added: scaleback
Changed in charm-nova-compute:
milestone: none → 20.04
Revision history for this message
David Coronel (davecore) wrote :

I just hit this bug in a xenial-queens openstack when relating to a new ceph cluster (bionic-queens luminous) using ceph-proxy.

Removing the nova-compute to cinder-ceph relation did not remove the libvirt secret.

My workaround was to remove <uuid>.base64 and <uuid>.xml in /etc/libvirt/secrets and also run:

sudo virsh secret-undefine <uuid>
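
A rough Python equivalent of that cleanup, for reference (purge_libvirt_secret is a hypothetical helper, and <uuid> stays a placeholder for the UUID reported in the hook error):

```python
# Rough sketch of the cleanup described above (hypothetical helper, not charm
# code): delete the on-disk secret files and undefine the secret in libvirt.
import os
import subprocess


def purge_libvirt_secret(uuid):
    for ext in ('base64', 'xml'):
        path = '/etc/libvirt/secrets/{}.{}'.format(uuid, ext)
        if os.path.exists(path):
            os.remove(path)
    subprocess.check_call(['virsh', '-c', 'qemu:///system',
                           'secret-undefine', uuid])

# e.g. purge_libvirt_secret('<uuid>')  # <uuid> remains a placeholder
```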

Revision history for this message
macchese (max-liccardo) wrote :

I can't resolve it using your workaround:

unit-nova-compute-2: 00:20:42 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-nova-compute-3: 00:21:15 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-nova-compute-3: 00:21:16 INFO unit.nova-compute/3.juju-log ceph-access:122: Making dir /var/lib/charm/nova-compute root:root 555
unit-nova-compute-3: 00:21:16 INFO unit.nova-compute/3.juju-log ceph-access:122: Making dir /etc/ceph root:root 555
unit-nova-compute-3: 00:21:17 INFO unit.nova-compute/3.juju-log ceph-access:122: Registered config file: /etc/libvirt/qemu.conf
unit-nova-compute-3: 00:21:17 INFO unit.nova-compute/3.juju-log ceph-access:122: Registered config file: /etc/default/qemu-kvm
unit-nova-compute-3: 00:21:17 INFO unit.nova-compute/3.juju-log ceph-access:122: Registered config file: /etc/libvirt/libvirtd.conf
unit-nova-compute-3: 00:21:17 INFO unit.nova-compute/3.juju-log ceph-access:122: Registered config file: /etc/default/libvirt-bin
unit-nova-compute-3: 00:21:17 INFO unit.nova-compute/3.juju-log ceph-access:122: Registered config file: /etc/init/libvirt-bin.override
unit-nova-compute-3: 00:21:17 INFO unit.nova-compute/3.juju-log ceph-access:122: Registered config file: /etc/nova/nova.conf
unit-nova-compute-3: 00:21:17 INFO unit.nova-compute/3.juju-log ceph-access:122: Registered config file: /etc/nova/vendor_data.json
unit-nova-compute-3: 00:21:17 INFO unit.nova-compute/3.juju-log ceph-access:122: Registered config file: /etc/apparmor.d/usr.bin.nova-compute
unit-nova-compute-3: 00:21:17 INFO unit.nova-compute/3.juju-log ceph-access:122: Registered config file: /etc/nova/nova-compute.conf
unit-nova-compute-3: 00:21:17 INFO unit.nova-compute/3.juju-log ceph-access:122: Registered config file: /etc/ceph/secret.xml
unit-nova-compute-3: 00:21:17 INFO unit.nova-compute/3.juju-log ceph-access:122: Registered config file: /var/lib/charm/nova-compute/ceph.conf
unit-nova-compute-3: 00:21:22 WARNING unit.nova-compute/3.ceph-access-relation-changed ERROR cannot read settings for unit "cinder-ceph/1" in relation "nova-compute:ceph-access cinder-ceph:ceph-access": unit "cinder-ceph/1": settings not found
unit-nova-compute-3: 00:21:22 WARNING unit.nova-compute/3.ceph-access-relation-changed Traceback (most recent call last):
unit-nova-compute-3: 00:21:22 WARNING unit.nova-compute/3.ceph-access-relation-changed File "/var/lib/juju/agents/unit-nova-compute-3/charm/hooks/ceph-access-relation-changed", line 997, in <module>
unit-nova-compute-3: 00:21:22 WARNING unit.nova-compute/3.ceph-access-relation-changed main()
unit-nova-compute-3: 00:21:22 WARNING unit.nova-compute/3.ceph-access-relation-changed File "/var/lib/juju/agents/unit-nova-compute-3/charm/hooks/ceph-access-relation-changed", line 990, in main
unit-nova-compute-3: 00:21:22 WARNING unit.nova-compute/3.ceph-access-relation-changed hooks.execute(sys.argv)
unit-nova-compute-3: 00:21:22 WARNING unit.nova-compute/3.ceph-access-relation-changed File "/var/lib/juju/agents/unit-nova-compute-3/charm/hooks/charmhelpers/core/hookenv.py", line 963, in execute
unit-nova-compute-3: 00:21:22 WARNING unit.nova-compute/3.ceph-acce...


Revision history for this message
macchese (max-liccardo) wrote :

Using juju debug-log, it seems to me that nova-compute is missing settings from cinder-ceph because it is looking at the wrong unit (I deleted cinder and then redeployed it):

 unit-nova-compute-2: 16:15:31 WARNING unit.nova-compute/2.ceph-access-relation-changed ERROR cannot read settings for unit "cinder-ceph/1" in relation "nova-compute:ceph-access cinder-ceph:ceph-access": unit "cinder-ceph/1": settings not found

In fact, the cinder-ceph unit is now cinder-ceph/4, not cinder-ceph/1:

App Version Status Scale Charm Channel Rev Exposed Message
cinder 23.0.0 active 1 cinder 2023.2/stable 663 no Unit is ready
cinder-ceph 23.0.0 waiting 1 cinder-ceph 2023.2/stable 528 no Incomplete relations: nova-compute
cinder-mysql-router 8.0.35 active 1 mysql-router 8.0/stable 111 no Unit is ready
nova-cloud-controller 28.0.0 active 1 nova-cloud-controller 2023.2/stable 717 no Unit is ready
nova-compute 28.0.0 error 2 nova-compute 2023.2/stable 703 no hook failed: "ceph-access-relation-changed"
nova-mysql-router 8.0.35 active 1 mysql-router 8.0/stable 111 no Unit is ready
ntp 4.2 active 2 ntp stable 50 no chrony: Ready
ovn-chassis 23.09.0 active 2 ovn-chassis 23.09/stable 178 no Unit is ready

Unit Workload Agent Machine Public address Ports Message
cinder/1* active idle 1/lxd/12 192.168.70.20 8776/tcp Unit is ready
  cinder-ceph/4* waiting idle 192.168.70.20 Incomplete relations: nova-compute
  cinder-mysql-router/1* active idle 192.168.70.20 Unit is ready
nova-cloud-controller/1* active idle 0/lxd/10 192.168.70.30 8774-8775/tcp Unit is ready
  nova-mysql-router/1* active idle 192.168.70.30 Unit is ready
nova-compute/2 error idle 0 192.168.6.101 hook failed: "ceph-access-relation-changed"
  ntp/3 active idle 192.168.6.101 123/udp chrony: Ready
  ovn-chassis/6 active idle 192.168.6.101 Unit is ready
nova-compute/3* error idle 1 192.168.6.110 hook failed: "ceph-access-relation-changed"
  ntp/2* active idle 192.168.6.110 123/udp chrony: Ready
  ovn-chassis/5* active idle 192.168.6.110 Unit is ready
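
This looks like the hook reading ceph-access relation data for a cinder-ceph unit that no longer exists. Below is a minimal, illustrative sketch of a defensive guard using charmhelpers' relation helpers; the hook structure, the relation keys ('key', 'secret-uuid') and the skip-on-missing-data behaviour are assumptions, not the actual charm code.

```python
# Illustrative sketch only: skip related units whose ceph-access data is
# missing (e.g. a departed cinder-ceph unit) instead of failing the hook.
from charmhelpers.core.hookenv import (
    log,
    related_units,
    relation_get,
    relation_ids,
)


def ceph_access_settings():
    for rid in relation_ids('ceph-access'):
        for unit in related_units(rid):
            data = relation_get(rid=rid, unit=unit) or {}
            key = data.get('key')
            uuid = data.get('secret-uuid')
            if not key or not uuid:
                log('No usable ceph-access data from {} yet; '
                    'skipping.'.format(unit))
                continue
            yield unit, key, uuid
```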

Revision history for this message
Faiz Ahmed (faizahmedfarooqui) wrote :

I removed and redeployed the cinder, cinder-ceph and cinder-backup applications, and since then all nova-compute units have been throwing the same hook error, i.e. "ceph-access-relation-changed".

I tried using "juju debug-hooks nova-compute/0" and re-ran the same hook, i.e. ceph-access-relation-changed; I have shared the error below.

```
root@nova-compute:/var/lib/juju/agents/unit-nova-compute-0/charm# ./hooks/ceph-access-relation-changed ./hooks/ceph-access-relation-changed
error: Failed to set attributes from /etc/ceph/secret-cinder-ceph.xml
error: internal error: a secret with UUID 9dbed743-60b0-4544-9817-ce732fb4de61 already defined for use with client.cinder-ceph secret

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-nova-compute-0/charm/./hooks/ceph-access-relation-changed", line 1022, in <module>
    main()
  File "/var/lib/juju/agents/unit-nova-compute-0/charm/./hooks/ceph-access-relation-changed", line 1015, in main
    hooks.execute(sys.argv)
  File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/charmhelpers/core/hookenv.py", line 963, in execute
    self._hooks[hook_name]()
  File "/var/lib/juju/agents/unit-nova-compute-0/charm/./hooks/ceph-access-relation-changed", line 902, in ceph_access
    _configure_keyring(remote_service_name(rid), key, uuid)
  File "/var/lib/juju/agents/unit-nova-compute-0/charm/./hooks/ceph-access-relation-changed", line 883, in _configure_keyring
    create_libvirt_secret(secret_file=secrets_filename,
  File "/var/lib/juju/agents/unit-nova-compute-0/charm/hooks/nova_compute_utils.py", line 816, in create_libvirt_secret
    check_call(cmd)
  File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['virsh', '-c', 'qemu:///system', 'secret-define', '--file', '/etc/ceph/secret-cinder-ceph.xml']' returned non-zero exit status 1.
```

Changed in charm-nova-compute:
milestone: 20.04 → none