Failed to restart nova-compute.service: Unit var-lib-nova-instances.mount not found

Bug #2060395 reported by macchese
Affects: OpenStack Nova Compute Charm
Status: Triaged
Importance: Undecided
Assigned to: Unassigned
Milestone:

Bug Description

channel: 2023.2/stable
ubuntu: jammy

After scaling back a nova-compute unit according to https://docs.openstack.org/charm-guide/2023.2/admin/ops-scale-back-nova-compute.html, I tried to scale out again onto the same machine using https://docs.openstack.org/charm-guide/2023.2/admin/ops-scale-out-nova-compute.html.
After running juju add-unit nova-compute --to x, I noticed the following in the juju log:

WARNING unit.nova-compute/40.amqp-relation-changed Failed to restart nova-compute.service: Unit var-lib-nova-instances.mount not found.
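For reference, the overall sequence was roughly the following (a sketch only; N stands for whichever unit was removed, x for the target machine, and the guides above include additional preparation steps such as disabling the compute service and migrating instances before removal):

# scale back, after the guide's preparation steps
juju remove-unit nova-compute/N

# scale out again onto the same machine
juju add-unit nova-compute --to x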

So I SSHed into the host and executed:

# systemctl status nova-compute.service
○ nova-compute.service - OpenStack Compute
     Loaded: loaded (/lib/systemd/system/nova-compute.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/nova-compute.service.d
             └─99-mount.conf
     Active: inactive (dead)

and then:
systemctl start nova-compute.service
Failed to start nova-compute.service: Unit var-lib-nova-instances.mount not found.
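The 99-mount.conf drop-in seems to be what ties nova-compute.service to the missing mount unit. Something like the following can confirm that on the host (the drop-in contents shown in the comments below are only my assumption of what the charm writes, not copied from this node):

# show the effective unit definition, including drop-ins
systemctl cat nova-compute.service
# I assume the drop-in contains something along these lines:
#   [Unit]
#   Requires=var-lib-nova-instances.mount
#   After=var-lib-nova-instances.mount

# check whether the referenced mount unit exists at all
systemctl list-unit-files 'var-lib-nova-instances.mount'
findmnt /var/lib/nova/instances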

From the juju client:
juju status nova-compute
Model Controller Cloud/Region Version SLA Timestamp
openstack maas-controller1 maas-cloud/default 3.2.4 unsupported 18:04:41Z

App Version Status Scale Charm Channel Rev Exposed Message
nova-compute 28.0.1 blocked 1 nova-compute 2023.2/stable 718 no Services not running that should be: nova-compute
ovn-chassis-gw active 0 ovn-dedicated-chassis 23.09/stable 132 no Unit is ready

Unit Workload Agent Machine Public address Ports Message
nova-compute/40* blocked idle 8 192.168.6.110 Services not running that should be: nova-compute

Machine State Address Inst id Base AZ Message
8 started 192.168.6.110 op1 ubuntu@22.04 default Deployed

Revision history for this message
Alex Kavanagh (ajkavanagh) wrote:

I suspect that there's something not being cleaned up properly, as part of the scale back of the nova-compute unit.

It's fairly unusual to scale back from a machine and then try to scale out using the same machine, which is probably why this bug hasn't surfaced before. Please could you let us know the use case for this? i.e. are you trying to solve some other issue and scaling back was a proposed solution? Thanks.

Changed in charm-nova-compute:
status: New → Triaged
Revision history for this message
macchese (max-liccardo) wrote (last edit):

I scaled back a nova node with an ovn-dedicated-chassis because I had changed the node's hostname and DNS resolvers.
The node I scaled back was not also a ceph OSD node, so I resolved the issue by purging all the nova- and ceph-related .deb packages:

apt -y purge nova-common neutron-ovn-metadata-agent neutron-common qemu-system-common qemu-block-extra qemu-system-data qemu-system-gui qemu-system-x86 qemu-utils libvirt-daemon-driver-qemu ipxe-qemu-256k-compat-efi-roms ipxe-qemu libvirt-clients libvirt-daemon-config-network libvirt-daemon-config-nwfilter libvirt-daemon-system-systemd openstack-release python3-openstacksdk python3-oslo.config

rm -fr /var/lib/libvirt /var/lib/neutron/ /var/lib/nova /var/log/nova /var/lib/charm/nova-compute /etc/systemd/system/nova-compute.service.d /etc/nova /var/cache/nova

systemctl unmask nova-compute.service

apt -y purge ceph-common

And then I re-ran:

juju add-unit nova-compute --to x

Now it works.
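For anyone hitting the same error, a lighter-weight fix might be to remove just the stale drop-in and reload systemd rather than purging everything (an untested sketch; the full purge above is what I actually did):

rm -f /etc/systemd/system/nova-compute.service.d/99-mount.conf
systemctl daemon-reload
systemctl restart nova-compute.service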
