charm nova-compute doesn't deregister the service on unit removal

Bug #1691998 reported by Patrizio Bassi
This bug affects 2 people
Affects: OpenStack Nova Compute Charm
Status: Fix Released
Importance: Low
Assigned to: Martin Kalcok

Bug Description

OpenStack is deployed with an openstack-base bundle via the MAAS provider

juju add-unit nova-compute
<a new bare metal machine is deployed>
openstack and nova see the new compute node, and it's working
(nova service-list shows the new machine)

juju remove-unit nova-compute/X does remove the unit and the software, and even releases the machine to MAAS, but it doesn't deregister itself from the nova service list

so you have to manually issue
nova service-list to get the id
nova service-delete <id>
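
The manual lookup above can be scripted. The following is only a sketch: it assumes the service list has been exported as JSON (for example with `openstack compute service list -f json`) and that the records carry "ID", "Binary" and "Host" keys. The helper name `stale_compute_service_ids`, the host names and the sample records are hypothetical. The delete commands are printed rather than executed.

```python
import json


def stale_compute_service_ids(services_json, host):
    """Return the IDs of nova-compute service records registered for `host`."""
    services = json.loads(services_json)
    return [
        str(svc["ID"])
        for svc in services
        if svc.get("Binary") == "nova-compute" and svc.get("Host") == host
    ]


# Made-up sample shaped like the JSON service list (assumed key names).
sample = json.dumps([
    {"ID": 7, "Binary": "nova-compute", "Host": "node-3", "State": "down"},
    {"ID": 8, "Binary": "nova-compute", "Host": "node-4", "State": "up"},
    {"ID": 2, "Binary": "nova-scheduler", "Host": "node-3", "State": "up"},
])

for service_id in stale_compute_service_ids(sample, "node-3"):
    # Printed only, so the sketch never touches a real cloud.
    print("openstack compute service delete", service_id)
```

Filtering on both the binary and the host keeps other services that share the host name (such as a scheduler) out of the delete list.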

it would be good to have this in the stop/destroy event

Tags: bootstack stop
Patrizio Bassi (patrizio-bassi) wrote :

I found that even neutron-openvswitch-agent is still registered.

It should be cleaned up as well
(for now I must issue neutron agent-delete <id>)

John A Meinel (jameinel) wrote :

I'm pretty sure this should be considered a bug in the charm itself.

Changed in juju:
status: New → Invalid
James Page (james-page) wrote :

I know this to be the case; it's somewhat of a historic issue, as very early juju versions never actually executed the stop hook, so no charm ever implemented one: they were useless, with no way to test them.

That said I think we could do a better job on service destruction with regards to cleanup; on a nova-compute unit this would involve:

a) destroying any active instances (or maybe migrating them?)
b) disabling/removal of nova-compute service entries in nova (via an API call)
c) removal of nova-compute related packages

this is a general pattern that could be applied across all of the charms, but a) and c) have most value on physical machines where multiple applications may be deployed directly onto the hardware.

no longer affects: juju
Changed in charm-nova-compute:
status: New → Triaged
importance: Undecided → Low
tags: added: stop
Wouter van Bommel (woutervb) wrote :

When removing a unit, care should be taken with aggregate de-registration and the neutron resources that are related to the host instance.

tags: added: bootstack
Changed in charm-nova-compute:
assignee: nobody → Martin Kalcok (martin-kalcok)
status: Triaged → In Progress
Martin Kalcok (martin-kalcok) wrote :

After the talk with the OpenStack team, we determined that, because hooks run asynchronously and in no guaranteed order after the `juju remove-unit` command is executed, a new action has to be implemented that prepares a `nova-compute` unit for departure.

In my proposal I added 4 new actions:
* disable - Disables `nova-cloud-controller` from running new VMs on the unit
* enable - Enables `nova-cloud-controller` to run new VMs on the unit
* remove-from-cloud - Stops `nova compute`, `neutron agent` and unregisters them from the nova and neutron services
* register-to-cloud - Reverts effects of the `remove-from-cloud` action

There's a new section added to README.md that explains how to perform unit removal (https://review.opendev.org/c/openstack/charm-nova-compute/+/763795/1/README.md). The basic workflow goes like this:

* Perform `disable` action on the `nova-compute` unit
* Ensure that there are no VMs running on the unit and either destroy or migrate them away.
* Perform `remove-from-cloud` action on the `nova-compute` unit
* Run the `juju remove-unit` command
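
The sequence above could be scripted roughly as follows. This is a sketch, not the charm's documented procedure: it assumes the Juju 2.x `juju run-action ... --wait` syntax (newer Juju releases use `juju run` instead), and the function name `removal_commands` and the unit name are hypothetical. It only prints the commands, because the VM check in the second step still has to be done by the operator.

```python
def removal_commands(unit):
    """Build the ordered CLI calls for retiring a nova-compute unit."""
    return [
        ["juju", "run-action", unit, "disable", "--wait"],
        # Manual step here: migrate or destroy any VMs still on the host.
        ["juju", "run-action", unit, "remove-from-cloud", "--wait"],
        ["juju", "remove-unit", unit],
    ]


for cmd in removal_commands("nova-compute/0"):
    # Printed only; run each command by hand once the previous one succeeded.
    print(" ".join(cmd))
```

Keeping `juju remove-unit` last matters: once the unit is gone, the `remove-from-cloud` action can no longer be run on it.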

Martin Kalcok (martin-kalcok) wrote :
Changed in charm-nova-compute:
status: In Progress → Fix Committed
Changed in charm-nova-compute:
milestone: none → 21.04
status: Fix Committed → Fix Released