load balancers created for Kube API via the loadbalancer endpoint are not deleted

Bug #1874864 reported by Jeff Hillman
Affects                      Status      Importance  Assigned to  Milestone
AWS Integrator Charm         Triaged     Medium      Unassigned   -
Canonical Juju               Incomplete  High        Unassigned   -
OpenStack Octavia Charm      Invalid     Undecided   Unassigned   -
Openstack Integrator Charm   Triaged     Medium      Unassigned   -

Bug Description

Using openstack-integrator in place of kubeapi-load-balancer.

The load balancer is created properly, and the appropriate members (k8s-masters) are joined to the load balancer.

When the kubernetes model is destroyed, the load balancer is still present in OpenStack.

Trying to manually delete the load balancer shows that it still has members.

Trying to delete the pool associated with the load balancer (in an effort to clean it up manually) gives the following error:

---

$ openstack loadbalancer pool delete e045da82-a95c-44f7-9efc-dd5c42aa7c55
Load Balancer f6f01eee-c6c4-4a60-984b-48d57fde2bdf is immutable and cannot be updated. (HTTP 409) (Request-ID: req-77bae993-2148-4a46-8c97-f37ded4bb0fd)

---

Jeff Hillman (jhillman)
summary: - load balancers created via the loadbalancer endpoint are not deleted
+ load balancers created for Kube API via the loadbalancer endpoint are
+ not deleted
Revision history for this message
Cory Johns (johnsca) wrote :

There is explicit cleanup logic in the charm ([1] and [2]), but it seems the stop hook may not be getting a chance to complete it. Options are to move the cleanup earlier, into the relation hooks, or to provide an action for explicit manual cleanup, like the AWS integrator charm has. If it's a hook race condition during model destroy, manual intervention may still be needed during teardown to ensure the cleanup runs before destroy-model is issued, but in that case it should be escalated to the Juju team.

Alternatively, if the cleanup is failing, we need to figure out why. As discussed on IRC, you're going to try running the specific cleanup command that the charm uses (openstack loadbalancer delete --cascade $lb_name) to see if that same "immutable" error occurs.

[1]: https://github.com/juju-solutions/charm-openstack-integrator/blob/master/reactive/openstack.py#L125-L128

[2]: https://github.com/juju-solutions/charm-openstack-integrator/blob/master/lib/charms/layer/openstack.py#L140-L154
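
For context, here is a minimal sketch of what a reactive stop-hook cleanup path like [1]/[2] might look like; the handler name, the stored list of created load balancers, and the CLI invocation are illustrative assumptions, not the charm's actual code:

---

# Illustrative sketch only -- not the charm's actual cleanup code.  Assumes
# the charm records the load balancers it created in its local unit data and
# that openstack CLI credentials are available in the hook environment.
import subprocess

from charms.reactive import hook
from charmhelpers.core import hookenv, unitdata


@hook('stop')
def cleanup_on_stop():
    # Best-effort teardown of anything this charm created; if the unit is
    # removed before this hook finishes, the resources are orphaned.
    for lb_name in unitdata.kv().get('created_lbs', []):
        hookenv.log('Deleting load balancer {}'.format(lb_name))
        subprocess.check_call(
            ['openstack', 'loadbalancer', 'delete', '--cascade', lb_name])

---

If the stop hook never gets a chance to run during destroy-model, charm-side logic of this shape cannot help on its own, which is why an explicit manual-cleanup action is the fallback option mentioned above.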

Revision history for this message
Jeff Hillman (jhillman) wrote :

Running 'openstack loadbalancer delete --cascade <$lb-id>' worked.

It removed the old loadbalancer.

Revision history for this message
Jeff Hillman (jhillman) wrote :

It should be noted that after the load balancer is deleted with --cascade, the security group for that load balancer still exists.

---

$ openstack security group list
...
| 3c677ea8-cad7-4c53-aa0c-91ae07f5e8db | openstack-integrator-e2bfb0599b95-kubernetes-master | openstack-integrator-e2bfb0599b95-kubernetes-master | 4c204b4cf3e141c68fbb60aaadb2264b | [] |
...

---

This still requires manual cleanup.
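
Until the charm handles this, the full manual cleanup is the cascade delete plus removing the leftover security group. A small sketch of that sequence follows; the IDs are the ones quoted in this report and stand in for whatever 'openstack loadbalancer list' and 'openstack security group list' show in your cloud:

---

# Manual cleanup sketch: cascade-delete the orphaned load balancer, then
# remove the leftover per-LB security group.  Both IDs are taken from the
# output quoted in this report; substitute your own.
import subprocess

LB_ID = 'f6f01eee-c6c4-4a60-984b-48d57fde2bdf'
SG_ID = '3c677ea8-cad7-4c53-aa0c-91ae07f5e8db'

subprocess.check_call(
    ['openstack', 'loadbalancer', 'delete', '--cascade', LB_ID])
subprocess.check_call(
    ['openstack', 'security', 'group', 'delete', SG_ID])

---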

Revision history for this message
Cory Johns (johnsca) wrote :

Hrm. It seems that the cleanup code originally didn't clean up the SG because it was global rather than per-LB [1]. However, that is no longer the case and a new SG is now used for each LB [2]. That does need to be fixed.

[1]: https://github.com/juju-solutions/charm-openstack-integrator/blob/master/lib/charms/layer/openstack.py#L141-L142

[2]: https://github.com/juju-solutions/charm-openstack-integrator/blob/master/lib/charms/layer/openstack.py#L419
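
To illustrate the gap, here is a rough sketch of what closing it could look like; the helper name and the way the per-LB security group is identified are assumptions for illustration, not the charm's actual code:

---

# Illustrative sketch only -- not the charm's actual code.  Assumes the
# cleanup path already knows the LB name and the name the charm gave the
# per-LB security group when it was created.
import subprocess


def cleanup_loadbalancer(lb_name, sg_name):
    # Existing behaviour: remove the LB and all of its children
    # (listeners, pools, members, health monitors).
    subprocess.check_call(
        ['openstack', 'loadbalancer', 'delete', '--cascade', lb_name])
    # Missing piece: the cascade delete does not touch the per-LB security
    # group (see the leftover group reported above), so remove it explicitly.
    subprocess.check_call(
        ['openstack', 'security', 'group', 'delete', sg_name])

---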

Revision history for this message
Cory Johns (johnsca) wrote :

The fact that the LB itself is not being cleaned up indicates that the stop hook [1] is either not running or being interrupted during model destruction. I'm attaching Juju to this bug because my expectation is that the destruction would wait until the stop hook has run.

Just to confirm, Jeff: I trust that you are using neither --force nor --no-wait when destroying the model?

[1]: https://github.com/juju-solutions/charm-openstack-integrator/blob/master/reactive/openstack.py#L125-L128

Revision history for this message
Jeff Hillman (jhillman) wrote :

No switches are used. Simply 'juju destroy-model kubernetes'

Revision history for this message
Cory Johns (johnsca) wrote :

Adding the AWS integrator as well because it has a similar cleanup stop hook [1] which is also known not to get run during model destruction (which is why we added the purge-iam-entities action to that charm).

[1]: https://github.com/juju-solutions/charm-aws-integrator/blob/master/reactive/aws.py#L154-L161
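
To illustrate that option for the openstack integrator, here is a hypothetical action script in the spirit of purge-iam-entities; the action name and the cleanup() helper are assumptions, not existing charm code, and such an action would also need a matching entry in actions.yaml:

---

#!/usr/bin/env python3
# Hypothetical actions/purge-load-balancers script, sketched in the spirit of
# the AWS integrator's purge-iam-entities action.  The cleanup() helper is
# assumed to perform the cascade delete plus security-group removal discussed
# above; it is not an existing function in the charm.
from charmhelpers.core import hookenv
from charms.layer import openstack as layer_openstack


def main():
    try:
        layer_openstack.cleanup()
    except Exception as err:
        hookenv.action_fail('cleanup failed: {}'.format(err))
    else:
        hookenv.action_set({'result': 'managed load balancers removed'})


if __name__ == '__main__':
    main()

---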

Revision history for this message
Cory Johns (johnsca) wrote :

I should note that when trying to debug this in the past, I had issues with `juju debug-log` terminating its output before the model was completely destroyed. At the time it was not entirely clear whether that was just a communication issue, but I think the implementation of debug-log has improved since then, so we might be able to get better info.

Revision history for this message
Jeff Hillman (jhillman) wrote :

openstack-integrator log grabbed by 'juju debug-log --include openstack-integrator'

It starts right before the destroy-model command was issued (with no switches).

Revision history for this message
Ian Booth (wallyworld) wrote :

We'll investigate any Juju issue post 2.8. Adding to the 2.8.1 milestone initially so the bug doesn't get lost.

Changed in juju:
milestone: none → 2.8.1
status: New → Triaged
importance: Undecided → High
Changed in charm-aws-integrator:
status: New → Triaged
Changed in charm-openstack-integrator:
status: New → Triaged
Changed in charm-aws-integrator:
importance: Undecided → Medium
Changed in charm-openstack-integrator:
importance: Undecided → Medium
Tim Penhey (thumper)
Changed in juju:
status: Triaged → Incomplete
milestone: 2.8.1 → none
Revision history for this message
Drew Freiberger (afreiberger) wrote :

With the load balancer not being deleted, this ended up triggering a bug in Octavia, which I've filed upstream in the hope of an Ussuri backport if there's a fix in master.

Tagging charm-octavia to monitor https://storyboard.openstack.org/#!/story/2009128, which details what I've found as it relates to this bug.

Revision history for this message
Alex Kavanagh (ajkavanagh) wrote :

This is not an Octavia charm issue, unless it is believed that the charm is configuring the octavia application incorrectly. If this *is* the case, then please re-open the bug on the octavia charm.

Changed in charm-octavia:
status: New → Invalid