juju destroy-controller doesn't properly remove models

Bug #1843331 reported by Kenneth Koski
This bug affects 3 people
Affects         Status   Importance  Assigned to  Milestone
Canonical Juju  Triaged  Low         Unassigned

Bug Description

I ran `juju destroy-controller cdkkf --yes --destroy-all-models --destroy-storage`. It brought down a few applications in the model I had in the controller, then hung, repeating `Waiting on 1 model, 22 applications, 3 volumes`. I was able to Ctrl-C out of it, but retrying the operation also hung. Neither attempt produced an error message, but when I ran `juju destroy-model cdkkf:kubeflow --yes --destroy-storage --force`, I got this error message:

The following errors were encountered during destroying the model.
You can fix the problem causing the errors and run destroy-model again.

Resource  Id  Message
Volume    0   destroying volume: getting volume pvc-5a58e6b6-4223-4676-a006-2ab55f5e42c1 to delete: Get https://1.2.3.4:443/api/v1/persistentvolumes/pvc-5a58e6b6-4223-4676-a006-2ab55f5e42c1?includeUninitialized=true: dial tcp 1.2.3.4:443: i/o timeout
          1   destroying volume: getting volume pvc-4c464c80-f139-4bfb-8319-078ba46aa53f to delete: Get https://1.2.3.4:443/api/v1/persistentvolumes/pvc-4c464c80-f139-4bfb-8319-078ba46aa53f?includeUninitialized=true: dial tcp 1.2.3.4:443: i/o timeout
          2   destroying volume: getting volume pvc-dcb44512-5a97-4a16-b726-38fc7d6ab427 to delete: Get https://1.2.3.4:443/api/v1/persistentvolumes/pvc-dcb44512-5a97-4a16-b726-38fc7d6ab427?includeUninitialized=true: dial tcp 1.2.3.4:443: i/o timeout

It looks like it's attempting to talk to the Charmed Kubernetes cluster to tear down the Kubeflow model, but that entire cluster was destroyed first, and Juju doesn't know that one model is layered on top of the other.

Is there a way to inform Juju that two models are layered and should be destroyed in a certain order?

Ian Booth (wallyworld) wrote :

We don't model dependencies between models, so an ordered destroy can't occur. But it's something we need to look at doing to support scenarios such as the one in this bug. For now, the best approach is to first destroy any models based on a CDK cluster hosted by the same controller.
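As an illustration of that workaround, here is a minimal sketch of how a wrapper script could work out a safe destroy order itself. It is not part of Juju: the functions `destroy_order` and `destroy_commands` and the model name `cdkkf:k8s-cluster` are hypothetical, and the layering information has to be supplied by hand because Juju does not track it. The generated commands only reuse flags already shown in this report.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def destroy_order(hosted_on):
    """Order models so each one is destroyed before the model hosting it.

    hosted_on maps a model name to the set of models it is layered on.
    This mapping is maintained by the operator; Juju does not provide it.
    """
    sorter = TopologicalSorter()
    for model, hosts in hosted_on.items():
        sorter.add(model)
        for host in hosts:
            # Declare the layered model as a predecessor of its host,
            # so the sorter emits it (and we destroy it) first.
            sorter.add(host, model)
    return list(sorter.static_order())


def destroy_commands(order):
    """Render one destroy-model invocation per model, in the given order."""
    return [f"juju destroy-model {model} --yes --destroy-storage" for model in order]


if __name__ == "__main__":
    # "cdkkf:k8s-cluster" is a made-up name for the Charmed Kubernetes model.
    layering = {"cdkkf:kubeflow": {"cdkkf:k8s-cluster"}}
    for command in destroy_commands(destroy_order(layering)):
        print(command)
```

Running the script prints the commands rather than executing them, so the order can be reviewed before anything is torn down; `juju destroy-controller` would then be run last.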

Changed in juju:
importance: Undecided → Medium
status: New → Triaged
tags: added: destroy-model k8s
Tim McNamara (tim-clicks) wrote :

The destroy-controller command does not take a --force option to propagate through to any destroy model commands. Hook errors have blocked my models from being removed on 2.8-rc2 using a traditional cloud.

Cory Johns (johnsca) wrote :

It seems like Juju should better handle the case where the cluster is no longer reachable, for whatever reason. Even with `juju destroy-model --force --no-wait k8s-model`, it hangs pretty much indefinitely, presumably because it waits for each individual operation to time out. Maybe at the start of model destruction it could do a basic connectivity test and, especially if it has already been given `--force`, skip the individual operations.
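A rough sketch of that idea, to make the suggestion concrete. This is not how Juju is implemented; `cluster_reachable` and `plan_volume_cleanup` are hypothetical names, and the 3-second timeout is an arbitrary assumption. The point is that one short probe up front replaces a separate timeout per volume.

```python
import socket


def cluster_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def plan_volume_cleanup(volume_ids, reachable, force):
    """Split volumes into (to_delete, to_skip) based on a single reachability result.

    reachable: outcome of one up-front probe such as cluster_reachable().
    force: whether the caller asked for a forced destroy.
    """
    if reachable:
        return list(volume_ids), []
    if force:
        # The cluster is gone and the user asked for --force:
        # skip the per-volume API calls instead of waiting on each one.
        return [], list(volume_ids)
    raise ConnectionError("cluster unreachable; retry with force to skip volume cleanup")


if __name__ == "__main__":
    # 1.2.3.4:443 is the placeholder API endpoint from the error output above.
    reachable = cluster_reachable("1.2.3.4", 443)
    to_delete, to_skip = plan_volume_cleanup(["0", "1", "2"], reachable, force=True)
    print("delete:", to_delete, "skip:", to_skip)
```

Keeping the probe separate from the decision means the skip logic can be exercised without any network access.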

Canonical Juju QA Bot (juju-qa-bot) wrote :

This bug has not been updated in 2 years, so we're marking it Low importance. If you believe this is incorrect, please update the importance.

Changed in juju:
importance: Medium → Low
tags: added: expirebugs-bot
Niklas Larsson (unixinfo) wrote :

100 users complaining about a malware called juju causing infinite loop on our systems. It's hard to remove malware sometimes.
