juju destroy-model --force times out when trying to remove k8s model

Bug #1910810 reported by David Coronel
This bug affects 1 person
Affects         Status        Importance  Assigned to  Milestone
Canonical Juju  Fix Released  High        Ian Booth

Bug Description

I'm trying to juju destroy-model --force --no-wait (and tried with --destroy-storage and --release-storage too) of a Kubernetes model but the model is stuck in the 'destroying' status. The k8s model is completely empty. I'm not sure what Juju is waiting for.

$ juju status -m k8s-model
Model Controller Cloud/Region Version SLA Timestamp Notes
k8s-model foundations-maas k8s-cloud/default 2.8.7 unsupported 19:15:56Z tearing down cloud environment

Model "k8s-model" is empty.

$ juju models
Controller: foundations-maas

Model Cloud/Region Type Status Machines Cores Units Access Last connection
[...]
k8s-model k8s-cloud/default kubernetes destroying 0 - - admin 1 hour ago
[...]

I'm following https://ubuntu.com/kubernetes/docs/cdk-addons#coredns. I need to add a k8s model and deploy the coredns k8s charm in it.

When I first tried to add a k8s model to my existing controller, the juju controllers couldn't talk to the k8s api on the kubeapi-load-balancer units and failed to create the model. This was caused by asymmetric routing issues. As a workaround, I added static routes on the kubeapi-load-balancer units. This allowed the juju controllers to talk to the k8s api and I was able to add the new k8s model. However, I noticed that juju status showed that juju had lost connectivity to the agents on those units because of those static routes. I now plan to revert the static routes and use the advanced-routing charm to do policy routing instead. So I tried to destroy the k8s model and that's how I ended up in this situation.

After it was stuck, I tried to delete the static routes to see if it would fix it but it didn't change anything.

This is with Juju version 2.8.7 on Ubuntu 18.04.

Tags: cpe-onsite
Revision history for this message
David Coronel (davecore) wrote :

subscribed ~field-medium

David Coronel (davecore) wrote :

Unsubscribing field-medium due to confusion around sla

tags: added: cpe-onsite
Ian Booth (wallyworld) wrote :

When juju destroys a k8s model, it makes an api call to k8s to delete the namespace hosting the model resources. Juju will then wait until the namespace is removed. I've seen k8s hold the namespace in "Terminating" state for many minutes. That's why your destroy model operation is not finishing. As to why k8s takes so long to remove a namespace, I'm not sure.
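One way to confirm that the namespace is what the destroy operation is waiting on is to ask Kubernetes directly. The following is a minimal sketch, not part of Juju: the function names are hypothetical, and it assumes the namespace has the same name as the model (k8s-model here) and that kubectl is pointed at the cluster backing the k8s cloud.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not Juju or kubectl built-ins.

# Print the phase of a namespace: "Active" or "Terminating".
namespace_phase() {
  kubectl get namespace "$1" -o jsonpath='{.status.phase}'
}

# List every namespaced resource still present in the namespace.
# Anything printed here may be what is blocking the deletion.
remaining_resources() {
  kubectl api-resources --verbs=list --namespaced -o name \
    | xargs -n 1 kubectl get --show-kind --ignore-not-found -n "$1"
}

# Example usage (assumes the namespace is named after the model):
#   namespace_phase k8s-model
#   remaining_resources k8s-model
```

If the phase stays at "Terminating" while remaining_resources prints nothing, the delay is more likely on the finalizer side than in leftover workloads.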

David Coronel (davecore) wrote :

I have seen namespaces in Kubernetes get stuck from time to time too and never really understood why. My workaround is usually to follow the steps from https://craignewtondev.medium.com/how-to-fix-kubernetes-namespace-deleting-stuck-in-terminating-state-5ed75792647e

1) kubectl get namespace mynamespace -o json > mynamespace.json

2) Remove "kubernetes" from the finalizers array

3) kubectl replace --raw "/api/v1/namespaces/mynamespace/finalize" -f ./mynamespace.json
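The three steps above can be sketched as a single pipeline. This is an illustration under assumptions, not a recommended procedure: the function names are hypothetical, python3 is assumed to be available for the JSON edit, and removing the finalizer bypasses the cleanup Kubernetes is waiting for, so resources may be left behind.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.

# Read namespace JSON on stdin and print it with "kubernetes"
# removed from spec.finalizers (step 2 above).
strip_kubernetes_finalizer() {
  python3 -c '
import json, sys
ns = json.load(sys.stdin)
spec = ns.setdefault("spec", {})
spec["finalizers"] = [f for f in (spec.get("finalizers") or []) if f != "kubernetes"]
json.dump(ns, sys.stdout)
'
}

# Steps 1 to 3 chained together for one namespace.
force_finalize_namespace() {
  local ns="$1"
  kubectl get namespace "$ns" -o json \
    | strip_kubernetes_finalizer \
    | kubectl replace --raw "/api/v1/namespaces/${ns}/finalize" -f -
}

# Example usage:
#   force_finalize_namespace mynamespace
```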

Could juju skip the phase to wait for the namespace deletion if --force is specified to destroy-model?

Ian Booth (wallyworld) wrote :

I think it's worth having --force terminate after a set time even if the model resources are not fully cleaned up (with a suitable message to the user).

Changed in juju:
milestone: none → 2.9.1
importance: Undecided → High
status: New → Triaged
Ian Booth (wallyworld) wrote :

This PR will allow the destroy-model timeout to abort the deletion of cloud resources (in this case the k8s namespace). If the supplied timeout is 0, the juju model is removed even if the cloud resources are not fully cleaned up.

https://github.com/juju/juju/pull/12955
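
As a rough illustration of how the timeout might be supplied once the fix is available: the comment above does not name the option, so the --timeout flag and the 0s value below are assumptions based on the Juju 2.9 destroy-model command, and the wrapper function is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: the wrapper is hypothetical, and --timeout is assumed
# to be the destroy-model option that carries the timeout.

# Destroy a model, giving up on cloud resource cleanup after the
# given timeout (default 0s: remove the model without waiting).
destroy_model_with_timeout() {
  local model="$1" timeout="${2:-0s}"
  juju destroy-model "$model" --force --no-wait --timeout "$timeout" -y
}

# Example usage:
#   destroy_model_with_timeout k8s-model
#   destroy_model_with_timeout k8s-model 5m
```

With a timeout of 0 the namespace may still exist in Kubernetes afterwards and may need to be cleaned up by hand.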

Changed in juju:
assignee: nobody → Ian Booth (wallyworld)
status: Triaged → In Progress
Ian Booth (wallyworld)
Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
milestone: 2.9.1 → 2.9.2
Changed in juju:
status: Fix Committed → Fix Released