juju destroy-model --force does not work with stale volume instances in the db

Bug #1854893 reported by Jose Guedez
This bug affects 1 person
Affects          Status    Importance  Assigned to  Milestone
Canonical Juju   Triaged   Low         Unassigned

Bug Description

Context
=======

version: 2.6.10
cloud: openstack train

I had already destroyed the applications and machines, but missed the volumes. I then attempted to destroy the model with --destroy-storage, but because of a credentials issue the volume removals failed with 401 authentication errors. I fixed the credentials, but the errors remained in the db and the model did not recover. I then deleted the volumes manually with the openstack CLI, which succeeded, but the juju model remained stuck and could not be deleted even with --force.

With help from babbageclunk, I had to issue a series of mongo queries to delete the volume entries. After that I tried destroy-model --force again, but it stayed stuck (no error this time, just stuck). Finally I had to reboot the controller (all other models were working fine); after the reboot the model was gone without further user intervention.

Ideally, destroy-model --force should remove the model from the juju systems (especially the db) without error-prone manual intervention, db surgery, or a controller reboot. No cloud-specific cleanup is expected in this case (OpenStack); the user has already given up on that by invoking --force.

Command output:

ubuntu@mybox$ juju destroy-model --force mymodel
WARNING! This command will destroy the "mymodel" model.
This includes all machines, applications, data and other resources.

Continue [y/N]? y
Destroying model
Waiting for model to be removed, 6 error(s), 6 volume(s)........................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
....................................................................
The following errors were encountered during destroying the model.
You can fix the problem causing the errors and run destroy-model again.

Resource  Id  Message
Volume    0   destroying volume: cannot release volume "a038e68b-deb4-49db-9ada-e47a7ffac1e6": getting volume: Unauthorised URL http://10.245.161.160:8776/v2/58976c7d41d34bce96e0ed07fb7511fe/volumes/a038e68b-deb4-49db-9ada-e47a7ffac1e6
              caused by: request (http://10.245.161.160:8776/v2/58976c7d41d34bce96e0ed07fb7511fe/volumes/a038e68b-deb4-49db-9ada-e47a7ffac1e6) returned unexpected status: 401; error info: Failed: 401 error: The request you have made requires authentication.
          1   destroying volume: cannot release volume "68ed84a3-2186-4c6c-bbb6-201a3a36ab2b": getting volume: Unauthorised URL http://10.245.161.160:8776/v2/58976c7d41d34bce96e0ed07fb7511fe/volumes/68ed84a3-2186-4c6c-bbb6-201a3a36ab2b
              caused by: request (http://10.245.161.160:8776/v2/58976c7d41d34bce96e0ed07fb7511fe/volumes/68ed84a3-2186-4c6c-bbb6-201a3a36ab2b) returned unexpected status: 401; error info: Failed: 401 error: The request you have made requires authentication.
          2   destroying volume: cannot release volume "0bff96c8-0217-4528-be39-f4a275814d8a": getting volume: Unauthorised URL http://10.245.161.160:8776/v2/58976c7d41d34bce96e0ed07fb7511fe/volumes/0bff96c8-0217-4528-be39-f4a275814d8a
              caused by: request (http://10.245.161.160:8776/v2/58976c7d41d34bce96e0ed07fb7511fe/volumes/0bff96c8-0217-4528-be39-f4a275814d8a) returned unexpected status: 401; error info: Failed: 401 error: The request you have made requires authentication.
          3   destroying volume: cannot release volume "96dc8380-29c3-432d-92ec-d59f38e91bb8": getting volume: Unauthorised URL http://10.245.161.160:8776/v2/58976c7d41d34bce96e0ed07fb7511fe/volumes/96dc8380-29c3-432d-92ec-d59f38e91bb8
              caused by: request (http://10.245.161.160:8776/v2/58976c7d41d34bce96e0ed07fb7511fe/volumes/96dc8380-29c3-432d-92ec-d59f38e91bb8) returned unexpected status: 401; error info: Failed: 401 error: The request you have made requires authentication.
          4   destroying volume: cannot release volume "b38757ab-8b45-41ad-b6da-bf6a1612a2dc": getting volume: Unauthorised URL http://10.245.161.160:8776/v2/58976c7d41d34bce96e0ed07fb7511fe/volumes/b38757ab-8b45-41ad-b6da-bf6a1612a2dc
              caused by: request (http://10.245.161.160:8776/v2/58976c7d41d34bce96e0ed07fb7511fe/volumes/b38757ab-8b45-41ad-b6da-bf6a1612a2dc) returned unexpected status: 401; error info: Failed: 401 error: The request you have made requires authentication.
          5   destroying volume: cannot release volume "ac575230-3806-4f49-aa97-c281670d4595": getting volume: Unauthorised URL http://10.245.161.160:8776/v2/58976c7d41d34bce96e0ed07fb7511fe/volumes/ac575230-3806-4f49-aa97-c281670d4595
              caused by: request (http://10.245.161.160:8776/v2/58976c7d41d34bce96e0ed07fb7511fe/volumes/ac575230-3806-4f49-aa97-c281670d4595) returned unexpected status: 401; error info: Failed: 401 error: The request you have made requires authentication.
ERROR timeout after 30m0s timeout

Changed in juju:
status: New → Triaged
importance: Undecided → High
tags: added: destroy-model storage
Ian Booth (wallyworld) wrote :

This should be addressed in Juju 2.8.0 where --force was enhanced to also deal with state storage.
I'll mark as Incomplete and we can close unless you can reproduce with 2.8.

Changed in juju:
status: Triaged → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for juju because there has been no activity for 60 days.]

Changed in juju:
status: Incomplete → Expired
Hua Zhang (zhhuabj) wrote :

I hit the same error in juju 2.8. I have actually deleted 1 machine and 1 volume, but 'juju destroy-model' with the --force option still reports them as present:

$ juju destroy-model ssl-queens --force -y
Destroying model
Waiting for model to be removed, 1 machine(s), 1 volume(s)........^Cctrl+c detected, aborting...

$ juju destroy-model ssl-queens --force --no-wait --destroy-storage -y
Destroying model
Waiting for model to be removed, 1 machine(s), 1 volume(s)......^Cctrl+c detected, aborting...

$ juju version
2.8.6-focal-amd64

$ juju show-model ssl-queens |grep life -A 12
  life: dying
  status:
    current: destroying
    message: 'attempt 23 to destroy model failed (will retry): model not empty, found
      1 machine, 1 volume (model not empty)'
    since: "2020-11-26"
  users:
    admin:
      display-name: admin
      access: admin
      last-connection: 2 minutes ago
  sla: unsupported
  agent-version: 2.8.5

Here are the logs from the controllers:

2020-11-27 08:13:28 WARNING juju.state cleanup.go:212 cleanup failed in model e9932e84-9288-423f-8223-54582e823299 for machine("8"): machine 8 has attachments [volume-0]
2020-11-27 08:13:28 WARNING juju.state cleanup.go:212 cleanup failed in model e9932e84-9288-423f-8223-54582e823299 for machine("8"): machine 8 has attachments [volume-0]
2020-11-27 08:13:28 WARNING juju.state cleanup.go:212 cleanup failed in model e9932e84-9288-423f-8223-54582e823299 for forceStorage("e9932e84-9288-423f-8223-54582e823299"): removing volume 0: volume is not dead
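When a controller log contains many repeats of these warnings, it can help to collapse them into one entry per blocked cleanup. The following is only a rough sketch: the regular expression is inferred from the three log lines quoted above, and the function name `summarise_cleanup_failures` is made up for illustration. Other Juju versions may format these warnings differently.

```python
import re
from collections import defaultdict

# Pattern inferred from the log lines quoted in this report (assumption:
# the 'cleanup failed in model <uuid> for <kind>("<arg>"): <reason>' shape).
CLEANUP_RE = re.compile(
    r'cleanup failed in model (?P<model>[0-9a-f-]{36}) '
    r'for (?P<kind>\w+)\("(?P<arg>[^"]*)"\): (?P<reason>.*)$'
)


def summarise_cleanup_failures(lines):
    """Group the distinct failure reasons by (model, cleanup kind, argument)."""
    summary = defaultdict(set)
    for line in lines:
        match = CLEANUP_RE.search(line)
        if match:
            key = (match["model"], match["kind"], match["arg"])
            summary[key].add(match["reason"].strip())
    return {key: sorted(reasons) for key, reasons in summary.items()}


MODEL = "e9932e84-9288-423f-8223-54582e823299"

# The three warnings from this report, copied verbatim.
SAMPLE = [
    '2020-11-27 08:13:28 WARNING juju.state cleanup.go:212 cleanup failed in model '
    + MODEL + ' for machine("8"): machine 8 has attachments [volume-0]',
    '2020-11-27 08:13:28 WARNING juju.state cleanup.go:212 cleanup failed in model '
    + MODEL + ' for machine("8"): machine 8 has attachments [volume-0]',
    '2020-11-27 08:13:28 WARNING juju.state cleanup.go:212 cleanup failed in model '
    + MODEL + ' for forceStorage("' + MODEL + '"): removing volume 0: volume is not dead',
]

if __name__ == "__main__":
    for (model, kind, arg), reasons in summarise_cleanup_failures(SAMPLE).items():
        for reason in reasons:
            print(f"{kind}({arg}): {reason}")
```

For the lines above, this leaves two blocked cleanups: the machine cleanup (volume-0 is still attached) and the forceStorage cleanup (volume 0 is not dead).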

Here are the stale records from MongoDB, even though I have deleted the instances and volumes:

juju:PRIMARY> db.machines.find({"model-uuid":"e9932e84-9288-423f-8223-54582e823299"},{"volumes": 0})
{ "_id" : "e9932e84-9288-423f-8223-54582e823299:8", "machineid" : "8", "model-uuid" : "e9932e84-9288-423f-8223-54582e823299", "nonce" : "machine-0:19a52aeb-733e-4380-8fa8-4afabfef0903", "series" : "xenial", "containertype" : "", "principals" : [ ], "life" : 1, "jobs" : [ 1 ], "passwordhash" : "YMpZm4yG+8+IrIOU1PE6hTrI", "clean" : false, "force-destroyed" : true, "addresses" : [ { "value" : "10.5.2.155", "addresstype" : "ipv4", "networkscope" : "local-cloud", "origin" : "provider", "spaceid" : "0" } ], "machineaddresses" : [ { "value" : "10.5.2.155", "addresstype" : "ipv4", "networkscope" : "local-cloud", "origin" : "machine" }, { "value" : "252.2.155.1", "addresstype" : "ipv4", "networkscope" : "local-fan", "origin" : "machine" }, { "value" : "127.0.0.1", "addresstype" : "ipv4", "networkscope" : "local-machine", "origin" : "machine" }, { "value" : "::1", "addresstype" : "ipv6", "networkscope" : "local-machine", "origin" : "machine" } ], "preferredpublicaddress" : { "value" : "10.5.2.155", "addresstype" : "ipv4", "networkscope" : "local-cloud", "origin" : "provider", "spaceid" : "0" }, "preferredprivateaddress" : { "value" : "10.5.2.155", "addresstype" : "ipv4", "networkscope" : "local-cloud", "origin" : "provider", "spaceid" : "0" }, "supportedcontainersknown" : true, "txn-revno" : NumberLong(20), "txn-queue" : [ "5fc0a458b3abc8364068ed7c_6b015ea3", "5fc0a731b3abc8364069d2cc_b550ab65", "5fc0a7fab3abc836406a1689_d3b48390", "5fc0b3b6b3abc836406d8755_fdf3c5db", "5fc0b3e0b3abc...


Changed in juju:
status: Expired → Confirmed
Pen Gale (pengale)
Changed in juju:
milestone: none → 2.8-next
Ian Booth (wallyworld)
Changed in juju:
milestone: 2.8-next → 2.8.10
Changed in juju:
milestone: 2.8.10 → 2.8.11
John A Meinel (jameinel) wrote :

Just removing the objects is not sufficient, because there are 'reference counts' from those objects to other objects, which indicate that more cleanup work still needs to be done.

Also, even with 'juju destroy-model --force', it still takes some time to actually clean up the records.
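To illustrate the reference-count point, here is a toy sketch in Python. It is not Juju's actual schema or cleanup code: the collection names, the `volume-0#attachments` key, and all function names are hypothetical. It only shows why deleting a volume document by hand can leave other records that still make the model look non-empty.

```python
def make_state():
    """Build a toy in-memory 'database' (all names are illustrative only)."""
    return {
        "volumes": {"0": {"life": "alive"}},
        # One attachment record linking machine 8 to volume 0.
        "volume_attachments": {("8", "0"): {"life": "alive"}},
        # A counter that a cleanup pass would consult before removing things.
        "refcounts": {"volume-0#attachments": 1},
    }


def blockers(state):
    """Return the storage-related reasons the model still looks non-empty."""
    reasons = []
    for machine, volume in state["volume_attachments"]:
        reasons.append(f"machine {machine} has attachments [volume-{volume}]")
    for key, count in state["refcounts"].items():
        if count > 0:
            reasons.append(f"refcount {key} is {count}")
    for volume in state["volumes"]:
        reasons.append(f"volume {volume} still recorded")
    return reasons


def naive_delete_volume(state, volume):
    """Manual 'db surgery': drop only the volume document."""
    state["volumes"].pop(volume, None)


def full_delete_volume(state, volume):
    """Drop the volume together with its attachments and reference counts."""
    for key in [k for k in state["volume_attachments"] if k[1] == volume]:
        del state["volume_attachments"][key]
        state["refcounts"][f"volume-{volume}#attachments"] -= 1
    state["refcounts"] = {k: v for k, v in state["refcounts"].items() if v > 0}
    state["volumes"].pop(volume, None)


if __name__ == "__main__":
    state = make_state()
    naive_delete_volume(state, "0")
    print("after naive delete:", blockers(state))

    state = make_state()
    full_delete_volume(state, "0")
    print("after full delete:", blockers(state))
```

In this toy model the naive delete leaves the attachment and its counter behind, while the full delete leaves nothing blocking removal.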

Changed in juju:
importance: High → Medium
milestone: 2.8.11 → none
status: Confirmed → Triaged
Canonical Juju QA Bot (juju-qa-bot) wrote :

This Medium-priority bug has not been updated in 60 days, so we're marking it Low importance. If you believe this is incorrect, please update the importance.

Changed in juju:
importance: Medium → Low
tags: added: expirebugs-bot