juju destroy-model stuck with state changing too quickly

Bug #1980270 reported by Mark Beierl
This bug affects 1 person

Affects: Canonical Juju
Status: Triaged
Importance: High
Assigned to: Unassigned
Milestone: none

Bug Description

Environment:
Juju 2.9.32-ubuntu-amd64
Microk8s: v1.23.6

Two VMs, one with Juju and Microk8s, the second with only Microk8s

Deployed a custom bundle using Juju on machine 1 to a model created on Microk8s cloud for machine 2.
Deployment failed to complete with pods repeatedly stuck in CrashLoopBackOff.
juju deploy hello-kubecon also failed with CrashLoopBackOff.

At this point I decided something must be wrong, and I attempted to destroy the model:

juju destroy-model 6c804230-87a5-4143-a5c6-156a7348635e --destroy-storage

This did not complete, so I then proceeded to issue:

juju destroy-model magma-orc-kdu-354d953a-0f67-4fb7-9f8b-4218f9f0ac22 --destroy-storage --force --no-wait

This also hung at the same point. I decided to get aggressive and ran:

juju destroy-controller --destroy-all-models --model-timeout=0m15s --destroy-storage --force osm-vca

This too hung on the same model. I then looked at juju debug-log and saw what we all don't like to see:

updating units: state changing too quickly; try again soon

Turned on tracing: juju model-config logging-config="<root>=DEBUG;juju.state.txn=TRACE"

Attached you will find:
1. juju debug-log --replay
2. juju dump-db
3. Text snippet of a failed txn

Mark Beierl (mbeierl) wrote :

JUJU_DEV_FEATURE_FLAGS=developer-mode juju dump-db > juju_dump-db.txt

John A Meinel (jameinel)
Changed in juju:
importance: Undecided → High
milestone: none → 2.9-next
status: New → Triaged
John A Meinel (jameinel) wrote :

At first glance, the failing transaction has a weird repeated stanza. It first asserts the content of the `statuses` document (without doing anything to it), and then later has an operation to remove that same document.

```
    {
        C: "statuses",
        Id: "52cc2288-4f0c-4a4d-8aba-701e56f6cb0d:u#orc8r-metricsd/0",
        Assert: bson.D{
            {
                Name: "status",
                Value: "allocating",
            },
        },
        Insert: nil,
        Update: nil,
        Remove: false,
    },
...
    {
        C: "statuses",
        Id: "52cc2288-4f0c-4a4d-8aba-701e56f6cb0d:u#orc8r-metricsd/0",
        Assert: nil,
        Insert: nil,
        Update: nil,
        Remove: true,
    },
```
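For anyone wanting to scan a dumped transaction for this pattern, here is a minimal sketch of a check that flags documents which one op asserts and a later op removes. It is only an illustration: the `Op` type below is a hypothetical stand-in that mirrors the fields visible in the dump (with `Assert` simplified to a string map rather than `bson.D`), and `assertThenRemove` is a made-up helper, not anything from Juju or the txn library.

```go
package main

import "fmt"

// Op is a hypothetical stand-in for the fields printed in the dump above.
// Assert is simplified to a map; the real dump shows a bson.D.
type Op struct {
	C      string            // collection name
	Id     string            // document id
	Assert map[string]string // nil when the op carries no assertion
	Remove bool
}

// docKey identifies a document by collection and id.
type docKey struct{ C, Id string }

// assertThenRemove returns the documents that are asserted by one op
// (without being removed by it) and then removed by a later op in the
// same list of ops.
func assertThenRemove(ops []Op) []docKey {
	asserted := make(map[docKey]bool)
	var found []docKey
	for _, op := range ops {
		key := docKey{op.C, op.Id}
		switch {
		case op.Remove && asserted[key]:
			found = append(found, key)
			delete(asserted, key)
		case op.Assert != nil && !op.Remove:
			asserted[key] = true
		}
	}
	return found
}

func main() {
	id := "52cc2288-4f0c-4a4d-8aba-701e56f6cb0d:u#orc8r-metricsd/0"
	ops := []Op{
		{C: "statuses", Id: id, Assert: map[string]string{"status": "allocating"}},
		{C: "statuses", Id: id, Remove: true},
	}
	for _, k := range assertThenRemove(ops) {
		fmt.Printf("%s %s: asserted, then removed\n", k.C, k.Id)
	}
}
```

This only detects the repeated stanza; it makes no claim about whether that stanza is what causes the transaction to keep aborting.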

Joseph Phillips (manadart) wrote :

First off, thanks for the comprehensive bug report.

From the DB dump, it looks like the errant transaction is for a clean-up that, at the time of the dump, had already been actioned.

The controller has no units, and the model appears to have already been removed.

Ian Booth (wallyworld)
Changed in juju:
milestone: 2.9-next → none