[2.6-rc1] model deletion got stuck

Bug #1827461 reported by Dmitrii Shcherbakov
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Canonical Juju
Expired
High
Unassigned

Bug Description

Model deletion got stuck with the following message printed constantly:

Waiting on model to be removed, 8 application(s)...

juju status -m controller
Model Controller Cloud/Region Version SLA Timestamp
controller samaas samaas 2.6-rc1 unsupported 03:47:17Z

Machine State DNS Inst id Series AZ Message
0 started 10.232.1.60 8tk7tc bionic default Deployed

IIRC the model was upgraded from 2.5.4 to 2.6-rc1 via:

juju upgrade-juju -m controller --agent-version=2.6-rc1

I can see a lot of error messages in the log:

https://private-fileshare.canonical.com/~dima/juju-dumps/03-05-2019-machine-0.log

2019-05-03 03:39:31 ERROR juju.apiserver.common resource.go:118 error stopping *apiserver.pingTimeout resource: ping timeout
2019-05-03 03:39:31 TRACE juju.state.pool.txnwatcher txnwatcher.go:322 txn watcher 0xc420231040 starting sync
2019-05-03 03:39:36 TRACE juju.state.pool.txnwatcher txnwatcher.go:322 txn watcher 0xc420231040 starting sync
2019-05-03 03:39:38 TRACE juju.state.watcher hubwatcher.go:457 0xc4209163c0 got request: watcher.reqUnwatch{key:watcher.watchKey{c:"controllers", id:"apiHostPortsForAgents"}, ch:(chan<- watcher.Change)(0xc428230600)}
2019-05-03 03:39:38 TRACE juju.state.watcher hubwatcher.go:457 0xc4209163c0 got request: watcher.reqUnwatch{key:watcher.watchKey{c:"settings", id:"55f0db45-5427-47f6-82a2-c070130f969b:e"}, ch:(chan<- watcher.Change)(0xc428230660)}
2019-05-03 03:39:38 TRACE juju.state.watcher hubwatcher.go:457 0xc4209163c0 got request: watcher.reqUnwatch{key:watcher.watchKey{c:"settings", id:"55f0db45-5427-47f6-82a2-c070130f969b:e"}, ch:(chan<- watcher.Change)(0xc4269cfbc0)}
2019-05-03 03:39:38 TRACE juju.state.watcher hubwatcher.go:457 0xc4209163c0 got request: watcher.reqUnwatch{key:watcher.watchKey{c:"controllers", id:"apiHostPortsForAgents"}, ch:(chan<- watcher.Change)(0xc4268cc600)}
2019-05-03 03:39:38 TRACE juju.state.watcher hubwatcher.go:457 0xc4209163c0 got request: watcher.reqUnwatch{key:watcher.watchKey{c:"migrations.status", id:interface {}(nil)}, ch:(chan<- watcher.Change)(0xc4269cf380)}
2019-05-03 03:39:38 TRACE juju.state.watcher hubwatcher.go:457 0xc4209163c0 got request: watcher.reqUnwatch{key:watcher.watchKey{c:"migrations.status", id:interface {}(nil)}, ch:(chan<- watcher.Change)(0xc42addfb60)}
2019-05-03 03:39:38 ERROR juju.apiserver.common resource.go:118 error stopping *apiserver.pingTimeout resource: ping timeout

juju_pubsub_report
Querying @jujud-machine-0 introspection socket: /pubsub
PubSub Report:

Source: machine-0

juju_presence_report
Querying @jujud-machine-0 introspection socket: /presence/
[20042bd3-d8f2-4fde-8507-a37d701c9597]

AGENT SERVER CONN ID STATUS
machine-0 machine-0 1 alive
machine-0 machine-0 3 alive

[55f0db45-5427-47f6-82a2-c070130f969b]

AGENT SERVER CONN ID STATUS
machine-0 (controller) machine-0 4 alive

juju_engine_report https://paste.ubuntu.com/p/3kHYnMTZ2N/

juju_goroutines https://paste.ubuntu.com/p/YKFw55xBRM/

juju_statepool_report https://paste.ubuntu.com/p/p7vVRPW969/

juju_statetracker_report https://paste.ubuntu.com/p/TCmvQbhV4J/

Tags: cpe-onsite
Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Setting to incomplete per Tim's comment in IRC. I will need to reproduce the issue to provide a juju-db dump.

Changed in juju:
status: New → Opinion
status: Opinion → Incomplete
Revision history for this message
Anastasia (anastasia-macmood) wrote :

@Dmitrii Shcherbakov (dmitriis),

When you mange to reproduce and once you get a dump, could you please also try a workaround?

We have found that running 'remove-application --force' or 'remove-machine --force' in a separate console on a lingering entities gets model-destruction unstuck..

For historical references, this is what Tim was requesting on IRC:
[14:58:19] <thumper> the details that you have appended aren't enough to help diagnose fully
[14:58:41] <thumper> clearly there is another model there that has applications, can you get a status of that
[14:58:53] <thumper> also for the model that isn't dying, a dump-db would be useful
[14:59:06] <thumper> you need the 'developer-mode' feature flag set to capture that

Revision history for this message
Anastasia (anastasia-macmood) wrote :

Also this could be related to bug # 1827285

Ian Booth (wallyworld)
Changed in juju:
milestone: none → 2.6.1
importance: Undecided → High
Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Hmm, tried the following:

1) bootstrap a clean 2.5.4 controller;
2) deploy an openstack model;
3) upgrade to 2.6-rc1;
3) scale the model;
4) destroy the model (successful).

I still see a lot of similar errors in the controller log after destroy-model which eventually stop:

2019-05-03 10:43:02 TRACE juju.state.txn database.go:359 ran transaction in 0.022s (retries: 0) []txn.Op{

# ...

                    {
                        Name: "model-uuid",
                        Value: "2108fac7-ea12-402c-84e3-cd7e1e15a762",
                    },
                    {
                        Name: "status",
                        Value: "destroying",
                    },

# ...

2019-05-03 10:44:59 ERROR juju.apiserver.common resource.go:118 error stopping *apiserver.pingTimeout resource: ping timeout
2019-05-03 10:44:59 TRACE juju.state.watcher hubwatcher.go:457 0xc000e1a0f0 got request: watcher.reqUnwatch{key:watcher.watchKey{c:"migrations.status", id:interface {
}(nil)}, ch:(chan<- watcher.Change)(0xc00ddccd20)}
2019-05-03 10:44:59 TRACE juju.state.watcher hubwatcher.go:373 loop finished
2019-05-03 10:44:59 DEBUG juju.state open.go:219 closed state without error

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Attached a mongodump for the comment above (no repro unfortunately).

https://private-fileshare.canonical.com/~dima/juju-dumps/03-05-2019-mongodump-1556882140.tar.gz

Changed in juju:
milestone: 2.6.1 → 2.6.2
Changed in juju:
milestone: 2.6.2 → 2.6.3
Changed in juju:
milestone: 2.6.3 → 2.6.4
Changed in juju:
milestone: 2.6.4 → 2.6.5
Changed in juju:
milestone: 2.6.5 → 2.6.6
Changed in juju:
milestone: 2.6.6 → 2.6.7
Changed in juju:
milestone: 2.6.7 → none
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for juju because there has been no activity for 60 days.]

Changed in juju:
status: Incomplete → Expired
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.