resource cleanups fail if resource is already gone (preventing migration)

Bug #1705352 reported by james beedy
This bug affects 1 person
Affects: Canonical Juju
Status: Fix Released
Importance: High
Assigned to: Christian Muirhead

Bug Description

`juju migrate` fails with "source prechecks failed: cleanup needed":

$ juju --version
2.2.2-xenial-amd64

$ which juju
/snap/bin/juju

$ juju migrate view-nonprod creativedrive --debug

18:00:54 INFO juju.cmd supercommand.go:63 running juju [2.2.2 gc go1.8]
18:00:54 DEBUG juju.cmd supercommand.go:64 args: []string{"/snap/juju/2142/bin/juju", "migrate", "view-nonprod", "creativedrive", "--debug"}
18:00:54 INFO juju.juju api.go:67 connecting to API addresses: [13.59.133.96:17070 172.100.0.24:17070]
18:00:54 DEBUG juju.api apiclient.go:863 successfully dialed "wss://13.59.133.96:17070/api"
18:00:54 INFO juju.api apiclient.go:617 connection established to "wss://13.59.133.96:17070/api"
18:00:54 DEBUG juju.api monitor.go:35 RPC connection died
18:00:54 INFO juju.juju api.go:67 connecting to API addresses: [34.201.66.228:17070 10.20.0.51:17070]
18:00:54 DEBUG juju.api apiclient.go:863 successfully dialed "wss://34.201.66.228:17070/api"
18:00:54 INFO juju.api apiclient.go:617 connection established to "wss://34.201.66.228:17070/api"
ERROR source prechecks failed: cleanup needed
18:00:54 DEBUG cmd supercommand.go:459 error stack:
source prechecks failed: cleanup needed
github.com/juju/juju/api/controller/controller.go:288:

Revision history for this message
Christian Muirhead (2-xtian) wrote :

Hi James, could you get us logs from the source controller? juju debug-log -m controller --replay

It sounds like there are deferred cleanup actions hanging around (these are generally created when units/machines/relations are removed) that should have been handled and deleted. So maybe the worker that handles them is hitting an error.

Revision history for this message
james beedy (jamesbeedy) wrote :

@2-xtian http://paste.ubuntu.com/25133859/

Yeah, I have some stuck machines that won't die in a model on that controller, but not the one I'm trying to migrate ...

Revision history for this message
Christian Muirhead (2-xtian) wrote :

I can see the following lines appearing periodically in the log:

machine-0: 17:30:01 ERROR juju.state cleanup failed for resourceBlob("application-view-main/resources/cloudfront-private-key-dc2d9d11-ba7d-492f-820a-67b9642acb1b"): resource at path "buckets/136132b3-e8e8-4c70-839f-79d23ab24fec/application-view-main/resources/cloudfront-private-key-dc2d9d11-ba7d-492f-820a-67b9642acb1b" not found
machine-0: 17:30:01 ERROR juju.state cleanup failed for resourceBlob("application-postgresql/resources/wal-e"): resource at path "buckets/129a3d11-4d70-4504-86b6-c8442c95ae12/application-postgresql/resources/wal-e" not found

The part after "buckets/" in the path is the model UUID - if either of those lines is for the model you're migrating then that would be the problem.
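To check which model a failing cleanup belongs to, the UUID can be pulled out of the path in the log line. The following is a minimal sketch in Go, assuming only the "buckets/<model UUID>/..." layout described above; `modelUUIDFromPath` is a hypothetical helper, not part of Juju.

```go
package main

import (
	"fmt"
	"strings"
)

// modelUUIDFromPath is a hypothetical helper (not a Juju function). It
// returns the path segment that follows "buckets/", which per the comment
// above is the model UUID. The second result is false if the path does not
// have the assumed "buckets/<uuid>/..." shape.
func modelUUIDFromPath(path string) (string, bool) {
	const prefix = "buckets/"
	if !strings.HasPrefix(path, prefix) {
		return "", false
	}
	parts := strings.SplitN(strings.TrimPrefix(path, prefix), "/", 2)
	if len(parts) < 2 || parts[0] == "" {
		return "", false
	}
	return parts[0], true
}

func main() {
	// Path copied from the second log line above.
	path := "buckets/129a3d11-4d70-4504-86b6-c8442c95ae12/application-postgresql/resources/wal-e"
	if uuid, ok := modelUUIDFromPath(path); ok {
		fmt.Println(uuid)
	}
}
```

The printed value can then be compared against the UUID of the model being migrated.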

I'm not sure why the resources have already been deleted from blob storage, but in that case those cleanups should succeed rather than fail (maybe logging the missing resource, but probably not). I'll change that.
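The idea of a cleanup that succeeds when its target is already gone can be sketched roughly as below. This is an illustration only, not the actual Juju change: `blobStore`, `memStore`, `errNotFound` and `cleanupResourceBlob` are all hypothetical names, and the in-memory store stands in for real blob storage.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel error for a missing blob.
var errNotFound = errors.New("not found")

// blobStore is an assumed minimal interface for this sketch.
type blobStore interface {
	Remove(path string) error
}

// memStore is an in-memory stand-in for blob storage.
type memStore map[string]bool

// Remove deletes the blob at path, or reports errNotFound if it is absent.
func (s memStore) Remove(path string) error {
	if !s[path] {
		return fmt.Errorf("resource at path %q: %w", path, errNotFound)
	}
	delete(s, path)
	return nil
}

// cleanupResourceBlob removes the blob but treats "already gone" as
// success, so running the same cleanup twice does not leave it failing.
func cleanupResourceBlob(store blobStore, path string) error {
	err := store.Remove(path)
	if errors.Is(err, errNotFound) {
		return nil
	}
	return err
}

func main() {
	store := memStore{"application-postgresql/resources/wal-e": true}
	// The second call finds nothing to remove and still returns nil.
	fmt.Println(cleanupResourceBlob(store, "application-postgresql/resources/wal-e"))
	fmt.Println(cleanupResourceBlob(store, "application-postgresql/resources/wal-e"))
}
```

Any error other than "not found" is still returned, so genuine storage failures are not hidden.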

Changed in juju:
status: New → In Progress
assignee: nobody → Christian Muirhead (2-xtian)
milestone: none → 2.2.3
summary: - `juju migrate` fails
+ resource cleanups fail if resource is already gone (preventing
+ migration)
Revision history for this message
Christian Muirhead (2-xtian) wrote :

PR for 2.2: https://github.com/juju/juju/pull/7664

It fixes the underlying problem of multiple cleanups being scheduled for one storage path, and makes the resourceBlob cleanup more tolerant/idempotent.
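For readers unfamiliar with the first half of that fix, one generic way to avoid scheduling duplicate cleanups for the same path is sketched below. This is not taken from the PR and may differ from what it does; `cleanup`, `cleanupQueue` and `schedule` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// cleanup identifies a pending cleanup by kind and storage path
// (hypothetical type for this sketch).
type cleanup struct {
	kind string
	path string
}

// cleanupQueue holds pending cleanups and remembers which ones are
// already scheduled.
type cleanupQueue struct {
	pending []cleanup
	seen    map[cleanup]bool
}

func newCleanupQueue() *cleanupQueue {
	return &cleanupQueue{seen: make(map[cleanup]bool)}
}

// schedule adds c unless an identical cleanup is already pending.
// It reports whether a new entry was added.
func (q *cleanupQueue) schedule(c cleanup) bool {
	if q.seen[c] {
		return false
	}
	q.seen[c] = true
	q.pending = append(q.pending, c)
	return true
}

func main() {
	q := newCleanupQueue()
	c := cleanup{kind: "resourceBlob", path: "application-postgresql/resources/wal-e"}
	q.schedule(c)
	q.schedule(c) // duplicate, ignored
	fmt.Println(len(q.pending))
}
```

With only one cleanup per path, the second run that found the blob "not found" would not be scheduled in the first place; the tolerant cleanup covers any duplicates that already exist.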

Changed in juju:
status: In Progress → Fix Released
importance: Undecided → High
status: Fix Released → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released