model migration fails while removing from original controller

Bug #1611391 reported by John A Meinel
Affects: Canonical Juju
Status: Fix Released
Importance: High
Assigned to: Menno Finlay-Smits
Milestone: (blank)

Bug Description

I was testing out model migration status messages, but it appears a migration got stuck at:
  successful: removing model from source controller

Looking at the debug log it is full of lines like:
  ERROR juju.worker.dependency engine.go:539 "migration-master" manifold worker returned unexpected error: can't remove model: model not being exported for migration

My comments

1) I'm guessing one controller finished a bit faster than the other expected, and stopped exporting the model before the other realized it was gone. Maybe something crashed in the meantime, I'm not really sure.

2) Do we need a way to say "yes, I had that at some point in the past, but it has been removed", or can we just treat the above error as "it must have already been removed"?

3) The fact that the migration-master was actually in a critically failing state was not relayed at all to "juju status", and juju status just says forever "I'm successful, and just not done yet."

4) Because the migration is still in progress, I can't do anything like try to migrate back to the other controller. At this point, I can only really destroy the model and bring it back up. Do we need something to tell Juju "no, you really are happy with the migration now"?

5) I am *also* unable to "juju destroy-model A:foo", but it doesn't give me an error. The model is simply still there the next time I do "juju status -m A:foo". (Oddly, calling destroy-model twice in a row has the second call fail with model not found, but calling status in between makes the second one think it is working.)
I pulled out the issues with 'destroy-*' into https://bugs.launchpad.net/juju-core/+bug/1611404
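
To make the trade-off in point 2 concrete, here is a minimal sketch of the "treat the error as already removed" option. It is not Juju code: `FakeSourceController`, `ModelNotExportingError` and `remove_model_idempotent` are hypothetical names invented for illustration, and the exception merely stands in for the "model not being exported for migration" error in the log above.

```python
class ModelNotExportingError(Exception):
    """Hypothetical stand-in for "model not being exported for migration"."""


class FakeSourceController:
    """Hypothetical in-memory source controller, for illustration only."""

    def __init__(self, exporting):
        # UUIDs of models currently being exported for migration.
        self.exporting = set(exporting)

    def remove_exported_model(self, uuid):
        # Mirrors the reported behaviour: removal fails once the model
        # is no longer marked as being exported.
        if uuid not in self.exporting:
            raise ModelNotExportingError(uuid)
        self.exporting.remove(uuid)


def remove_model_idempotent(controller, uuid):
    """Return True if this call removed the model, False if it was already gone.

    Assumption: "not being exported" is read as "already removed", so a
    retry after a crash does not fail forever.
    """
    try:
        controller.remove_exported_model(uuid)
    except ModelNotExportingError:
        return False
    return True
```

The weakness is the one the question hints at: this version cannot tell "already removed" apart from "never existed here", so a record of past migrations would still be needed to distinguish the two cases.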

John A Meinel (jameinel)
description: updated
Changed in juju-core:
assignee: nobody → Menno Smits (menno.smits)
Menno Finlay-Smits (menno.smits) wrote :

I'm pretty sure this has something to do with starting a migration out of a controller while the model is still being migrated into it (or similar). This won't be possible once the API lockdown and migration prechecks are in place. I'd still like to be more certain, however.

While trying to reproduce this, I ran into a newer issue which always prevents A -> B -> A migrations (not just occasionally). I'll deal with that first and come back to this. That issue is bug 1612500.

Changed in juju-core:
milestone: none → 2.0-beta16
Menno Finlay-Smits (menno.smits) wrote :

I *think* I've been able to replicate this by starting a migration for a model as it's still being migrated into the controller. In my case, the resulting outcomes weren't quite as disastrous though - the migration attempt aborted cleanly and both controllers were able to be destroyed. This is probably down to timing though. I can imagine situations where things could end up in a bad state due to this.

It certainly shouldn't be possible to start a migration for a model that is already involved in a migration. The precheck work - which is currently in progress - will block this.

I'll leave this ticket open as a reminder until the prechecks are in place.
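
As a rough illustration of the kind of precheck described above, the sketch below refuses to start a migration for a model that is already involved in one. It is only a sketch under stated assumptions, not the actual Juju precheck: `MigrationPrecheckError`, `precheck_can_migrate` and the `active_migrations` mapping are hypothetical names.

```python
class MigrationPrecheckError(Exception):
    """Hypothetical error raised when a migration must not be started."""


def precheck_can_migrate(model_uuid, active_migrations):
    """Raise if the model is already part of a migration.

    active_migrations is an assumed mapping of model UUID to the
    direction of the in-flight migration ("importing" or "exporting").
    """
    direction = active_migrations.get(model_uuid)
    if direction is not None:
        raise MigrationPrecheckError(
            "model %s is already involved in a migration (%s)"
            % (model_uuid, direction)
        )
```

For example, `precheck_can_migrate("foo", {"foo": "importing"})` raises, while the same call with an empty mapping returns without error.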

Changed in juju-core:
status: Triaged → In Progress
affects: juju-core → juju
Changed in juju:
milestone: 2.0-beta16 → none
milestone: none → 2.0-beta16
Curtis Hovey (sinzui)
Changed in juju:
milestone: 2.0-beta16 → 2.0-beta17
Curtis Hovey (sinzui)
Changed in juju:
milestone: 2.0-beta17 → 2.0-beta18
Menno Finlay-Smits (menno.smits) wrote :

I figured out how to replicate the problem reliably even with prechecks in place, and have a solution here: https://github.com/juju/juju/pull/6159. The PR description explains the issue in more detail.

Changed in juju:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju:
status: Fix Committed → Fix Released