Canonical Juju

migration: REAP failure handling issues

Bug #1667162 reported by Menno Finlay-Smits on 2017-02-23

This bug affects 1 person

Affects         Status        Importance  Assigned to         Milestone
Canonical Juju  Fix Released  High        Christian Muirhead  Canonical Juju 2.3-beta2

Bug Description

When the API call to Reap fails, both REAPFAILED and an error are returned. Because an error is returned, the worker restarts instead of ending up in REAPFAILED.

Also, consider the possibility that the migrationmaster is killed because of the model doc removal before it gets to set the phase to REAPFAILED or DONE (insert a sleep to check).
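
A minimal Go sketch of the first issue, for illustration only. It is not the Juju source: `runPhase`, `doREAPBuggy`, `doREAPFixed` and `errReapAPI` are hypothetical names, and the loop step assumes that a phase handler returning an error makes the worker exit (and restart) without recording the next phase.

```go
package main

import "errors"

// Phase is a simplified stand-in for the migration phase type.
type Phase string

const (
	REAP       Phase = "REAP"
	REAPFAILED Phase = "REAPFAILED"
	DONE       Phase = "DONE"
)

// errReapAPI is a hypothetical error standing in for a failed Reap API call.
var errReapAPI = errors.New("reap API call failed")

// doREAPBuggy models the reported behaviour: on failure it returns
// the REAPFAILED phase together with a non-nil error.
func doREAPBuggy(reap func() error) (Phase, error) {
	if err := reap(); err != nil {
		return REAPFAILED, err
	}
	return DONE, nil
}

// doREAPFixed returns REAPFAILED with a nil error, so the caller
// treats the failure as a normal phase transition.
func doREAPFixed(reap func() error) (Phase, error) {
	if err := reap(); err != nil {
		return REAPFAILED, nil
	}
	return DONE, nil
}

// runPhase is an assumed, simplified loop step: if the handler returns
// an error, the current phase is kept and the error propagates, which
// in this sketch stands for the worker restarting.
func runPhase(current Phase, handler func() (Phase, error)) (Phase, error) {
	next, err := handler()
	if err != nil {
		return current, err
	}
	return next, nil
}
```

With a failing Reap call, `runPhase` combined with `doREAPBuggy` stays in REAP and returns the error, while `doREAPFixed` moves the phase to REAPFAILED with no error.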

Anastasia (anastasia-macmood) on 2017-02-23

Changed in juju:
importance:	Undecided → High

Christian Muirhead (2-xtian) wrote on 2017-07-17:

#1

I don't understand this bug - the code for `doREAP` returns `REAPFAILED, nil` on error (and always has as far as I can tell).

I haven't tried provoking the worker being killed by the model removal yet.

Christian Muirhead (2-xtian) wrote on 2017-07-17:

#2

D'oh sorry - it didn't always, but it has since December 15th 2016, before this bug was created.

Menno Finlay-Smits (menno.smits) wrote on 2017-07-17:

#3

Sorry for the confusion. I suspect this ticket was created from a list I was maintaining and I probably forgot that I'd already dealt with the first part.

The second part is still a possible issue though. doREAP ends up removing the model doc, which means there's a race where `w.killed()` might return true before the migration phase gets set to DONE or REAPFAILED. I suspect the fix is to check `w.killed()` *after* the `SetPhase` call. Thoughts?
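
A rough Go sketch of the ordering question raised here. It is not the Juju source: the `worker` struct, its `dead` channel, the `setPhase` callback, and both `finish...` methods are hypothetical, and the sketch assumes the `SetPhase` call still succeeds once the worker has been killed, which may not hold in practice.

```go
package main

import "errors"

// Phase is a simplified stand-in for the migration phase type.
type Phase string

const (
	DONE       Phase = "DONE"
	REAPFAILED Phase = "REAPFAILED"
)

// errKilled is a hypothetical error reported when the worker was killed.
var errKilled = errors.New("worker killed")

// worker is a heavily simplified, hypothetical migrationmaster worker.
type worker struct {
	dead     chan struct{}     // closed when the worker is killed
	setPhase func(Phase) error // stand-in for the facade's SetPhase call
}

// killed reports whether the worker has been asked to stop.
func (w *worker) killed() bool {
	select {
	case <-w.dead:
		return true
	default:
		return false
	}
}

// finishCheckFirst checks killed() before recording the phase, so a
// worker killed by the model doc removal never records DONE or REAPFAILED.
func (w *worker) finishCheckFirst(next Phase) error {
	if w.killed() {
		return errKilled
	}
	return w.setPhase(next)
}

// finishSetFirst records the phase first and only then checks killed(),
// which is the ordering suggested in the comment above.
func (w *worker) finishSetFirst(next Phase) error {
	if err := w.setPhase(next); err != nil {
		return err
	}
	if w.killed() {
		return errKilled
	}
	return nil
}
```

If `dead` is already closed when the worker finishes, `finishCheckFirst` returns without calling `setPhase`, whereas `finishSetFirst` records the phase before reporting that the worker was killed.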

Christian Muirhead (2-xtian) wrote on 2017-07-17:

#4

I'm trying it now but I don't think that'll work - the migrationmaster facade is a model-specific one, so calling SetPhase after the model has gone will probably fail.

Christian Muirhead (2-xtian) on 2017-07-18

Changed in juju:
status:	Triaged → In Progress
assignee:	nobody → Christian Muirhead (2-xtian)

Christian Muirhead (2-xtian) wrote on 2017-07-18:

#5

PR to change the API Reap call to update the migration phase here: https://github.com/juju/juju/pull/7647
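
A rough Go sketch of the idea of letting the Reap call itself update the migration phase. This is an assumption-laden illustration, not the code in the PR: the `backend` type, its fields, `removeModel` and `errRemove` are hypothetical, and leaving the phase untouched on failure (so the still-running worker can set REAPFAILED itself) is a guess about the design.

```go
package main

import "errors"

// Phase is a simplified stand-in for the migration phase type.
type Phase string

const (
	REAP Phase = "REAP"
	DONE Phase = "DONE"
)

// errRemove is a hypothetical error for a failed model removal.
var errRemove = errors.New("model removal failed")

// backend is a hypothetical stand-in for the server-side state
// behind the migrationmaster facade.
type backend struct {
	phase       Phase // current migration phase
	modelExists bool  // whether the model doc is still present
	failRemoval bool  // test knob: make removeModel fail
}

// removeModel stands in for removing the model doc.
func (b *backend) removeModel() error {
	if b.failRemoval {
		return errRemove
	}
	b.modelExists = false
	return nil
}

// Reap removes the model and, in the same server-side call, marks the
// migration as DONE. The worker then needs no follow-up SetPhase call
// that could race with being killed by the model removal.
// Assumption: on failure the phase is left alone, since the model still
// exists and the worker is able to record the failure itself.
func (b *backend) Reap() error {
	if err := b.removeModel(); err != nil {
		return err
	}
	b.phase = DONE
	return nil
}
```

The point of the design is that the phase update no longer depends on the worker surviving the model removal.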

Christian Muirhead (2-xtian) wrote on 2017-07-18:

#6

PR for 2.2 branch: https://github.com/juju/juju/pull/7652

Changed in juju:
status:	In Progress → Fix Committed

Tim Penhey (thumper) on 2017-11-06

Changed in juju:
milestone:	2.3.0 → 2.3-beta2
status:	Fix Committed → Fix Released
