A subordinate charm hook scheduled to run (but waiting for the principal charm hook to release the lock) goes to an error state after the principal charm hook triggers a reboot.

Bug #1464470 reported by Adrian Vladu
Affects     Status        Importance  Assigned to     Milestone
juju-core   Fix Released  High        Bogdan Teleaga
1.24        Fix Released  High        Bogdan Teleaga

Bug Description

This scenario needs at least 3 charms:
- one principal and one subordinate charm
- a third charm that provides a service to the principal charm

A relation must exist between the principal and the third charm.

This issue happens only when the principal charm executes the relation hook triggered by the third charm (let's name this hook third-relation-joined-hook) while the subordinate charm has at least one hook in the queue (let's name this hook secondary-relation-hook). secondary-relation-hook must wait for third-relation-joined-hook to release the lock before it can execute.

If third-relation-joined-hook triggers a reboot using the command "juju-reboot --now", then after the subsequent reboot, secondary-relation-hook goes into an error state.

This issue happens on Windows 2012 R2 with Juju versions 1.24 and 1.25.

Adrian Vladu (avladu)
summary: - A subordinate charm hook scheduled to run(but it is waiting for the
- principal charm hook to release the lock) goes to an error state after
- the principal charm triggers a reboot.
+ A subordinate charm hook scheduled to run(but waiting for the principal
+ charm hook to release the lock) goes to an error state after the
+ principal charm hook triggers a reboot.
Revision history for this message
Gabriel Samfira (gabriel-samfira) wrote :

I believe this happens because of the way hooks are run. A hook run has been split into three steps:

* prepare
* execute
* commit

During prepare, the state of the hook is written, but we only take the lock for hook execution in the "execute" phase. So we can have one hook that requires a reboot executing, and another one that has finished the prepare phase and is ready to execute. If we reboot the machine, then when the uniter comes back up it will see the second hook stuck in the prepare phase, have no knowledge of how it got there, and set it to an error state.

I think this can be fixed by simply taking the lock in the prepare phase and releasing it after execute, instead of locking only around execute.

Revision history for this message
Gabriel Samfira (gabriel-samfira) wrote :

Here is a quick and dirty fix for this:

https://github.com/juju/juju/compare/1.24...gabriel-samfira:executor-lock?expand=1

I am unsure how to do this cleanly, or what the deeper implications of locking in "prepare" are. Any advice from @fwreade would be welcome :).

Curtis Hovey (sinzui)
tags: added: reboot subordinate
tags: added: windows
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.25.0
Curtis Hovey (sinzui)
tags: added: regression
tags: added: hooks
Revision history for this message
William Reade (fwereade) wrote :

I think this is essentially sound, but that logic is a bit tangly. Can we do it in operation.executor perhaps? (e.g. add a NeedsGlobalMachineLock() bool method to the Operation interface, and check/acquire the lock first of all, before even calling prepare?)
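The suggestion above could look roughly like this. A hedged sketch, not juju's actual implementation: only NeedsGlobalMachineLock is named in the comment, so the Operation method set, the Executor type, and the toy runHookOp below are all illustrative assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// Operation is a hypothetical version of the interface fwereade describes:
// each operation declares up front whether it needs the machine-wide lock.
type Operation interface {
	NeedsGlobalMachineLock() bool
	Prepare() error
	Execute() error
	Commit() error
}

// Executor acquires the lock (when the operation asks for it) before even
// calling Prepare, so no operation can be observed half-prepared after a
// reboot.
type Executor struct {
	machineLock sync.Mutex // stands in for the global machine lock
}

func (e *Executor) Run(op Operation) error {
	if op.NeedsGlobalMachineLock() {
		e.machineLock.Lock()
		defer e.machineLock.Unlock()
	}
	if err := op.Prepare(); err != nil {
		return err
	}
	if err := op.Execute(); err != nil {
		return err
	}
	return op.Commit()
}

// runHookOp is a toy Operation that just records which phases ran.
type runHookOp struct{ phases []string }

func (o *runHookOp) NeedsGlobalMachineLock() bool { return true }
func (o *runHookOp) Prepare() error { o.phases = append(o.phases, "prepare"); return nil }
func (o *runHookOp) Execute() error { o.phases = append(o.phases, "execute"); return nil }
func (o *runHookOp) Commit() error  { o.phases = append(o.phases, "commit"); return nil }

func main() {
	e := &Executor{}
	op := &runHookOp{}
	if err := e.Run(op); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(op.phases) // [prepare execute commit]
}
```

Pushing the decision into the Operation interface keeps the lock-ordering logic in one place (the executor) instead of scattering it through individual hook runners, which is presumably why it was preferred over locking inside prepare itself.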

Changed in juju-core:
assignee: nobody → Gabriel Samfira (gabriel-samfira)
Changed in juju-core:
assignee: Gabriel Samfira (gabriel-samfira) → Bogdan Teleaga (bteleaga)
status: Triaged → In Progress
Revision history for this message
Bogdan Teleaga (bteleaga) wrote :
Changed in juju-core:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released