kill-controller is stuck, lots of "lease manager stopped" errors

Bug #1573136 reported by Casey Marshall
This bug affects 4 people
Affects         Status     Importance  Assigned to  Milestone
Canonical Juju  Expired    Medium      Unassigned
juju-core       Won't Fix  High        Unassigned
1.25            Won't Fix  Undecided   Unassigned

Bug Description

juju-2.0-beta5, lxd controller, lxd 2.0.0-0ubuntu4

`juju kill-controller` is stuck and not making progress in tearing down my models & controller.

Messages like this are written to machine-0.log on the controller every 3 seconds:

2016-04-21 16:47:47 ERROR juju.worker.dependency engine.go:526 "is-responsible-flag" manifold worker returned unexpected error: lease manager stopped
2016-04-21 16:47:50 ERROR juju.worker.dependency engine.go:526 "is-responsible-flag" manifold worker returned unexpected error: lease manager stopped
2016-04-21 16:47:53 ERROR juju.worker.dependency engine.go:526 "is-responsible-flag" manifold worker returned unexpected error: lease manager stopped
2016-04-21 16:47:56 ERROR juju.worker.dependency engine.go:526 "is-responsible-flag" manifold worker returned unexpected error: lease manager stopped
2016-04-21 16:47:59 ERROR juju.worker.dependency engine.go:526 "is-responsible-flag" manifold worker returned unexpected error: lease manager stopped
2016-04-21 16:48:02 ERROR juju.worker.dependency engine.go:526 "is-responsible-flag" manifold worker returned unexpected error: lease manager stopped
2016-04-21 16:48:05 ERROR juju.worker.dependency engine.go:526 "is-responsible-flag" manifold worker returned unexpected error: lease manager stopped
2016-04-21 16:48:08 ERROR juju.worker.dependency engine.go:526 "is-responsible-flag" manifold worker returned unexpected error: lease manager stopped
2016-04-21 16:48:11 ERROR juju.worker.dependency engine.go:526 "is-responsible-flag" manifold worker returned unexpected error: lease manager stopped
2016-04-21 16:48:14 ERROR juju.worker.dependency engine.go:526 "is-responsible-flag" manifold worker returned unexpected error: lease manager stopped
2016-04-21 16:48:17 ERROR juju.worker.dependency engine.go:526 "is-responsible-flag" manifold worker returned unexpected error: lease manager stopped
2016-04-21 16:48:20 ERROR juju.worker.dependency engine.go:526 "is-responsible-flag" manifold worker returned unexpected error: lease manager stopped
2016-04-21 16:48:23 ERROR juju.worker.dependency engine.go:526 "is-responsible-flag" manifold worker returned unexpected error: lease manager stopped
2016-04-21 16:48:26 ERROR juju.worker.dependency engine.go:526 "is-responsible-flag" manifold worker returned unexpected error: lease manager stopped

I'm going to nuke the lxc containers and try again with master to see if the issue has been fixed...

Cheryl Jennings (cherylj) wrote :

I saw the same problem when trying to clean up the environment in bug #1572237, and I think this is the same underlying issue. I think for that bug, the fix William is working on is specific to the pinger, but there is a more general issue of sub-state workers not being managed workers yet. We may use this bug for that issue.

There's also bug #1566426, which asks for kill-controller to time out when destroying through the API in cases like this, where the environment is broken.

Changed in juju-core:
status: New → Triaged
importance: Undecided → High
milestone: none → 2.0-rc1
summary: - kill-controller is stuck
+ kill-controller is stuck, lots of "lease manager stopped" errors
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 2.0-beta6 → 2.0-beta7
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 2.0-beta7 → 2.0-beta8
Cheryl Jennings (cherylj) wrote :

Looks like William has PRs for this in master:
https://github.com/juju/juju/pull/5355
https://github.com/juju/juju/pull/5367

He's evaluating the risk of backporting to 1.25.

Changed in juju-core:
status: Triaged → Fix Committed
assignee: nobody → William Reade (fwereade)
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released
Anastasia (anastasia-macmood) wrote :

@William - have you considered whether the fix can be backported to 1.25 as per comment #2?

affects: juju-core → juju
Changed in juju:
milestone: 2.0-beta8 → none
milestone: none → 2.0-beta8
Changed in juju-core:
importance: Undecided → High
status: New → Triaged
Changed in juju-core:
status: Triaged → Won't Fix
Anastasia (anastasia-macmood) wrote :

From William:

It should be possible [to backport], yes.
Worst case it wants to run some worker that 1.25 doesn't have, but that should be trivial to change.

Adam Stokes (adam-stokes) wrote :

I've seen this on juju 2 beta17 with the same lease manager errors. The agents end up in a blocked state, with the agent status reported as failed.

Bouncing jujud on the controller unwedges this and my deployments will continue.

tags: added: conjure
Changed in juju:
status: Fix Released → Triaged
milestone: 2.0-beta8 → 2.0-rc1
assignee: William Reade (fwereade) → nobody
importance: High → Medium
Changed in juju:
milestone: 2.0-rc1 → 2.0.1
Curtis Hovey (sinzui)
Changed in juju:
milestone: 2.0.1 → none
Canonical Juju QA Bot (juju-qa-bot) wrote :

This bug has not been updated in 5 years, so we're marking it Expired. If you believe this is incorrect, please update the status.

Changed in juju:
status: Triaged → Expired
tags: added: expirebugs-bot