Service with just one unit left which doesn't think it's the leader

Bug #1488166 reported by Adam Collard
This bug affects 3 people
Affects: juju-core
Status: Triaged
Importance: High
Assigned to: William Reade

Bug Description

I'm not sure of the exact steps to reproduce this, but broadly it was:

1. juju deploy foo
2. // Oops, hit an error in install hook because I forgot some config
3. juju set foo bar=baz
4. juju resolved --retry foo/0
5. // Oops that still didn't work
6. juju destroy-unit foo/0
7. juju set foo bar=baz
8. juju add-unit foo --to lxc:0
9. // WTH, my service is behaving weirdly
10. juju run --service foo is-leader

Output:
False

Expected:
True
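
To tell "never elected" apart from "elected late", the check in step 10 can be polled with a deadline. The following is a minimal sketch, not part of the original report: `wait_for_leader` is a hypothetical helper, and it assumes the check command prints a bare `True` or `False`, as `juju run --service foo is-leader` does in the output above for a single-unit service.

```shell
#!/bin/sh
# wait_for_leader TIMEOUT INTERVAL CMD...
# Hypothetical helper: run CMD every INTERVAL seconds until it prints "True"
# or TIMEOUT seconds have passed. Returns 0 on success, 1 on timeout.
wait_for_leader() {
  timeout=$1
  interval=$2
  shift 2
  elapsed=0
  while :; do
    # Assumption: the command prints a bare True/False (single-unit service).
    if [ "$("$@" 2>/dev/null)" = "True" ]; then
      echo "leader elected after ~${elapsed}s"
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      break
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "no leader after ${timeout}s" >&2
  return 1
}

# Usage, against the service from the steps above:
#   wait_for_leader 60 5 juju run --service foo is-leader
```

The command is passed as arguments rather than hard-coded, so the same loop works with a longer timeout or a different unit.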

$ juju version
1.24.5-vivid-amd64

MAAS provider, using MAAS 1.8.1

Curtis Hovey (sinzui)
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.25.0
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.25-alpha1 → 1.25-beta1
Changed in juju-core:
assignee: nobody → William Reade (fwereade)
milestone: 1.25-beta1 → 1.25-beta2
William Reade (fwereade) wrote :

Did it stay false forever, or did it get elected within 60s? Would love to see logs if you can repro...

Alexis Bruemmer (alexis-bruemmer) wrote :

Please provide info requested in comment #1

Changed in juju-core:
status: Triaged → Incomplete
Adam Collard (adam-collard) wrote :

On Thu, 17 Sep 2015 at 13:30 William Reade <email address hidden>
wrote:

> Did it stay false forever, or did it get elected within 60s? Would love
> to see logs if you can repro...
>

It stayed false forever. I looked multiple times because it was very
surprising.

Changed in juju-core:
status: Incomplete → New
Curtis Hovey (sinzui)
Changed in juju-core:
status: New → Triaged
William Reade (fwereade)
Changed in juju-core:
status: Triaged → In Progress
William Reade (fwereade) wrote :

So, secure in the knowledge that it *can* happen, I'm still looking, but so far nothing indicates where the missing link is. If you ever come across it again, please:

  * set logging config to contain `juju.worker.leadership=DEBUG;juju.state.leadership=TRACE;state.lease=TRACE` in addition to whatever else you usually use
  * let it run for 60s or so to capture whatever's happening to begin with
  * bounce the affected unit agent, see if the unit gets elected (will happen within a few seconds or not at all)
  * bounce the state server, see if the unit gets elected (similarly)

...and post the logs for the state server and the unit agent to the bug.
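
The first item above can be sketched as a small helper that appends the three leadership loggers to whatever logging-config is already in use. This is an illustration under assumptions: `leadership_logging_config` is a hypothetical name, `<root>=WARNING` is only a placeholder for your current setting, and the printed command assumes Juju 1.x, where environment settings are changed with `juju set-env`.

```shell
#!/bin/sh
# Loggers and levels copied from the request above; nothing else is added.
LEADERSHIP_LOGGERS='juju.worker.leadership=DEBUG;juju.state.leadership=TRACE;state.lease=TRACE'

# Append the leadership loggers to an existing logging-config value.
# $1: the logging-config you already use (may be empty).
leadership_logging_config() {
  if [ -n "$1" ]; then
    printf '%s;%s\n' "$1" "$LEADERSHIP_LOGGERS"
  else
    printf '%s\n' "$LEADERSHIP_LOGGERS"
  fi
}

# Dry run: print the command instead of executing it.
# Assumption: Juju 1.x `juju set-env`; '<root>=WARNING' is a placeholder.
echo "juju set-env 'logging-config=$(leadership_logging_config '<root>=WARNING')'"
```

Printing the command rather than running it makes it easy to check the quoting before touching a live environment.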

description: updated
Changed in juju-core:
milestone: 1.25-beta2 → 1.25.1
Cheryl Jennings (cherylj) wrote :

If you can reproduce this, please enable DEBUG logging and send in machine and unit logs.

Changed in juju-core:
status: In Progress → Incomplete
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.25.1 → 1.25.2
Changed in juju-core:
milestone: 1.25.2 → 1.25.3
Caio Begotti (caio1982) wrote :

FWIW I just had this problem this morning with 1.24.6.1 (trusty), and it definitely took longer than 60s. I was removing and adding a few relations in the meantime, so perhaps that triggered some update in the agents; I'm not sure. Anyway, the leader status got updated after a while.

Björn Tillenius (bjornt) wrote :

Running the integration tests for the landscape-server charms using Juju 1.25.0 with the local provider reproduces this bug every time.

The tests start with one landscape-server unit, then add another, and then kill the leader. After the leader is killed, no new leader is elected. I waited at least 10 minutes without a new leader being elected.

Restarting the unit agent didn't do anything. Restarting the state server did result in a leader being elected.

I'm attaching the all-machines.log file and the unit-landscape-server-*.log files.
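
The order of checks tried here (restart the unit agent, re-check, restart the state server, re-check) can be written down as a dry-run checklist. This is only a sketch: `unit_tag` and `leadership_checklist` are hypothetical helpers, and the `jujud-<tag>` service names and `machine-0` as the state server are assumptions that may not hold on the local provider used in these tests. The commands are printed, not executed.

```shell
#!/bin/sh
# Convert a unit name to its agent tag,
# e.g. landscape-server/0 -> unit-landscape-server-0 (matches the log file names).
unit_tag() {
  printf 'unit-%s\n' "$(printf '%s' "$1" | tr '/' '-')"
}

# Print (do not run) the checks in the order tried above.
# Assumptions: agents run as services named jujud-<tag>, and the state
# server is machine 0. Adjust both for your provider.
leadership_checklist() {
  unit=$1
  echo "collect logs: all-machines.log and $(unit_tag "$unit").log"
  echo "restart unit agent: sudo service jujud-$(unit_tag "$unit") restart"
  echo "check: juju run --unit $unit is-leader"
  echo "restart state server: sudo service jujud-machine-0 restart"
  echo "check: juju run --unit $unit is-leader"
}

leadership_checklist landscape-server/0
```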

Björn Tillenius (bjornt) wrote :

The environment had logging-config=`juju.worker.leadership=DEBUG;juju.state.leadership=TRACE;state.lease=TRACE`

Changed in juju-core:
status: Incomplete → New
Changed in juju-core:
status: New → Triaged
tags: added: kanban-cross-team
tags: removed: kanban-cross-team
Andreas Hasenack (ahasenack) wrote :

Seems to be the same as https://bugs.launchpad.net/juju-core/+bug/1511659 where others have reproduced it quite reliably.

Ursula Junque (ursinha) wrote :

I can reproduce the issue consistently. Once I power off the leader, the other unit just won't show up as leader, even long after 60s.
This is with Juju 1.25.0 on wily.
