juju.state.leadership manager.go:72 stopping leadership manager with error: state changing too quickly; try again soon
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
juju-core | Fix Released | High | Andrew Wilkins |
Bug Description
[Environment]
1.25.13
Trusty
[Description]
Juju operates normally until the following entry appears in the logs; from that
point on it is no longer possible to operate Juju (deploy/
machine-0: 2017-11-03 19:02:59 ERROR juju.state.
Preceded by the following sequence:
machine-0: 2017-11-03 19:02:59 TRACE state.lease.
machine-0: 2017-11-03 19:02:59 TRACE state.lease.
machine-0: 2017-11-03 19:02:59 DEBUG juju.state.
machine-0: 2017-11-03 19:02:59 TRACE juju.state.
machine-0: 2017-11-03 19:02:59 TRACE state.lease.
machine-0: 2017-11-03 19:02:59 TRACE juju.state.
machine-0: 2017-11-03 19:02:59 TRACE state.lease.
machine-0: 2017-11-03 19:02:59 TRACE juju.state txns.go:164 rewrote transaction: []txn.Op{
exists":false}}}}, Insert:interface {}(nil), Update:
509735778784391
The only possible workaround is to restart jujud on machine-0.
tags: | added: sts |
Changed in juju-core: | |
status: | New → Triaged |
importance: | Undecided → High |
assignee: | nobody → Andrew Wilkins (axwalk) |
Changed in juju-core: | |
status: | Triaged → In Progress |
milestone: | none → 1.25.14 |
Changed in juju-core: | |
status: | Fix Committed → Fix Released |
The lease docs and transactions look OK, so I suspect there really is a lot of contention on the lease docs.
The Juju 1.25 branch has a few issues that would cause this:
- when the leadership manager dies, it remains dead and is not restarted automatically
- there are potentially many leadership managers running concurrently, within the same process, trying to maintain leases
It's possible that those many leadership managers are each making changes to the same clock document; there's a single clock document for all applications/services.
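To illustrate how contention on one shared document can surface as the error in this bug's title, here is a minimal sketch of a bounded optimistic-update loop. It is an assumption-laden model, not Juju's lease code: `clockDoc`, `store`, `updateClock` and `errExcessiveContention` are hypothetical names, the store is in-memory, and the `interfere` hook simulates other managers writing between this writer's read and write.

```go
package main

import (
	"errors"
	"fmt"
)

// errExcessiveContention reuses the message from the bug title; the
// variable name is hypothetical.
var errExcessiveContention = errors.New("state changing too quickly; try again soon")

// clockDoc stands in for the single shared clock document.
type clockDoc struct {
	revision int
	value    int64
}

// store is an in-memory stand-in for the database.
type store struct {
	doc clockDoc
}

func (s *store) read() clockDoc { return s.doc }

// compareAndSwap writes value only if the document still has the revision
// the caller saw when it read.
func (s *store) compareAndSwap(seen clockDoc, value int64) bool {
	if s.doc.revision != seen.revision {
		return false
	}
	s.doc = clockDoc{revision: seen.revision + 1, value: value}
	return true
}

// updateClock tries up to maxAttempts times to write value. interfere, if
// non-nil, runs between read and write to simulate a competing manager.
func updateClock(s *store, value int64, maxAttempts int, interfere func(attempt int)) error {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		seen := s.read()
		if interfere != nil {
			interfere(attempt)
		}
		if s.compareAndSwap(seen, value) {
			return nil
		}
	}
	return errExcessiveContention
}

func main() {
	s := &store{}
	// No competing writers: the first attempt succeeds.
	fmt.Println(updateClock(s, 100, 3, nil))
	// A competitor bumps the document before every write: all attempts fail.
	busy := func(int) { s.doc.revision++ }
	fmt.Println(updateClock(s, 200, 3, busy))
}
```

In this model a single writer always succeeds, while a writer that is beaten on every attempt exhausts its retries and returns the error, even though the document itself is never corrupted. That would be consistent with the lease docs and transactions looking OK.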