HA tests fail after the leader is deleted
Bug #1640535 reported by
Curtis Hovey
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Canonical Juju |
Fix Released
|
Critical
|
Horacio Durán |
Bug Description
As seen in
http://
Juju CI started seeing systemic HA failures about Friday Nov 3. The failures were in AWS, Rackspace, Azure, and Prodstack. The failures were across trusty and xenial controllers. After the leader is deleted, juju status fails. We expect juju to fall back to another controller.
The first signs of the problem are in
http://
http://
After a change to the test to ensure lingering procs, we see a consistent timeout waiting for a new controller to answer the clients call to status.
http://
Changed in juju: | |
assignee: | James Tunnicliffe (dooferlad) → Richard Harding (rharding) |
description: | updated |
Changed in juju: | |
milestone: | 2.1-beta1 → 2.1-beta2 |
Changed in juju: | |
assignee: | Richard Harding (rharding) → Horacio Durán (hduran-8) |
Changed in juju: | |
status: | Triaged → In Progress |
Changed in juju: | |
milestone: | 2.1-beta2 → none |
Changed in juju: | |
milestone: | none → 2.1-rc1 |
Changed in juju: | |
status: | In Progress → Fix Committed |
Changed in juju: | |
milestone: | 2.1-rc1 → 2.1-beta3 |
Changed in juju: | |
status: | Fix Committed → Fix Released |
To post a comment you must log in.
While a prodstack test was failing, I was able to confirm on the machine that juju status failed. I see the three controllers that "nova list" show in the api-endpoints in JUJU_DATA dir setup for the test.
$ cat /var/lib/ jenkins/ cloud-city/ jes-homes/ functional- ha-recovery- prodstack/ controllers. yaml ha-recovery- prodstack: api-endpoints: ['10.25. 28.67:17070' , '10.25.28.7:17070', '10.25. 28.70:17070' ] 8236-4984- 8d91-7be04f1e47 32 28.67:17070' , '10.25.28.7:17070', '10.25. 28.70:17070' ] machine- count: 1 controller- machine- count: 0 ha-recovery- prodstack
controllers:
functional-
unresolved-
uuid: 60807685-
api-endpoints: ['10.25.
ca-cert: |
-----BEGIN CERTIFICATE-----
CERT
-----END CERTIFICATE-----
cloud: prodstack45
region: bootstack-ps45
agent-version: 2.1-beta1
controller-
active-
model-count: 2
machine-count: 1
current-controller: functional-