APP status remains "active" even after agent is lost
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Canonical Juju | Fix Released | High | Unassigned | |
Bug Description
How to reproduce:
1. deploy one service
2. forcibly stop the machine hosting the service (lxc stop --force or IPMI force power-off)
Expected:
APP status changes from "active" to something indicating an error
Actual:
APP status remains "active"
$ juju status mysql
MODEL CONTROLLER CLOUD/REGION VERSION
default localhost lxd/localhost 2.0-beta16
APP VERSION STATUS EXPOSED ORIGIN CHARM REV OS
mysql active false jujucharms percona-cluster 2 ubuntu
RELATION PROVIDES CONSUMES TYPE
shared-db glance mysql regular
shared-db keystone mysql regular
cluster mysql mysql peer
shared-db mysql neutron-api regular
shared-db mysql nova-cloud-
UNIT WORKLOAD AGENT MACHINE PUBLIC-ADDRESS PORTS MESSAGE
mysql/0 unknown lost 2 10.0.8.122 agent is lost, sorry! See 'juju status-history mysql/0'
MACHINE STATE DNS INS-ID SERIES AZ
2 started 10.0.8.122 juju-4a9c27-2 xenial
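The inconsistency above (app reported "active" while its only unit's agent is "lost") can be detected programmatically from `juju status --format=json`. The sketch below is illustrative, not part of the bug report: the sample JSON is a hand-trimmed, hypothetical excerpt, and the key names (`applications`, `application-status`, `units`, `juju-status`) follow the Juju 2.x JSON status schema as I understand it.

```python
import json

# Hypothetical, trimmed excerpt of `juju status --format=json` output,
# mirroring the situation in the pasted status table above.
STATUS_JSON = """
{
  "applications": {
    "mysql": {
      "application-status": {"current": "active"},
      "units": {
        "mysql/0": {
          "workload-status": {"current": "unknown"},
          "juju-status": {"current": "lost"}
        }
      }
    }
  }
}
"""

def find_inconsistent_apps(status):
    """Return apps reported 'active' while at least one unit agent is 'lost'."""
    bad = []
    for app_name, app in status.get("applications", {}).items():
        app_current = app.get("application-status", {}).get("current")
        agent_states = [u.get("juju-status", {}).get("current")
                        for u in app.get("units", {}).values()]
        if app_current == "active" and "lost" in agent_states:
            bad.append(app_name)
    return bad

print(find_inconsistent_apps(json.loads(STATUS_JSON)))  # ['mysql']
```

With the fix released, the application status should no longer stay "active" once every unit agent is lost, so a check like this would come back empty.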
Changed in juju:
status: New → Triaged
importance: Undecided → High
milestone: none → 2.0-beta18

Changed in juju:
milestone: 2.0-beta18 → 2.0-beta19

Changed in juju:
milestone: 2.0-beta19 → 2.0-rc1

Changed in juju:
milestone: 2.0-rc1 → 2.0-rc2

Changed in juju:
milestone: 2.0-rc2 → 2.1.0

Changed in juju:
status: Fix Committed → Fix Released
This issue has since been addressed as part of other observability-related work.
I recently tested it with AWS:
* bootstrap;
* deploy a charm with some units;
* stop a unit's machine via the CLI (outside of Juju);
* watch juju status.
Machine status was updated to 'down'.