OS-charms should check for expected services/processes before setting workload status to a ready state.
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
nova-compute (Juju Charms Collection) | Fix Released | High | Alex Kavanagh | | |
Bug Description
OS-charms should check for expected services/processes before setting workload status to a ready state.
As of the 15.10 charms, workload status can be set to "Unit is Ready," even when a critical service has failed to start.
Taking it one step further: a hook should probably fail in those cases.
I've observed bug leaks in the following, where this type of sanity check within the charm would have raised red flags before charm commits or SRUs:
- nova-compute
- swift-*
- rabbitmq-server
This also impacts automation and testability of our charms in that:
1. The amulet tests, mojo spec tests, and other tests wait for the charm to advertise "I'm Ready" via workload status before commencing tests. Service checks in the Amulet tests will catch this leak, but other functional tests, which may not inspect or exercise all relevant processes, may not catch it.
2. Systems of automation, such as autopilot, mojo specs, and generic bundle deployment, would be better served by early failure, i.e. a failed hook or a not-ready service, before moving on to the next steps of the deployment automation.
This is targeted to the nova-compute charm for initial discussion. However, all OpenStack charms should be considered for this enhancement.
Related branches
- Liam Young (community): Approve
- David Ames (community): Approve
- Diff: 329 lines (+256/-2), 4 files modified
  charmhelpers/contrib/network/ip.py (+15/-0)
  charmhelpers/contrib/openstack/utils.py (+73/-2)
  tests/contrib/network/test_ip.py (+9/-0)
  tests/contrib/openstack/test_openstack_utils.py (+159/-0)
- David Ames (community): Approve
- Diff: 220 lines (+109/-5), 6 files modified
  charmhelpers/contrib/network/ip.py (+15/-0)
  charmhelpers/contrib/openstack/context.py (+5/-1)
  charmhelpers/contrib/openstack/templates/section-keystone-authtoken (+11/-0)
  charmhelpers/contrib/openstack/utils.py (+73/-2)
  hooks/keystone_utils.py (+2/-1)
  unit_tests/test_keystone_utils.py (+3/-1)
description: updated
Changed in nova-compute (Juju Charms Collection):
status: New → In Progress
Changed in nova-compute (Juju Charms Collection):
assignee: nobody → Alex Kavanagh (ajkavanagh)
milestone: none → 16.04
importance: Undecided → High
Changed in nova-compute (Juju Charms Collection):
status: Fix Committed → Fix Released
I've done some digging through four charms, and there appears to be a pattern (or perhaps the beginnings of one) that defines three useful functions: 'services()', 'restart_map()' and 'assess_status(configs)'.
The charmhelpers.core.host module provides a 'service_running(<service_name_string>)' function that returns True if the service is running and False otherwise.
The charmhelpers.core.host module also provides 'service(<action string>, <service name string>)', which uses the OS systemctl (systemd) or service commands to perform an action (start, stop, restart, etc.). This call blocks until the OS command finishes. Thus, a 'restart' or 'start' will either succeed or fail (quickly), unless the service fails later.
The proposal, therefore, is to either:
a) Modify assess_status(...) in all of the charms to call something like:
    import operator
    from functools import reduce
    all_running = reduce(operator.and_, [service_running(s) for s in services()], True)
    if not all_running:
        <set state to some failed state>
(Obviously, for efficiency, we might want to bail on the first 'not running' service, so we could rewrite that as a for loop with a break.)
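The early-exit variant could look roughly like the following. This is only a minimal sketch, not code from any charm: `first_not_running` is a hypothetical helper name, and the "is it running?" check is passed in as a callable so the example runs without charmhelpers installed. In a charm, that callable would presumably be `service_running` from charmhelpers.core.host, and the service list would come from the charm's own `services()`.

```python
def first_not_running(services, is_running):
    """Return the name of the first service that is not running.

    Stops at the first failure instead of checking every service.
    Returns None when every service reports as running.

    `is_running` is any callable taking a service name and returning
    True/False (an assumption for this sketch; a charm would likely
    pass charmhelpers.core.host.service_running here).
    """
    for name in services:
        if not is_running(name):
            return name
    return None


if __name__ == "__main__":
    # Fake service table standing in for the real OS query.
    fake_state = {"nova-compute": True, "libvirt-bin": False}
    failed = first_not_running(["nova-compute", "libvirt-bin"], fake_state.get)
    if failed is not None:
        print("service not running: {}".format(failed))
```

Returning the service name rather than a bare boolean means the failed state can say which service is down.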
OR
b) Change set_os_workload_status(...) to test for whether the services that should be running are running, and set a failed state if they are not. This would require a charm sync across all the charms, but might be simpler from a conceptual perspective.
However, I'm not sure (enough) how set_os_workload_status(...) is used to know whether this is a breaking change to how it was designed to be used.
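To make option (b) a bit more concrete, here is a rough sketch of the kind of check that could sit inside the status-setting path. Everything in it is an assumption for illustration: `checked_workload_state` is a hypothetical name, the choice of 'blocked' as the "failed state" is mine, and the running check is injected as a callable so the snippet runs standalone. It does not reflect how set_os_workload_status(...) is actually implemented.

```python
def checked_workload_state(state, message, services, is_running):
    """Return (state, message), downgraded if expected services are down.

    Only an 'active' state is overridden; any other state (for example
    'waiting' or 'maintenance') is passed through unchanged, since the
    charm is already reporting that it is not ready.
    """
    if state != "active":
        return state, message
    stopped = [name for name in services if not is_running(name)]
    if stopped:
        # 'blocked' is an assumed choice of failed state for this sketch.
        return "blocked", "Services not running: {}".format(", ".join(stopped))
    return state, message


if __name__ == "__main__":
    fake_state = {"nova-compute": False, "nova-api": True}
    print(checked_workload_state(
        "active", "Unit is ready", ["nova-compute", "nova-api"], fake_state.get))
```

Keeping the decision in a pure function like this, separate from the call that actually sets the status, would also make it easy to unit test without a Juju environment.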
Thoughts?