Minor update of HA services doesn't restart containers on config change
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
tripleo | Fix Released | High | Damien Ciabrini | |
Bug Description
On an HA overcloud, there are three different reasons a pacemaker-managed container may need to be restarted:
1. a container image update
2. a tripleo service config change (in /var/lib/
3. a pacemaker resource config update (i.e. pcs resource update <...>)
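To make the three triggers concrete, here is a minimal sketch of how one could detect which of them applies to a given container. It is not tripleo code: `ContainerState`, `config_digest` and `restart_reasons` are hypothetical names, and hashing the config files is only an assumed way of spotting a config change.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ContainerState:
    """Hypothetical snapshot of what a pacemaker-managed container runs with."""
    image_id: str                               # container image in use
    config_hash: str                            # digest of the generated service config
    resource_options: Tuple[Tuple[str, str], ...]  # pacemaker resource settings


def config_digest(files: Dict[str, bytes]) -> str:
    """Return a stable SHA-256 digest over a set of config files (path -> content)."""
    digest = hashlib.sha256()
    for path in sorted(files):  # sort so the digest does not depend on dict order
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(files[path])
        digest.update(b"\0")
    return digest.hexdigest()


def restart_reasons(old: ContainerState, new: ContainerState) -> List[str]:
    """List which of the three cases from the description require a restart."""
    reasons = []
    if old.image_id != new.image_id:
        reasons.append("image update")                      # case 1
    if old.config_hash != new.config_hash:
        reasons.append("service config change")             # case 2
    if old.resource_options != new.resource_options:
        reasons.append("pacemaker resource config update")  # case 3
    return reasons
```

An empty list means none of the three cases applies, so no restart would be needed for that container.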
Case 1 has to take place during a minor update workflow (i.e. not during a stack redeploy/update), because it requires a coordinated action. In this workflow, various ansible tasks are executed sequentially, one controller after the other. The sequential nature of the workflow ensures that the image update is coordinated across the entire cluster.
Cases 2 and 3 are handled a bit differently depending on the action (stack update or minor update workflow):
Stack update:
docker-puppet regenerates the service configs on all the controller nodes. Then:
. a special transient container <service>
. another special container <service>
Minor update:
The pacemaker cluster is restarted on each controller node, sequentially. This guarantees that all pacemaker-managed containers are restarted unconditionally and without service disruption (services restart on one node at a time, so there are always two controller nodes available).
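The one-node-at-a-time restart described above can be sketched as follows. This is an illustration only, assuming a three-controller cluster; `rolling_restart` and its `stop`/`start` callbacks are hypothetical stand-ins for whatever actually stops and starts the cluster on a node, not the real minor update tasks.

```python
from typing import Callable, List


def rolling_restart(
    nodes: List[str],
    stop: Callable[[str], None],
    start: Callable[[str], None],
    min_available: int = 2,
) -> None:
    """Restart nodes one after the other, never dropping below min_available."""
    online = set(nodes)
    for node in nodes:
        # Refuse to proceed if taking this node down would leave too few nodes up.
        if len(online) - 1 < min_available:
            raise RuntimeError(
                "stopping %s would leave fewer than %d nodes online"
                % (node, min_available)
            )
        stop(node)
        online.discard(node)
        start(node)  # the node must be back before the next one is touched
        online.add(node)


if __name__ == "__main__":
    events = []
    rolling_restart(
        ["controller-0", "controller-1", "controller-2"],
        stop=lambda n: events.append(("stop", n)),
        start=lambda n: events.append(("start", n)),
    )
    for action, name in events:
        print(action, name)
```

Because each node is started again before the next one is stopped, at most one controller is down at any time, which is what keeps two of the three available.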
Thanks to the unconditional restart, we avoid running container <service>
However, the approach of "skipping <service>
[1] https:/
tags: | added: queens-backport-potential |
Fix proposed to branch: master
Review: https://review.opendev.org/679102