HA: restarting control plane nodes in parallel creates unexpected pacemaker shutdown
Bug #1904193 reported by Damien Ciabrini
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | In Progress | High | Damien Ciabrini |
Bug Description
Originally reported in [1].
A minor update is a sequential process: each node of the control plane
must be updated in turn.
But in a composable HA control plane made up of several roles (DB, API, Messaging...),
nothing prevents each role from running its minor update in parallel with the other roles.
Unfortunately, this has a side effect in pacemaker: stopping several pacemaker
nodes concurrently can sometimes disrupt the election of a new DC and cause a spurious stop
on the affected pacemaker node. That node then stays down and prevents the usual
minor update workflow from finishing successfully.
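
To make the scheduling problem concrete, here is a rough Python sketch of which nodes end up being stopped at the same time when every role updates its own nodes one by one but the roles run independently. The role names, node names, and the `concurrent_stop_batches` helper are all hypothetical, and the model assumes the roles advance in lockstep, which a real deployment does not guarantee. It is only an illustration of the overlap, not TripleO code.

```python
from itertools import zip_longest


def concurrent_stop_batches(roles):
    """Return, for each update step, the nodes restarted at the same time.

    Each role walks its own node list sequentially, but nothing orders
    the roles relative to each other, so step i touches the i-th node
    of every role (simplifying assumption: roles advance in lockstep).
    """
    batches = []
    for step in zip_longest(*roles.values()):
        # Roles with fewer nodes contribute nothing to later steps.
        batches.append([node for node in step if node is not None])
    return batches


# Hypothetical composable HA layout; names are made up for illustration.
roles = {
    "Database": ["db-0", "db-1", "db-2"],
    "Controller": ["api-0", "api-1", "api-2"],
    "Messaging": ["msg-0", "msg-1", "msg-2"],
}

batches = concurrent_stop_batches(roles)
for i, batch in enumerate(batches):
    print(f"step {i}: stopping {batch} concurrently")
```

In this model, each step stops one pacemaker node per role at once, which is the kind of concurrent shutdown described above as able to disturb the DC election.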
Changed in tripleo: | |
status: | Confirmed → In Progress |
Changed in tripleo: | |
importance: | Undecided → High |
Changed in tripleo: | |
milestone: | wallaby-1 → wallaby-2 |
Changed in tripleo: | |
milestone: | wallaby-2 → wallaby-3 |
Changed in tripleo: | |
milestone: | wallaby-3 → wallaby-rc1 |
Changed in tripleo: | |
milestone: | wallaby-rc1 → xena-1 |
Changed in tripleo: | |
milestone: | xena-1 → xena-2 |
Changed in tripleo: | |
milestone: | xena-2 → xena-3 |
This issue was fixed in the openstack/tripleo-heat-templates 14.0.0 release.