deep_compare misses changes under certain conditions
Bug #1797542 reported by
Michele Baldessari
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
puppet-pacemaker |
Fix Released
|
High
|
Michele Baldessari |
Bug Description
We currently run a crm_diff --cib -o orig -n new and then parse the diff xml to detect if we need to change a resource. Namely we check on the following xpath:
"/diff/
If we do find 'change' entries which contain an 'operation' attribute and the resource name is in the diff *and* it ends in /change-result we deem the resource needs an update.
This seems to be not always the case and it seems that when the number of changes in a bind mount list is more than one, the '/change-result' part is a section on its own.
Changed in puppet-pacemaker: | |
status: | New → Triaged |
importance: | Undecided → High |
Changed in puppet-pacemaker: | |
assignee: | nobody → Michele Baldessari (michele) |
status: | Triaged → In Progress |
To post a comment you must log in.
Reviewed: https:/ /review. openstack. org/609998 /git.openstack. org/cgit/ openstack/ puppet- pacemaker/ commit/ ?id=90033be6533 218efd93182c3dc f6c883b03c95d3
Committed: https:/
Submitter: Zuul
Branch: master
commit 90033be6533218e fd93182c3dcf6c8 83b03c95d3
Author: Michele Baldessari <email address hidden>
Date: Fri Oct 26 15:37:17 2018 +0200
Improve deep_compare detection
In the initial implementation of resource updates in puppet-pacemaker,
in order to detect if a resource changed we did the following:
1) dump the cib and make a copy
2) remove the resource from the dump
3) readd the resource with current parameters
4) via crm_diff --cib we would detect if there were any changes related
to the resource by checking a specific xpath on the the diff
5) If 4) matched the xpath we would push the newly created CIB back to
the cluster
This approach has a number of drawbacks:
A) In case of bundles it would remove the resource running inside the
bundle for a very brief amount of time
B) pcs would remove all constraints related to the resource
automatically, so we had to recreate them forcefully during the update
operation and then update the CIB
C) We sometimes missed to trigger a resource update because the XPATH on
4) was not always matching and so we missed updating a resource from
time to time.
We now change approach slightly to make it more robust. Namely,
instead of removing a resource from the offline CIB we actually just
update it via pcs update commands on the offline CIB. In the case of
bundles we need to actually remove all currently existing storage maps
and then add the newly defined set, because pcs does not allow to update
a resource in a declarative way [1]. We also extend the xpath at C) to match
most things and also be careful about not matching empty meta attributes
which get added by pcs due to a bug [2].
This new approach solves A) and B) and is also more precise and solve C)
fully as well.
Given the complexity of the changes here is how I tested this change: has_changed (ng version) returned true for resource rabbitmq-bundle
1) 1 mount point added
Oct 25 14:46:31 controller-0 journal: Debug: pcmk_resource_
and rabbitmq-bundle was correctly restarted with the correct bind mounts
2) 1 mount point removed has_changed (ng version) returned true for resource rabbitmq-bundle
Oct 25 15:56:38 controller-0 journal: Debug: pcmk_resource_
and rabbitmq-bundle was correctly restarted with the correct bind mounts
3) 3 mount points added has_changed (ng version) returned true for resource rabbitmq-bundle
Oct 26 01:46:42 controller-0 journal: Debug: pcmk_resource_
and rabbitmq-bundle was correctly restarted with the correct bind mounts
4) 2 mount points removed has_changed (ng version) returned true for resource rabbitmq-bundle
Oct 26 02:42:07 controller-0 journal: Debug: pcmk_resource_
and rabbitmq-bundle was correctly restarted with the correct bind mounts
5) 1 mount point added plus replicas number changed
Correctly get a reduced rep...