resource op defaults and any pacemaker property are not guaranteed to be created before pcmk resources

Bug #1948032 reported by Michele Baldessari
This bug affects 1 person
Affects   Status         Importance   Assigned to          Milestone
tripleo   Fix Released   High         Michele Baldessari

Bug Description

Seen on https://b73db77100b67e8785aa-cbf7b900962d6b77d5144bf757270132.ssl.cf5.rackcdn.com/814758/1/gate/tripleo-ci-centos-8-scenario001-standalone/a90bfcc/logs/undercloud/var/log/pacemaker/pacemaker.log

Galera could be restarted during the deployment:
> Oct 21 08:28:37 standalone.localdomain pacemaker-execd [77107]
> (child_timeout_callback) warning: galera-bundle-podman-0_stop_0
> process (PID 131066) timed out
> Oct 21 08:28:37 standalone.localdomain pacemaker-execd [77107]
> (operation_finished) warning: galera-bundle-podman-0_stop_0[131066]
> timed out after 20000ms
> Oct 21 08:28:37 standalone.localdomain pacemaker-execd [77107]
> (log_finished) info: galera-bundle-podman-0 stop (call 22, PID
> 131066) exited with status 1 (execution time 20003ms, queue time 0ms)

Since podman can be very slow, if the resource op default timeout has not yet been raised to 120s, the stop operation hits the default 20s timeout, as shown above.

In fact, the 120s default is only set about three minutes later:
Oct 21 08:31:28 standalone.localdomain pacemaker-based [77105]
(cib_perform_op) info: ++ <nvpair
id="op_defaults-meta_attributes-timeout" name="timeout" value="120s"/>
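
To check other pacemaker.log files for the same race, something like the following could be used. This is only a rough sketch: the function names, the regular expressions and the assumed year (the log lines carry no year) are mine and not part of any tripleo or pacemaker tooling, and the two sample lines are the ones quoted above, joined onto single lines.

```python
import re
from datetime import datetime

# The two log lines quoted in this report, each joined onto a single line.
LOG = '''\
Oct 21 08:28:37 standalone.localdomain pacemaker-execd [77107] (operation_finished) warning: galera-bundle-podman-0_stop_0[131066] timed out after 20000ms
Oct 21 08:31:28 standalone.localdomain pacemaker-based [77105] (cib_perform_op) info: ++ <nvpair id="op_defaults-meta_attributes-timeout" name="timeout" value="120s"/>
'''

# Hypothetical patterns, written against the line shapes shown in this report.
STAMP = re.compile(r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) ")
TIMED_OUT = re.compile(r"warning: (\S+?)\[\d+\] timed out after (\d+)ms")
OP_DEFAULT = 'id="op_defaults-meta_attributes-timeout"'


def parse_stamp(line, year=2021):
    """Return the line's timestamp; the log omits the year, so it is assumed."""
    match = STAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")


def timeouts_before_op_defaults(log_text):
    """List (operation, seconds_before_default_was_set) for early timeouts."""
    lines = log_text.splitlines()
    # First CIB addition ("++") of the op_defaults timeout nvpair, if any.
    default_set_at = next(
        (parse_stamp(l) for l in lines if OP_DEFAULT in l and "++" in l), None
    )
    if default_set_at is None:
        return []
    early = []
    for line in lines:
        hit = TIMED_OUT.search(line)
        stamp = parse_stamp(line)
        if hit and stamp and stamp < default_set_at:
            gap = int((default_set_at - stamp).total_seconds())
            early.append((hit.group(1), gap))
    return early


if __name__ == "__main__":
    for operation, gap in timeouts_before_op_defaults(LOG):
        print(f"{operation} timed out {gap}s before op_defaults timeout was set")
```

On the two lines above this reports galera-bundle-podman-0_stop_0 timing out 171 seconds before the op_defaults timeout was written to the CIB.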

OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (master)
Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (master)

Reviewed: https://review.opendev.org/c/openstack/puppet-tripleo/+/815015
Committed: https://opendev.org/openstack/puppet-tripleo/commit/e4e9be76dc3f6f1a1ee3f79e05d5167eddc34563
Submitter: "Zuul (22348)"
Branch: master

commit e4e9be76dc3f6f1a1ee3f79e05d5167eddc34563
Author: Michele Baldessari <email address hidden>
Date: Thu Oct 21 16:49:45 2021 +0200

    Make sure resource_op_defaults are set before bundles

    See LP#1948032. We set a higher default resource op time out because
    podman often times out within the default 20s. We need to guarantee
    that this timeout is set *before* we create any bundles otherwise
    it might be too late and we'd fail a deployment like described in the
    LP.

    Change-Id: Ic6ac8d21d12389e70cd382018290324d3ae948ce
    Closes-Bug: #1948032

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/puppet-tripleo/+/814940

OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/puppet-tripleo/+/814940
Committed: https://opendev.org/openstack/puppet-tripleo/commit/fbdc1403bf5dd7e91b4c93a631346cd8cde5b43f
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit fbdc1403bf5dd7e91b4c93a631346cd8cde5b43f
Author: Michele Baldessari <email address hidden>
Date: Thu Oct 21 16:49:45 2021 +0200

    Make sure resource_op_defaults are set before bundles

    See LP#1948032. We set a higher default resource op time out because
    podman often times out within the default 20s. We need to guarantee
    that this timeout is set *before* we create any bundles otherwise
    it might be too late and we'd fail a deployment like described in the
    LP.

    Change-Id: Ic6ac8d21d12389e70cd382018290324d3ae948ce
    Closes-Bug: #1948032
    (cherry picked from commit e4e9be76dc3f6f1a1ee3f79e05d5167eddc34563)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/puppet-tripleo/+/815361

OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/puppet-tripleo/+/815436

OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/puppet-tripleo/+/815437

OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/puppet-tripleo/+/815436
Committed: https://opendev.org/openstack/puppet-tripleo/commit/d7e5ad5251cdafbc93ed75d14cdef2b8d63ac0f9
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit d7e5ad5251cdafbc93ed75d14cdef2b8d63ac0f9
Author: Michele Baldessari <email address hidden>
Date: Thu Oct 21 16:49:45 2021 +0200

    Make sure resource_op_defaults are set before bundles

    See LP#1948032. We set a higher default resource op time out because
    podman often times out within the default 20s. We need to guarantee
    that this timeout is set *before* we create any bundles otherwise
    it might be too late and we'd fail a deployment like described in the
    LP.

    Change-Id: Ic6ac8d21d12389e70cd382018290324d3ae948ce
    Closes-Bug: #1948032
    (cherry picked from commit e4e9be76dc3f6f1a1ee3f79e05d5167eddc34563)
    (cherry picked from commit fbdc1403bf5dd7e91b4c93a631346cd8cde5b43f)

tags: added: in-stable-ussuri
tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (stable/train)

Reviewed: https://review.opendev.org/c/openstack/puppet-tripleo/+/815437
Committed: https://opendev.org/openstack/puppet-tripleo/commit/b604400917a392a437fb79e341da347e0962a363
Submitter: "Zuul (22348)"
Branch: stable/train

commit b604400917a392a437fb79e341da347e0962a363
Author: Michele Baldessari <email address hidden>
Date: Thu Oct 21 16:49:45 2021 +0200

    Make sure resource_op_defaults are set before bundles

    See LP#1948032. We set a higher default resource op time out because
    podman often times out within the default 20s. We need to guarantee
    that this timeout is set *before* we create any bundles otherwise
    it might be too late and we'd fail a deployment like described in the
    LP.

    Change-Id: Ic6ac8d21d12389e70cd382018290324d3ae948ce
    Closes-Bug: #1948032
    (cherry picked from commit e4e9be76dc3f6f1a1ee3f79e05d5167eddc34563)
    (cherry picked from commit fbdc1403bf5dd7e91b4c93a631346cd8cde5b43f)

OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/puppet-tripleo/+/815361
Committed: https://opendev.org/openstack/puppet-tripleo/commit/f472e5f795e72bd9de32d302e531e7ea7b309240
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit f472e5f795e72bd9de32d302e531e7ea7b309240
Author: Michele Baldessari <email address hidden>
Date: Thu Oct 21 16:49:45 2021 +0200

    Make sure resource_op_defaults are set before bundles

    See LP#1948032. We set a higher default resource op time out because
    podman often times out within the default 20s. We need to guarantee
    that this timeout is set *before* we create any bundles otherwise
    it might be too late and we'd fail a deployment like described in the
    LP.

    Change-Id: Ic6ac8d21d12389e70cd382018290324d3ae948ce
    Closes-Bug: #1948032
    (cherry picked from commit e4e9be76dc3f6f1a1ee3f79e05d5167eddc34563)
    (cherry picked from commit fbdc1403bf5dd7e91b4c93a631346cd8cde5b43f)

tags: added: in-stable-victoria
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 12.7.1

This issue was fixed in the openstack/puppet-tripleo 12.7.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 16.1.0

This issue was fixed in the openstack/puppet-tripleo 16.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 13.7.0

This issue was fixed in the openstack/puppet-tripleo 13.7.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo train-eol

This issue was fixed in the openstack/puppet-tripleo train-eol release.
