Comment 3 for bug 1773754

OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (master)

Reviewed: https://review.openstack.org/569578
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=5968aeb3207b4c42cc65eb7a7d3a831c4c28d456
Submitter: Zuul
Branch: master

commit 5968aeb3207b4c42cc65eb7a7d3a831c4c28d456
Author: Michele Baldessari <email address hidden>
Date: Sat May 19 16:55:35 2018 +0200

    Make sure remotes are fully up before proceeding

    We currently rely on 'verify_on_create => true' to make
    sure that pacemaker remotes are up before proceeding to Step2 (during
    which a remote node is entitled to run pcs commands).
    If a remote is still not fully up at that point, pcs commands can
    fail on the remote nodes with errors like:

    Error: /Stage[main]/Tripleo::Profile::Pacemaker::Compute_instanceha
           /Pacemaker::Property[compute-instanceha-role-node-property]
           /Pcmk_property[property-overcloud-novacomputeiha-0-compute-instanceha-role]:
    Could not evaluate: backup_cib: Running: /usr/sbin/pcs cluster cib
    /var/lib/pacemaker/cib/puppet-cib-backup20180519-20162-ekt31x failed with code: 1 ->

    The semantics of 'verify_on_create => true' are currently incorrect,
    as it does not really wait for a single resource to be fully up.
    Since implementing that properly will take quite a bit of work
    (given that pcs does not currently support single-resource state
    polling), for now we avoid using verify_on_create and simply make
    sure the resource is started via an exec.

    Ran 25 successful deployments with this (and the depends-on) patch.

    Closes-Bug: #1773754
    Depends-On: I74994a7e52a7470ead7862dd9083074f807f7675
    Change-Id: I9e5d5bb48fc7393df71d8b9eae200ad4ebaa6aa6
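
For readers unfamiliar with the pattern, "make sure the resource is started via an exec" amounts to retrying a status check until it succeeds or a retry budget runs out. The sketch below illustrates that idea only; it is not the code from the patch. The function name wait_until_started and the injected is_started callable are hypothetical, and the tries/try_sleep parameter names are merely borrowed from the retry attributes of Puppet's exec type. What the actual check command looks like is left to the caller.

```python
import time
from typing import Callable


def wait_until_started(
    is_started: Callable[[], bool],
    tries: int = 30,
    try_sleep: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_started() up to `tries` times, pausing between attempts.

    is_started is a hypothetical caller-supplied check, for example a
    wrapper that runs a cluster status command and inspects its output.
    Returns True as soon as the check passes, False if every attempt fails.
    """
    for attempt in range(1, tries + 1):
        if is_started():
            return True
        # Do not sleep after the final attempt; just report failure.
        if attempt < tries:
            sleep(try_sleep)
    return False
```

A caller would block on wait_until_started(check) before running any further pcs commands, and treat a False result as a deployment failure rather than proceeding to the next step.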