crm_node -n needlessly called via facter

Bug #1782231 reported by Michele Baldessari
Affects: puppet-pacemaker
Status: Fix Released
Importance: Undecided
Assigned to: Michele Baldessari

Bug Description

In lib/facter/pacemaker_node_name.rb we do the following:
Facter.add('pacemaker_node_name') do
  setcode do
    Facter::Core::Execution.exec 'crm_node -n'
  end
end

This is problematic because, starting with pacemaker 1.1.19, crm_node -n triggers
a newly introduced CRM_OP_NODE_INFO query, which hangs if the container runs 1.1.19 code and the cluster runs 1.1.18 code.

Let's simply avoid running this query when facter runs inside a container.
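
A minimal sketch of the proposed guard, in plain Ruby so it can be run without facter installed. The marker files used to detect a container (/.dockerenv and /run/.containerenv) and the helper names in_container? and pacemaker_node_name are assumptions for illustration; the merged change may detect containers differently.

```ruby
# Hypothetical marker files that commonly indicate a container runtime.
# This detection method is an assumption, not taken from the actual fix.
CONTAINER_MARKERS = ['/.dockerenv', '/run/.containerenv'].freeze

# Returns true if any marker file exists. The existence check is
# injectable so the logic can be exercised without touching the filesystem.
def in_container?(markers = CONTAINER_MARKERS, exists: File.method(:exist?))
  markers.any? { |path| exists.call(path) }
end

# Returns the node name reported by `crm_node -n`, or nil inside a
# container, where the command is skipped to avoid the hang.
def pacemaker_node_name(containerized: in_container?,
                        runner: ->(cmd) { `#{cmd}`.strip })
  return nil if containerized

  runner.call('crm_node -n')
end

# Inside the fact file, the same guard might be wired up roughly as:
#
#   Facter.add('pacemaker_node_name') do
#     setcode do
#       Facter::Core::Execution.exec('crm_node -n') unless in_container?
#     end
#   end
```

Returning nil leaves the fact unset in a container, so nothing there should rely on pacemaker_node_name being present.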

Changed in puppet-pacemaker:
assignee: nobody → Michele Baldessari (michele)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-pacemaker (master)

Fix proposed to branch: master
Review: https://review.openstack.org/583458

Changed in puppet-pacemaker:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-pacemaker (master)

Reviewed: https://review.openstack.org/583458
Committed: https://git.openstack.org/cgit/openstack/puppet-pacemaker/commit/?id=966943a79e11eccac5ec0868836239c55e3845dc
Submitter: Zuul
Branch: master

commit 966943a79e11eccac5ec0868836239c55e3845dc
Author: Michele Baldessari <email address hidden>
Date: Wed Jul 18 07:44:00 2018 +0200

    Do not call crm_node -n when running in a container

    In lib/facter/pacemaker_node_name.rb we do the following:
    Facter.add('pacemaker_node_name') do
      setcode do
        Facter::Core::Execution.exec 'crm_node -n'
      end
    end

    This is problematic because starting with pacemaker 1.1.19 crm_node -n
    will trigger a newly-introduced CRM_OP_NODE_INFO query which will hang
    if the container runs 1.1.19 code and the cluster runs 1.1.18 code.

    Let's simply avoid running these queries when run inside a container.

    Tested this by deploying successfully an overcloud with pcmk-1.1.19 in
    containers and 1.1.18 on the host:
    [root@controller-0 ~]# rpm -q pacemaker
    pacemaker-1.1.18-11.el7_5.3.x86_64
    [root@controller-0 ~]# docker exec -it galera-bundle-docker-0 sh -c "rpm -q pacemaker"
    pacemaker-1.1.19-3.el7.x86_64

    For this to work we also need the following RA fix:
    https://github.com/ClusterLabs/resource-agents/pull/1173/

    Previously this would fail with crm_node -n just hanging.

    Change-Id: I9ae7df2f49f918507c5f98b2441b5b17423a38da
    Closes-Bug: #1782231
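
The version mismatch described in the commit message can be checked from the two `rpm -q pacemaker` outputs. The sketch below is illustrative only: the helper names and the regular expression are assumptions, and 1.1.19 is used as the threshold solely because the report names it as the release that introduced the CRM_OP_NODE_INFO query.

```ruby
require 'rubygems'

# First release that issues CRM_OP_NODE_INFO, according to the report.
NODE_INFO_SINCE = Gem::Version.new('1.1.19')

# Extracts the upstream version from an `rpm -q pacemaker` line such as
# "pacemaker-1.1.18-11.el7_5.3.x86_64" (hypothetical helper).
def pacemaker_version(rpm_output)
  match = rpm_output.strip.match(/\Apacemaker-(\d+(?:\.\d+)*)-/)
  raise ArgumentError, "unrecognized package string: #{rpm_output.inspect}" unless match

  Gem::Version.new(match[1])
end

# True when the container would send the new query to a cluster that
# predates it, which is the combination reported to hang.
def node_info_skew?(container_pkg, host_pkg)
  pacemaker_version(container_pkg) >= NODE_INFO_SINCE &&
    pacemaker_version(host_pkg) < NODE_INFO_SINCE
end
```

With the package strings from the test deployment above, node_info_skew?('pacemaker-1.1.19-3.el7.x86_64', 'pacemaker-1.1.18-11.el7_5.3.x86_64') returns true.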

Changed in puppet-pacemaker:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-pacemaker (stable/0.6.x)

Fix proposed to branch: stable/0.6.x
Review: https://review.openstack.org/583660

OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-pacemaker (stable/0.6.x)

Reviewed: https://review.openstack.org/583660
Committed: https://git.openstack.org/cgit/openstack/puppet-pacemaker/commit/?id=5a6dc500ae6f4c89466dad5436f341999f1ac38a
Submitter: Zuul
Branch: stable/0.6.x

commit 5a6dc500ae6f4c89466dad5436f341999f1ac38a
Author: Michele Baldessari <email address hidden>
Date: Wed Jul 18 07:44:00 2018 +0200

    Do not call crm_node -n when running in a container

    (cherry picked from commit 966943a79e11eccac5ec0868836239c55e3845dc)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-pacemaker 0.7.2

This issue was fixed in the openstack/puppet-pacemaker 0.7.2 release.
