puppet-pacemaker

crm_node -n needlessly called via facter

Bug #1782231 reported by Michele Baldessari on 2018-07-17

Affects           Status        Importance  Assigned to         Milestone
puppet-pacemaker  Fix Released  Undecided   Michele Baldessari

Bug Description

So in lib/facter/pacemaker_node_name.rb we do the following:
Facter.add('pacemaker_node_name') do
  setcode do
    Facter::Core::Execution.exec 'crm_node -n'
  end
end

This is problematic because, starting with pacemaker 1.1.19, crm_node -n triggers
a newly introduced CRM_OP_NODE_INFO query, which hangs if the container runs 1.1.19 code and the cluster runs 1.1.18 code.

Let's simply avoid running this query when the fact is evaluated inside a container.
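
For illustration only, here is a minimal sketch of how the fact could skip the query inside a container. It is not the merged patch: the marker files in CONTAINER_MARKERS and the helper names running_in_container? and pacemaker_node_name are assumptions made for this example, and the real fix may detect containers differently.

```ruby
# Hypothetical marker files that commonly exist inside containers.
# The detection used by the actual fix is not shown in this report.
CONTAINER_MARKERS = ['/.dockerenv', '/run/.containerenv'].freeze

# Returns true if any of the given marker paths exists.
def running_in_container?(markers = CONTAINER_MARKERS)
  markers.any? { |path| File.exist?(path) }
end

# Returns the node name reported by 'crm_node -n', or nil when running in a
# container, so the potentially hanging query is never issued there.
# The runner is injectable so the logic can be exercised without pacemaker.
def pacemaker_node_name(in_container: running_in_container?,
                        runner: ->(cmd) { `#{cmd}`.strip })
  return nil if in_container

  runner.call('crm_node -n')
end

# Register the fact only when Facter is loaded (i.e. when run by facter/puppet).
if defined?(Facter)
  Facter.add('pacemaker_node_name') do
    setcode do
      pacemaker_node_name(runner: ->(cmd) { Facter::Core::Execution.exec(cmd) })
    end
  end
end
```

With this shape the fact resolves to nil inside a container, so any manifest that consumes pacemaker_node_name there would need to tolerate an unset value.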

Michele Baldessari (michele) on 2018-07-17

Changed in puppet-pacemaker:
assignee:	nobody → Michele Baldessari (michele)

OpenStack Infra (hudson-openstack) wrote on 2018-07-18: Fix proposed to puppet-pacemaker (master)

Fix proposed to branch: master
Review: https://review.openstack.org/583458

Changed in puppet-pacemaker:
status:	New → In Progress

OpenStack Infra (hudson-openstack) wrote on 2018-07-18: Fix merged to puppet-pacemaker (master)

Reviewed: https://review.openstack.org/583458
Committed: https://git.openstack.org/cgit/openstack/puppet-pacemaker/commit/?id=966943a79e11eccac5ec0868836239c55e3845dc
Submitter: Zuul
Branch: master

commit 966943a79e11eccac5ec0868836239c55e3845dc
Author: Michele Baldessari <email address hidden>
Date: Wed Jul 18 07:44:00 2018 +0200

Do not call crm_node -n when running in a container

    In lib/facter/pacemaker_node_name.rb we do the following:
    Facter.add('pacemaker_node_name') do
      setcode do
        Facter::Core::Execution.exec 'crm_node -n'
      end
    end

    This is problematic because starting with pacemaker 1.1.19 crm_node -n
    will trigger a newly-introduced CRM_OP_NODE_INFO query which will hang
    if the container runs 1.1.19 code and the cluster runs 1.1.18 code.

    Let's simply avoid running these queries when run inside a container.

    Tested this by deploying successfully an overcloud with pcmk-1.1.19 in
    containers and 1.1.18 on the host:
    [root@controller-0 ~]# rpm -q pacemaker
    pacemaker-1.1.18-11.el7_5.3.x86_64
    [root@controller-0 ~]# docker exec -it galera-bundle-docker-0 sh -c "rpm -q pacemaker"
    pacemaker-1.1.19-3.el7.x86_64

    For this to work we also need the following RA fix:
    https://github.com/ClusterLabs/resource-agents/pull/1173/

    Previously this would fail with crm_node -n just hanging.

    Change-Id: I9ae7df2f49f918507c5f98b2441b5b17423a38da
    Closes-Bug: #1782231

Changed in puppet-pacemaker:
status:	In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2018-07-18: Fix proposed to puppet-pacemaker (stable/0.6.x)

Fix proposed to branch: stable/0.6.x
Review: https://review.openstack.org/583660

OpenStack Infra (hudson-openstack) wrote on 2018-07-18: Fix merged to puppet-pacemaker (stable/0.6.x)

Reviewed: https://review.openstack.org/583660
Committed: https://git.openstack.org/cgit/openstack/puppet-pacemaker/commit/?id=5a6dc500ae6f4c89466dad5436f341999f1ac38a
Submitter: Zuul
Branch: stable/0.6.x

commit 5a6dc500ae6f4c89466dad5436f341999f1ac38a
Author: Michele Baldessari <email address hidden>
Date: Wed Jul 18 07:44:00 2018 +0200

Do not call crm_node -n when running in a container

    In lib/facter/pacemaker_node_name.rb we do the following:
    Facter.add('pacemaker_node_name') do
      setcode do
        Facter::Core::Execution.exec 'crm_node -n'
      end
    end

    This is problematic because starting with pacemaker 1.1.19 crm_node -n
    will trigger a newly-introduced CRM_OP_NODE_INFO query which will hang
    if the container runs 1.1.19 code and the cluster runs 1.1.18 code.

    Let's simply avoid running these queries when run inside a container.

    For this to work we also need the following RA fix:
    https://github.com/ClusterLabs/resource-agents/pull/1173/

    Previously this would fail with crm_node -n just hanging.

    Change-Id: I9ae7df2f49f918507c5f98b2441b5b17423a38da
    Closes-Bug: #1782231
    (cherry picked from commit 966943a79e11eccac5ec0868836239c55e3845dc)

OpenStack Infra (hudson-openstack) wrote on 2019-03-24: Fix included in openstack/puppet-pacemaker 0.7.2

This issue was fixed in the openstack/puppet-pacemaker 0.7.2 release.
