'pacemaker-status' validation reports incorrect status

Bug #1940519 reported by Yadnesh Kulkarni
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Yadnesh Kulkarni
Milestone: (none)

Bug Description

Description
===========
The 'pacemaker-status' validation always reports PASSED, regardless of the actual status of the pacemaker service.

Steps to reproduce
==================
Stop the pacemaker service on a controller node, then run the validation:
~~~
[root@overcloud-controller-0 ~]# systemctl status pacemaker
● pacemaker.service - Pacemaker High Availability Cluster Manager
   Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Thu 2021-08-19 09:30:30 UTC; 4min 2s ago
     Docs: man:pacemakerd
           https://clusterlabs.org/pacemaker/doc/
  Process: 675053 ExecStart=/usr/sbin/pacemakerd (code=exited, status=0/SUCCESS)
 Main PID: 675053 (code=exited, status=0/SUCCESS)

(undercloud) [stack@undercloud ~]$ openstack tripleo validator run --validation pacemaker-status --inventory ./overcloud-deploy/overcloud/tripleo-ansible-inventory.yaml
+--------------------------------------+------------------+--------+------------+------------------------+-------------------+-------------+
| UUID | Validations | Status | Host_Group | Status_by_Host | Unreachable_Hosts | Duration |
+--------------------------------------+------------------+--------+------------+------------------------+-------------------+-------------+
| 31899bf0-3c48-4d8a-ba00-a4cb030b506f | pacemaker-status | PASSED | Controller | overcloud-controller-0 | | 0:00:00.634 |
+--------------------------------------+------------------+--------+------------+------------------------+-------------------+-------------+
(undercloud) [stack@undercloud ~]$
~~~

Expected result
===============
The validation should have been reported as FAILED, because the pacemaker service is inactive.

Actual result
=============
The validation was reported as PASSED, which wrongly implies that the pacemaker service is active.

Environment
===========
master release

Logs & Configs
==============
Ansible reports the service as inactive in the validation logs:
~~~
                            "delta": "0:00:00.020432",
                            "end": "2021-08-19 09:30:39.894421",
                            "invocation": {
                                "module_args": {
                                    "_raw_params": "/usr/bin/systemctl show pacemaker --property ActiveState",
                                    "_uses_shell": false,
                                    "argv": null,
                                    "chdir": null,
                                    "creates": null,
                                    "executable": null,
                                    "removes": null,
                                    "stdin": null,
                                    "stdin_add_newline": true,
                                    "strip_empty_ends": true,
                                    "warn": true
                                }
                            },
                            "ok": true,
                            "rc": 0,
                            "start": "2021-08-19 09:30:39.873989",
                            "stderr": "",
                            "stderr_lines": [],
                            "stdout": "ActiveState=inactive", <<<<<<<
                            "stdout_lines": [
                                "ActiveState=inactive"
                            ]
                        }
~~~

The role's tasks file [1] has no task that fails the playbook when the service is found inactive or failed.

[1] https://opendev.org/openstack/tripleo-validations/src/branch/master/roles/pacemaker_status/tasks/main.yml
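
For illustration, the missing check amounts to comparing the ActiveState value against "active". The following is only a minimal Python sketch of that logic, not the Ansible task used in the role or in the eventual fix; the function names `parse_active_state` and `validation_passes` are hypothetical.

```python
def parse_active_state(stdout: str) -> str:
    """Return the ActiveState value from `systemctl show ... --property ActiveState` output."""
    for line in stdout.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "ActiveState":
            return value
    raise ValueError("ActiveState property not found in output")


def validation_passes(stdout: str) -> bool:
    """Hypothetical pass/fail rule: pass only when the unit is reported as active."""
    return parse_active_state(stdout) == "active"


# The stdout captured in the validation log above.
print(validation_passes("ActiveState=inactive"))  # prints False
```

With a rule like this, any state other than "active" (for example "inactive" or "failed") would cause the validation to fail instead of pass.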

Changed in tripleo:
status: New → Triaged
importance: Undecided → High
Changed in tripleo:
assignee: nobody → Yadnesh Kulkarni (ykulkarn)
Sandeep Yadav (sandeepyadav93) wrote :

This looks like a legitimate issue to me:

[heat-admin@overcloud-controller-1 ~]$ sudo pcs status
Error: error running crm_mon, is pacemaker running?
  crm_mon: Error: cluster is not available on this node

Even after stopping pacemaker on one controller, the validation status still shows PASSED:

(undercloud) [zuul@undercloud ~]$ openstack tripleo validator run --validation pacemaker-status
Running Validations without Overcloud settings.
+--------------------------------------+------------------+--------+------------+------------------------------------------------------------------------+-------------------+-------------+
| UUID | Validations | Status | Host_Group | Status_by_Host | Unreachable_Hosts | Duration |
+--------------------------------------+------------------+--------+------------+------------------------------------------------------------------------+-------------------+-------------+
| 60d0b82a-8e1c-48ec-aa61-185cfd8feda6 | pacemaker-status | PASSED | Controller | overcloud-controller-0, overcloud-controller-1, overcloud-controller-2 | | 0:00:02.900 |
+--------------------------------------+------------------+--------+------------+------------------------------------------------------------------------+-------------------+-------------+

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-validations (master)
Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-validations (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-validations/+/805187
Committed: https://opendev.org/openstack/tripleo-validations/commit/2a7c43df4ae997d66f1fd9cf2e4e5acf32646e7b
Submitter: "Zuul (22348)"
Branch: master

commit 2a7c43df4ae997d66f1fd9cf2e4e5acf32646e7b
Author: Yadnesh Kulkarni <email address hidden>
Date: Thu Aug 19 16:33:28 2021 +0530

    Fail validation if pacemaker service is not active

    Pacemaker-status validation always reports success regardless
    of the actual status of the service.

    Add a task to fail validation when pacemaker service is found
    inactive/failed

    Closes-Bug: #1940519

    Signed-off-by: Yadnesh Kulkarni <email address hidden>
    Change-Id: Id8d99321c870aee70f0e177b1b633654a04c8402

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-validations (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/tripleo-validations/+/806903

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-validations (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/tripleo-validations/+/806904

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-validations (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/tripleo-validations/+/806905

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-validations (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/tripleo-validations/+/806906

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-validations (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/tripleo-validations/+/806904
Committed: https://opendev.org/openstack/tripleo-validations/commit/365c82fb38c5b6faa20a01d83cf753426a28fff6
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 365c82fb38c5b6faa20a01d83cf753426a28fff6
Author: Yadnesh Kulkarni <email address hidden>
Date: Thu Aug 19 16:33:28 2021 +0530

    Fail validation if pacemaker service is not active

    Pacemaker-status validation always reports success regardless
    of the actual status of the service.

    Add a task to fail validation when pacemaker service is found
    inactive/failed

    Closes-Bug: #1940519

    Signed-off-by: Yadnesh Kulkarni <email address hidden>
    Change-Id: Id8d99321c870aee70f0e177b1b633654a04c8402
    (cherry picked from commit 2a7c43df4ae997d66f1fd9cf2e4e5acf32646e7b)

tags: added: in-stable-victoria
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-validations (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/tripleo-validations/+/806905
Committed: https://opendev.org/openstack/tripleo-validations/commit/05a588149e950c6b225f4fbd81655968e0a0f975
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit 05a588149e950c6b225f4fbd81655968e0a0f975
Author: Yadnesh Kulkarni <email address hidden>
Date: Thu Aug 19 16:33:28 2021 +0530

    Fail validation if pacemaker service is not active

    Pacemaker-status validation always reports success regardless
    of the actual status of the service.

    Add a task to fail validation when pacemaker service is found
    inactive/failed

    Closes-Bug: #1940519

    Signed-off-by: Yadnesh Kulkarni <email address hidden>
    Change-Id: Id8d99321c870aee70f0e177b1b633654a04c8402
    (cherry picked from commit 2a7c43df4ae997d66f1fd9cf2e4e5acf32646e7b)

tags: added: in-stable-ussuri
tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-validations (stable/train)

Reviewed: https://review.opendev.org/c/openstack/tripleo-validations/+/806906
Committed: https://opendev.org/openstack/tripleo-validations/commit/d6f3a5c742c24b738a90ddeea023ec10af1bdf68
Submitter: "Zuul (22348)"
Branch: stable/train

commit d6f3a5c742c24b738a90ddeea023ec10af1bdf68
Author: Yadnesh Kulkarni <email address hidden>
Date: Thu Aug 19 16:33:28 2021 +0530

    Fail validation if pacemaker service is not active

    Pacemaker-status validation always reports success regardless
    of the actual status of the service.

    Add a task to fail validation when pacemaker service is found
    inactive/failed

    Closes-Bug: #1940519

    Signed-off-by: Yadnesh Kulkarni <email address hidden>
    Change-Id: Id8d99321c870aee70f0e177b1b633654a04c8402
    (cherry picked from commit 2a7c43df4ae997d66f1fd9cf2e4e5acf32646e7b)

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-validations (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/tripleo-validations/+/806903
Committed: https://opendev.org/openstack/tripleo-validations/commit/90bccaef3f582df64112059e7450a616abe6bce6
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 90bccaef3f582df64112059e7450a616abe6bce6
Author: Yadnesh Kulkarni <email address hidden>
Date: Thu Aug 19 16:33:28 2021 +0530

    Fail validation if pacemaker service is not active

    Pacemaker-status validation always reports success regardless
    of the actual status of the service.

    Add a task to fail validation when pacemaker service is found
    inactive/failed

    Closes-Bug: #1940519

    Signed-off-by: Yadnesh Kulkarni <email address hidden>
    Change-Id: Id8d99321c870aee70f0e177b1b633654a04c8402
    (cherry picked from commit 2a7c43df4ae997d66f1fd9cf2e4e5acf32646e7b)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-validations 15.1.0

This issue was fixed in the openstack/tripleo-validations 15.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-validations ussuri-eol

This issue was fixed in the openstack/tripleo-validations ussuri-eol release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-validations 13.5.0

This issue was fixed in the openstack/tripleo-validations 13.5.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-validations 14.3.0

This issue was fixed in the openstack/tripleo-validations 14.3.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-validations train-eol

This issue was fixed in the openstack/tripleo-validations train-eol release.
