During rollback of instance HA, the operation fails at the task 'Remove resources associated to remote nodes'.
The error is:
~~~
TASK [instance-ha : Remove resources associated to remote nodes] *******************************************************************************************************************
fatal: [undercloud-0 -> controller-2]: FAILED! =>{"changed": true,"cmd":"for resourceid in $(pcs resource show | grep compute | grep -v -e Stopped: -e Started: -e disabled -e remote | awk '{print $3}')\n do\npcs resource cleanup $resourceid\n pcs --force resource delete $resourceid\n done","delta": "0:00:03.667994","end": "2019-08-13 11:02:15.318707","failed": true,"msg": "non-zero return code","rc": 1,"start": "2019-08-13 11:02:11.650713","stderr":"Error: Unable to forget failed operations of resource: FAILED\nResource 'FAILED' not found\nError performing operation: No such device or address\nError: Resource 'FAILED' does not exist.\nError: Unable to forget failed operations of resource: FAILED\nResource 'FAILED' not found\nError performing operation: No such device or address\nError: Resource 'FAILED' does not exist.","stderr_lines": ["Error: Unable to forget failed operations of resource: FAILED","Resource 'FAILED' not found","Error performing operation: No such device or address","Error: Resource 'FAILED' does not exist.","Error: Unable to forget failed operations of resource: FAILED","Resource 'FAILED' not found","Error performing operation: No such device or address","Error: Resource 'FAILED' does not exist."],"stdout":"Cleaned up nova-compute-checkevacuate:0 on compute-1\nCleaned up nova-compute-checkevacuate:1 on compute-0\nCleaned up nova-compute-checkevacuate:2 on compute-1\nCleaned up nova-compute-checkevacuate:2 on compute-0\nCleaned up nova-compute-checkevacuate:2 on controller-2\nCleaned up nova-compute-checkevacuate:2 on controller-1\nCleaned up nova-compute-checkevacuate:2 on controller-0\nCleaned up nova-compute-checkevacuate:3 on compute-1\nCleaned up nova-compute-checkevacuate:3 on compute-0\nCleaned up nova-compute-checkevacuate:3 on controller-2\nCleaned up nova-compute-checkevacuate:3 on controller-1\nCleaned up nova-compute-checkevacuate:3 on controller-0\nCleaned up nova-compute-checkevacuate:4 on compute-1\nCleaned up 
nova-compute-checkevacuate:4 on compute-0\nCleaned up nova-compute-checkevacuate:4 on controller-2\nCleaned up nova-compute-checkevacuate:4 on controller-1\nCleaned up nova-compute-checkevacuate:4 on controller-0\nRemoving Constraint - location-nova-compute-checkevacuate-clone\nRemoving Constraint - order-nova-compute-checkevacuate-clone-nova-compute-clone-mandatory\nDeleting Resource - nova-compute-checkevacuate\nCleaned up nova-compute:0 on compute-1\nCleaned up nova-compute:1 on compute-0\nCleaned up nova-compute:2 on compute-1\nCleaned up nova-compute:2 on compute-0\nCleaned up nova-compute:2 on controller-2\nCleaned up nova-compute:2 on controller-1\nCleaned up nova-compute:2 on controller-0\nCleaned up nova-compute:3 on compute-1\nCleaned up nova-compute:3 on compute-0\nCleaned up nova-compute:3 on controller-2\nCleaned up nova-compute:3 on controller-1\nCleaned up nova-compute:3 on controller-0\nCleaned up nova-compute:4 on compute-1\nCleaned up nova-compute:4 on compute-0\nCleaned up nova-compute:4 on controller-2\nCleaned up nova-compute:4 on controller-1\nCleaned up nova-compute:4 on controller-0\nWaiting for 2 replies from the CRMd.. 
OK\nRemoving Constraint - location-nova-compute-clone\nRemoving Constraint - order-nova-compute-clone-nova-evacuate-mandatory\nDeleting Resource - nova-compute","stdout_lines": ["Cleaned up nova-compute-checkevacuate:0 on compute-1", "Cleaned up nova-compute-checkevacuate:1 on compute-0", "Cleaned up nova-compute-checkevacuate:2 on compute-1", "Cleaned up nova-compute-checkevacuate:2 on compute-0", "Cleaned up nova-compute-checkevacuate:2 on controller-2", "Cleaned up nova-compute-checkevacuate:2 on controller-1", "Cleaned up nova-compute-checkevacuate:2 on controller-0", "Cleaned up nova-compute-checkevacuate:3 on compute-1", "Cleaned up nova-compute-checkevacuate:3 on compute-0", "Cleaned up nova-compute-checkevacuate:3 on controller-2", "Cleaned up nova-compute-checkevacuate:3 on controller-1", "Cleaned up nova-compute-checkevacuate:3 on controller-0", "Cleaned up nova-compute-checkevacuate:4 on compute-1", "Cleaned up nova-compute-checkevacuate:4 on compute-0", "Cleaned up nova-compute-checkevacuate:4 on controller-2", "Cleaned up nova-compute-checkevacuate:4 on controller-1", "Cleaned up nova-compute-checkevacuate:4 on controller-0", "Removing Constraint - location-nova-compute-checkevacuate-clone", "Removing Constraint - order-nova-compute-checkevacuate-clone-nova-compute-clone-mandatory", "Deleting Resource - nova-compute-checkevacuate", "Cleaned up nova-compute:0 on compute-1", "Cleaned up nova-compute:1 on compute-0", "Cleaned up nova-compute:2 on compute-1", "Cleaned up nova-compute:2 on compute-0", "Cleaned up nova-compute:2 on controller-2", "Cleaned up nova-compute:2 on controller-1", "Cleaned up nova-compute:2 on controller-0", "Cleaned up nova-compute:3 on compute-1", "Cleaned up nova-compute:3 on compute-0", "Cleaned up nova-compute:3 on controller-2", "Cleaned up nova-compute:3 on controller-1", "Cleaned up nova-compute:3 on controller-0", "Cleaned up nova-compute:4 on compute-1", "Cleaned up nova-compute:4 on compute-0", "Cleaned up nova-compute:4 
on controller-2", "Cleaned up nova-compute:4 on controller-1", "Cleaned up nova-compute:4 on controller-0", "Waiting for 2 replies from the CRMd.. OK", "Removing Constraint - location-nova-compute-clone", "Removing Constraint - order-nova-compute-clone-nova-evacuate-mandatory", "Deleting Resource - nova-compute"]}
~~~
The current task is defined as
~~~
- name: Remove resources associated to remote nodes
  shell: |
    for resourceid in $(pcs resource show | grep compute | grep -v -e Stopped: -e Started: -e disabled -e remote | awk '{print $3}')
    do
      pcs resource cleanup $resourceid
      pcs --force resource delete $resourceid
    done
~~~
However, the filter in the grep command is insufficient. It can also match the per-node lines of resources that are still starting, and for those lines awk '{print $3}' returns the word 'Starting' instead of a resource ID.
The output of pcs resource show when the issue happens is below.
~~~
" ip-10.0.0.109\t(ocf::heartbeat:IPaddr2):\tStarted controller-2",
" ip-172.17.3.14\t(ocf::heartbeat:IPaddr2):\tStarted controller-0",
" Clone Set: haproxy-clone [haproxy]",
" Started: [ controller-0 controller-1 controller-2 ]",
" Stopped: [ compute-0 compute-1 ]",
" Master/Slave Set: galera-master [galera]",
" Masters: [ controller-0 controller-1 controller-2 ]",
" Stopped: [ compute-0 compute-1 ]",
" ip-192.168.24.9\t(ocf::heartbeat:IPaddr2):\tStarted controller-1",
" ip-172.17.4.12\t(ocf::heartbeat:IPaddr2):\tStarted controller-2",
" Clone Set: rabbitmq-clone [rabbitmq]",
" Started: [ controller-0 controller-1 controller-2 ]",
" Stopped: [ compute-0 compute-1 ]",
" Master/Slave Set: redis-master [redis]",
" Masters: [ controller-1 ]",
" Slaves: [ controller-0 controller-2 ]",
" Stopped: [ compute-0 compute-1 ]",
" ip-172.17.1.11\t(ocf::heartbeat:IPaddr2):\tStarted controller-0",
" ip-172.17.1.18\t(ocf::heartbeat:IPaddr2):\tStarted controller-1",
" openstack-cinder-volume\t(systemd:openstack-cinder-volume):\tStarted controller-2",
" nova-evacuate\t(ocf::openstack:NovaEvacuate):\tStopped",
" Clone Set: nova-compute-checkevacuate-clone [nova-compute-checkevacuate]",
" Started: [ compute-0 compute-1 ]",
" Stopped: [ controller-0 controller-1 controller-2 ]",
" Clone Set: nova-compute-clone [nova-compute]",
" nova-compute\t(systemd:openstack-nova-compute):\tStarting compute-1",
" nova-compute\t(systemd:openstack-nova-compute):\tStarting compute-0",
" Stopped: [ controller-0 controller-1 controller-2 ]",
" compute-1\t(ocf::pacemaker:remote):\tStarted controller-0",
" compute-0\t(ocf::pacemaker:remote):\tStarted controller-1"
~~~
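To see why the loop ends up with a status word, the existing pipeline can be run against one of the per-node lines above. This is a minimal sketch: the line is copied from the output above with tabs replaced by spaces, and `old_filter` is a hypothetical name given here to the pipeline from the task.

```shell
# One per-node line from the `pcs resource show` output (tabs replaced by spaces).
line='     nova-compute (systemd:openstack-nova-compute): Starting compute-1'

# Hypothetical wrapper around the filter used by the current task.
old_filter() {
    grep compute | grep -v -e Stopped: -e Started: -e disabled -e remote | awk '{print $3}'
}

# 'Starting' is not excluded by 'Started:', so the line passes the filter
# and the third whitespace-separated field is the state, not a resource ID.
resourceid=$(printf '%s\n' "$line" | old_filter)
echo "$resourceid"
```

The value printed is the state column, which the loop would then hand to pcs resource cleanup and pcs resource delete as if it were a resource ID.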
In this task, the target resources are the 'Clone Set' entries.
So it looks like we should change the filter to match only the 'Clone Set' lines.
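One way such a filter could look is sketched below. It is an illustration under assumptions, not necessarily what the proposed fix does: `list_compute_clones` is a hypothetical helper name, and the sample text is a subset of the output above with tabs replaced by spaces.

```shell
# Subset of the `pcs resource show` output from this report.
sample_output=' Clone Set: nova-compute-checkevacuate-clone [nova-compute-checkevacuate]
     Started: [ compute-0 compute-1 ]
     Stopped: [ controller-0 controller-1 controller-2 ]
 Clone Set: nova-compute-clone [nova-compute]
     nova-compute (systemd:openstack-nova-compute): Starting compute-1
     nova-compute (systemd:openstack-nova-compute): Starting compute-0
     Stopped: [ controller-0 controller-1 controller-2 ]
 compute-1 (ocf::pacemaker:remote): Started controller-0'

# Hypothetical helper: keep only 'Clone Set:' header lines that mention
# compute, then print the clone ID (third whitespace-separated field).
list_compute_clones() {
    grep 'Clone Set:' | grep compute | awk '{print $3}'
}

clones=$(printf '%s\n' "$sample_output" | list_compute_clones)
printf '%s\n' "$clones"
```

Because only the 'Clone Set:' header lines are selected, per-node lines in any state never reach awk, so no state word can be passed to pcs as a resource ID.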
Fix proposed to branch: master
Review: https://review.opendev.org/676092