I am able to reproduce a similar issue with the following bundle: https://paste.ubuntu.com/p/VJ3m7nMN79/
Resource created with
sudo pcs resource create test2 ocf:pacemaker:Dummy op_sleep=10 op monitor interval=30s timeout=30s op start timeout=30s op stop timeout=30s
juju ssh nova-cloud-controller/2 "sudo pcs constraint location test2 prefers juju-acda3d-pacemaker-remote-10.cloud.sts"
juju ssh nova-cloud-controller/2 "sudo pcs constraint location test2 prefers juju-acda3d-pacemaker-remote-11.cloud.sts"
juju ssh nova-cloud-controller/2 "sudo pcs constraint location test2 prefers juju-acda3d-pacemaker-remote-12.cloud.sts"
Online: [ juju-acda3d-pacemaker-remote-7 juju-acda3d-pacemaker-remote-8 juju-acda3d-pacemaker-remote-9 ]
RemoteOnline: [ juju-acda3d-pacemaker-remote-10.cloud.sts juju-acda3d-pacemaker-remote-11.cloud.sts juju-acda3d-pacemaker-remote-12.cloud.sts ]
Full list of resources:
Resource Group: grp_nova_vips
res_nova_bf9661e_vip (ocf::heartbeat:IPaddr2): Started juju-acda3d-pacemaker-remote-7
Clone Set: cl_nova_haproxy [res_nova_haproxy]
Started: [ juju-acda3d-pacemaker-remote-7 juju-acda3d-pacemaker-remote-8 juju-acda3d-pacemaker-remote-9 ]
juju-acda3d-pacemaker-remote-10.cloud.sts (ocf::pacemaker:remote): Started juju-acda3d-pacemaker-remote-8
juju-acda3d-pacemaker-remote-12.cloud.sts (ocf::pacemaker:remote): Started juju-acda3d-pacemaker-remote-8
juju-acda3d-pacemaker-remote-11.cloud.sts (ocf::pacemaker:remote): Started juju-acda3d-pacemaker-remote-7
test2 (ocf::pacemaker:Dummy): Started juju-acda3d-pacemaker-remote-10.cloud.sts
## After running the following commands on juju-acda3d-pacemaker-remote-10.cloud.sts
1) sudo systemctl stop pacemaker_remote
2) Forcefully shut down the instance (openstack server stop xxxx) less than 10 seconds after the pacemaker_remote stop command is issued.
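The timing of the two steps above can be sketched as a small bash helper. This is only an illustration, not the exact procedure I ran: the `run` and `reproduce` functions, the `DRY_RUN` variable, and the use of plain `ssh` to reach the remote node are assumptions, and the host and server arguments are placeholders. By default it only prints the commands it would run.

```shell
# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# reproduce REMOTE_HOST SERVER [DELAY]
#   REMOTE_HOST: assumed to be an address ssh can reach (placeholder)
#   SERVER:      OpenStack server name or ID (placeholder)
#   DELAY:       seconds to wait before the forced stop; must be below 10
reproduce() {
    local remote_host="$1" server="$2" delay="${3:-5}"

    if [ "$delay" -ge 10 ]; then
        echo "delay must be less than 10 seconds" >&2
        return 1
    fi

    # Step 1: start stopping pacemaker_remote in the background so that
    # the stop is still in progress when the instance is powered off.
    run ssh "$remote_host" sudo systemctl stop pacemaker_remote &

    # Step 2: power off the instance before the 10 seconds have elapsed.
    run sleep "$delay"
    run openstack server stop "$server"

    wait
}
```

With `DRY_RUN=0` the commands are actually executed, so the OpenStack credentials must already be loaded in the environment.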
The remote is shut down:
RemoteOFFLINE: [ juju-acda3d-pacemaker-remote-10.cloud.sts ]
The resource remains Stopped as seen from all 3 machines and does not recover.
$ juju run --application nova-cloud-controller "sudo pcs resource show | grep -i test2"
- Stdout: " test2\t(ocf::pacemaker:Dummy):\tStopped\n"
  UnitId: nova-cloud-controller/0
- Stdout: " test2\t(ocf::pacemaker:Dummy):\tStopped\n"
  UnitId: nova-cloud-controller/1
- Stdout: " test2\t(ocf::pacemaker:Dummy):\tStopped\n"
  UnitId: nova-cloud-controller/2
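To watch whether the resource ever recovers, rather than checking once, something like the following bash sketch could be run on one of the cluster nodes. The function names and the `STATUS_CMD` override are hypothetical helpers for illustration; the only real command it calls is `sudo pcs resource show`, as used above, and it assumes the state is the third whitespace-separated column of that output.

```shell
# Print the state column (e.g. Started, Stopped) of one resource from
# `pcs resource show`-style lines read on stdin.
resource_state() {
    awk -v name="$1" '$1 == name { print $3; exit }'
}

# wait_for_started NAME [ATTEMPTS] [INTERVAL]
# Poll until NAME is reported as Started or the attempts run out.
# STATUS_CMD (hypothetical) can replace the status command, e.g. for testing.
wait_for_started() {
    local name="$1" attempts="${2:-12}" interval="${3:-10}" i=1 state
    while [ "$i" -le "$attempts" ]; do
        state=$(${STATUS_CMD:-sudo pcs resource show} | resource_state "$name")
        echo "attempt $i: $name is ${state:-unknown}"
        [ "$state" = "Started" ] && return 0
        sleep "$interval"
        i=$((i + 1))
    done
    return 1
}
```

For example, `wait_for_started test2 12 10` polls for about two minutes and returns non-zero if test2 never reaches Started, which is what I would expect to see in the failing case.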
However, if I do a clean shutdown (without interrupting the pacemaker_remote fence), the resource is migrated correctly to another node.
6 nodes configured
9 resources configured
Online: [ juju-acda3d-pacemaker-remote-7 juju-acda3d-pacemaker-remote-8 juju-acda3d-pacemaker-remote-9 ]
RemoteOnline: [ juju-acda3d-pacemaker-remote-11.cloud.sts juju-acda3d-pacemaker-remote-12.cloud.sts ]
RemoteOFFLINE: [ juju-acda3d-pacemaker-remote-10.cloud.sts ]
Full list of resources:
[...]
test2 (ocf::pacemaker:Dummy): Started juju-acda3d-pacemaker-remote-12.cloud.sts
I will keep investigating this behavior to determine whether it is linked to the reported bug.