Comment 13 for bug 1900016

Lucas Kanashiro (lucaskanashiro) wrote : Re: pgsql resource agent uses regexes for old crm_mon format, breaks pgsql-status and pgsql-data-status attributes

Hi Jason,

Thanks for providing your configuration; it was helpful. I spent yesterday trying to trigger this bug but, for some reason, I was not able to. I set up a 2-node Focal cluster, configured PostgreSQL with streaming replication, and swapped the master and slave several times, and it worked for me. FWIW, this is my cluster configuration:

node 1: focal01 \
 attributes pgsql-data-status="STREAMING|SYNC"
node 2: focal02 \
 attributes pgsql-data-status=LATEST
primitive postgresql pgsql \
 params pgctl="/usr/lib/postgresql/12/bin/pg_ctl" pgdata="/var/lib/postgresql/12/main" psql="/usr/bin/psql" config="/etc/postgresql/12/main/postgresql.conf" rep_mode=sync master_ip=192.168.3.3 repuser=replicator restart_on_promote=true check_wal_receiver=true node_list="focal01 focal02" \
 op monitor timeout=30 interval=2
primitive vip_public IPaddr2 \
 params ip=192.168.3.4 cidr_netmask=24 \
 op monitor interval=10s \
 meta target-role=Started
primitive vip_replica IPaddr2 \
 params ip=192.168.3.3 cidr_netmask=24 \
 op monitor interval=10s \
 meta target-role=Started
ms master_postgresql postgresql \
 meta notify=true master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 target-role=Started
location cli-prefer-vip_public vip_public role=Started inf: focal01
order order-vip_replica-psql_master-vip_public inf: vip_replica:start master_postgresql:promote vip_public:start
colocation psql_master_and_vips inf: master_postgresql vip_public vip_replica
property cib-bootstrap-options: \
 have-watchdog=false \
 dc-version=2.0.3-4b1f869f0f \
 cluster-infrastructure=corosync \
 cluster-name=cluster01 \
 stonith-enabled=false \
 last-lrm-refresh=1603195934
rsc_defaults rsc-options: \
 resource-stickiness=100

And here you can see some of the commands I ran and their respective output:

https://pastebin.ubuntu.com/p/ZBBByznQfT/

You can see the 'error' value for 'postgresql-receiver-status' in the XML output, but it should not affect the reproducibility of this bug; the 'pgsql-data-status' attribute is actually there. During this process I found an unrelated issue with the PostgreSQL resource and filed this bug:

https://bugs.launchpad.net/ubuntu/+source/resource-agents/+bug/1900613

However, after checking the code I can see your point, and it does indeed seem buggy; I do not know why my attempt did not trigger it. I'd rather not spend much more time on it. I'd like to use your configuration as the test case for the SRU and, if possible, ask you to do the validation work when the SRU team requests it. Is that OK with you? Maybe you could try what I did (in the pastebin link); I think that would be good enough for our test case.