Looking at the deployment log, it seems the ovn-dbs-bundle resource fails to
start, and pacemaker does not start the VIP resource because of a location
constraint.
```
Full List of Resources:
  * ip-192.168.24.3 (ocf:heartbeat:IPaddr2): Stopped
  * Container bundle: haproxy-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-haproxy:pcmklatest]:
    * haproxy-bundle-podman-0 (ocf:heartbeat:podman): Started standalone
  * Container bundle: galera-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-mariadb:pcmklatest]:
    * galera-bundle-0 (ocf:heartbeat:galera): Promoted standalone
  * Container bundle: rabbitmq-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-rabbitmq:pcmklatest]:
    * rabbitmq-bundle-0 (ocf:heartbeat:rabbitmq-cluster): Started standalone
  * Container bundle: ovn-dbs-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-ovn-northd:pcmklatest]:
    * ovn-dbs-bundle-0 (ocf:ovn:ovndb-servers): Unpromoted standalone
Failed Resource Actions:
* ovndb_servers promote on ovn-dbs-bundle-0 could not be executed (Timed Out: Resource agent did not complete within 2m) at Tue Jun 21 06:41:09 2022 after 2m1ms
```
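The resource list above can be triaged mechanically. The following is a minimal sketch, not part of the original report: the helper name `unhealthy_resources` and the choice to treat only `Started` and `Promoted` as healthy states are assumptions made for illustration. It picks out the resource lines from the `pcs status` text and reports the ones in any other state.

```python
import re

# Sample taken from the pcs status output quoted in this report.
PCS_STATUS = """\
Full List of Resources:
* ip-192.168.24.3 (ocf:heartbeat:IPaddr2): Stopped
* Container bundle: haproxy-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-haproxy:pcmklatest]:
* haproxy-bundle-podman-0 (ocf:heartbeat:podman): Started standalone
* Container bundle: galera-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-mariadb:pcmklatest]:
* galera-bundle-0 (ocf:heartbeat:galera): Promoted standalone
* Container bundle: rabbitmq-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-rabbitmq:pcmklatest]:
* rabbitmq-bundle-0 (ocf:heartbeat:rabbitmq-cluster): Started standalone
* Container bundle: ovn-dbs-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-ovn-northd:pcmklatest]:
* ovn-dbs-bundle-0 (ocf:ovn:ovndb-servers): Unpromoted standalone
Failed Resource Actions:
* ovndb_servers promote on ovn-dbs-bundle-0 could not be executed (Timed Out: Resource agent did not complete within 2m) at Tue Jun 21 06:41:09 2022 after 2m1ms
"""

# Matches lines of the form "* <name> (<agent>): <state> [node]".
# Bundle header lines and failed-action lines do not match this shape.
RESOURCE_RE = re.compile(
    r"^\s*\*\s+(?P<name>\S+)\s+\((?P<agent>[^)]+)\):\s+(?P<state>\w+)"
)


def unhealthy_resources(text, healthy=("Started", "Promoted")):
    """Return (name, state) pairs for resources whose state is not healthy.

    The set of healthy states is an assumption for this sketch.
    """
    found = []
    for line in text.splitlines():
        match = RESOURCE_RE.match(line)
        if match and match.group("state") not in healthy:
            found.append((match.group("name"), match.group("state")))
    return found


if __name__ == "__main__":
    for name, state in unhealthy_resources(PCS_STATUS):
        print(f"{name}: {state}")
```

On this sample the sketch flags the VIP (`Stopped`) and `ovn-dbs-bundle-0` (`Unpromoted`), which matches the analysis above.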
Looking at the journal log, it seems the ovn-nbctl command crashes with a core dump.
```
Jun 21 06:41:08 standalone.localdomain kernel: traps: ovn-nbctl[212704] trap invalid opcode ip:55d658f09ba8 sp:7ffcdc0e3140 error:0 in ovn-nbctl[55d658f05000+5c000]
Jun 21 06:41:08 standalone.localdomain systemd[1]: Started Process Core Dump (PID 212705/UID 0).
Jun 21 06:41:08 standalone.localdomain systemd-coredump[212707]: Process 212704 (ovn-nbctl) of user 0 dumped core.
Module /usr/bin/ovn-nbctl with build-id 2798d30ce0833d6e0fcabb6d8a0a98cba4da707d
Module linux-vdso.so.1 with build-id 932e8861e1b4a3fa34f93ff803210fc441bcd188
Module libnghttp2.so.14 with build-id 7eadbd56a0e5bcd3d8a6b39b9bab2327e380283a
Module libpython3.9.so.1.0 with build-id bbe909b82db5ae1835b0022275d690951734a378
Module libevent-2.1.so.7 with build-id af406c254338ff6ceff47360cba92cdcf233cf14
Module libprotobuf-c.so.1 with build-id 46661ae5d66cbaa2aa82b1b765472bdfa4712a24
Module ld-linux-x86-64.so.2 with build-id 1d95aae3e4174446d3b885ad234d4f7e573e71db
Module libz.so.1 with build-id 25486226566596e403da5485fb0ec85deed6b9fa
Module libc.so.6 with build-id 14830f7e71953d5f0dac317543ac1e3fcdd874f5
Module libunbound.so.8 with build-id def32d1bb7a7d99c59bf62e00c628af0246afa91
Module libm.so.6 with build-id 3eb525d2e163793ef2e888d5bb46e104d11a3201
Module libcap-ng.so.0 with build-id fdca0a301667e15db99d726152b57feeb35e4dbe
Module libcrypto.so.3 with build-id ea50b2486363fd2ce58686de4fe12956a9fa4622
Module libssl.so.3 with build-id 6a3692862938d5df4111a2474b84f3ee9124f941
Stack trace of thread 4928:
#0 0x000055d658f09ba8 n/a (/usr/bin/ovn-nbctl + 0x16ba8)
ELF object binary architecture: AMD x86-64
```
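To make the kernel trap line easier to read, here is a small sketch that parses it and computes where the faulting instruction pointer falls inside the reported `ovn-nbctl` mapping. The function name `parse_trap` and the regular expression are illustrative assumptions based only on the single line quoted in this report, not a general parser for kernel trap messages.

```python
import re

# Kernel trap line copied from the journal excerpt in this report.
TRAP_LINE = (
    "Jun 21 06:41:08 standalone.localdomain kernel: traps: ovn-nbctl[212704] "
    "trap invalid opcode ip:55d658f09ba8 sp:7ffcdc0e3140 error:0 "
    "in ovn-nbctl[55d658f05000+5c000]"
)

# Pattern written against the line above; other trap formats are not handled.
TRAP_RE = re.compile(
    r"traps: (?P<comm>\S+)\[(?P<pid>\d+)\] trap (?P<trap>.+?) "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<error>[0-9a-f]+)"
    r" in (?P<module>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_trap(line):
    """Parse a kernel 'traps:' line; return a dict, or None if it does not match."""
    match = TRAP_RE.search(line)
    if match is None:
        return None
    ip = int(match["ip"], 16)
    base = int(match["base"], 16)
    size = int(match["size"], 16)
    return {
        "comm": match["comm"],
        "pid": int(match["pid"]),
        "trap": match["trap"],
        "module": match["module"],
        # Offset of the faulting instruction from the start of the mapping.
        "offset": ip - base,
        # True when the instruction pointer lies within [base, base + size).
        "inside_mapping": base <= ip < base + size,
    }


if __name__ == "__main__":
    info = parse_trap(TRAP_LINE)
    print(info["comm"], info["trap"], hex(info["offset"]), info["inside_mapping"])
```

For the line above, the offset relative to the mapping is 0x4ba8 and lies inside the mapping. The coredump stack trace reports `/usr/bin/ovn-nbctl + 0x16ba8` instead; the two differ by 0x12000, presumably because the kernel message is relative to the start of the mapped segment rather than the start of the file.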
Steps to reproduce
==================
* Deploy standalone with ml2+ovn enabled
Expected result
===============
* Deployment should succeed without any error
Actual result
=============
* Deployment fails because vip is not started
Environment
===========
* The problem is observed only in master so far
Description
===========
The puppet-glance-tripleo-standalone job started to fail consistently.
Example: https://zuul.opendev.org/t/openstack/build/4757380fddac4d59a02f778887727c0e
The pcs status output from the deployment log is available at:
https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_475/846784/8/check/puppet-glance-tripleo-standalone/4757380/logs/undercloud/var/log/extra/pcs.txt
Logs & Configs
==============
See https://zuul.opendev.org/t/openstack/build/4757380fddac4d59a02f778887727c0e