Activity log for bug #1979276

Date Who What changed Old value New value Message
2022-06-21 07:29:14 Takashi Kajinami bug added bug
2022-06-21 07:36:03 Takashi Kajinami description

Old value:

Description
===========
The puppet-glance-tripleo-standalone job started to fail consistently.

Example: https://zuul.opendev.org/t/openstack/build/4757380fddac4d59a02f778887727c0e

Looking at the deployment log, it seems ovn-dbs-bundle resource fails to start and pacemaker does not start the vip resource because of location constraint.

https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_475/846784/8/check/puppet-glance-tripleo-standalone/4757380/logs/undercloud/var/log/extra/pcs.txt

```
Full List of Resources:
  * ip-192.168.24.3 (ocf:heartbeat:IPaddr2): Stopped
  * Container bundle: haproxy-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-haproxy:pcmklatest]:
    * haproxy-bundle-podman-0 (ocf:heartbeat:podman): Started standalone
  * Container bundle: galera-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-mariadb:pcmklatest]:
    * galera-bundle-0 (ocf:heartbeat:galera): Promoted standalone
  * Container bundle: rabbitmq-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-rabbitmq:pcmklatest]:
    * rabbitmq-bundle-0 (ocf:heartbeat:rabbitmq-cluster): Started standalone
  * Container bundle: ovn-dbs-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-ovn-northd:pcmklatest]:
    * ovn-dbs-bundle-0 (ocf:ovn:ovndb-servers): Unpromoted standalone

Failed Resource Actions:
  * ovndb_servers promote on ovn-dbs-bundle-0 could not be executed (Timed Out: Resource agent did not complete within 2m) at Tue Jun 21 06:41:09 2022 after 2m1ms
```

Looking at journal log, it seems ovn-nbctl command crashes with core dump.

```
Jun 21 06:41:08 standalone.localdomain kernel: traps: ovn-nbctl[212704] trap invalid opcode ip:55d658f09ba8 sp:7ffcdc0e3140 error:0 in ovn-nbctl[55d658f05000+5c000]
Jun 21 06:41:08 standalone.localdomain systemd[1]: Started Process Core Dump (PID 212705/UID 0).
Jun 21 06:41:08 standalone.localdomain systemd-coredump[212707]: Process 212704 (ovn-nbctl) of user 0 dumped core.
    Module /usr/bin/ovn-nbctl with build-id 2798d30ce0833d6e0fcabb6d8a0a98cba4da707d
    Module linux-vdso.so.1 with build-id 932e8861e1b4a3fa34f93ff803210fc441bcd188
    Module libnghttp2.so.14 with build-id 7eadbd56a0e5bcd3d8a6b39b9bab2327e380283a
    Module libpython3.9.so.1.0 with build-id bbe909b82db5ae1835b0022275d690951734a378
    Module libevent-2.1.so.7 with build-id af406c254338ff6ceff47360cba92cdcf233cf14
    Module libprotobuf-c.so.1 with build-id 46661ae5d66cbaa2aa82b1b765472bdfa4712a24
    Module ld-linux-x86-64.so.2 with build-id 1d95aae3e4174446d3b885ad234d4f7e573e71db
    Module libz.so.1 with build-id 25486226566596e403da5485fb0ec85deed6b9fa
    Module libc.so.6 with build-id 14830f7e71953d5f0dac317543ac1e3fcdd874f5
    Module libunbound.so.8 with build-id def32d1bb7a7d99c59bf62e00c628af0246afa91
    Module libm.so.6 with build-id 3eb525d2e163793ef2e888d5bb46e104d11a3201
    Module libcap-ng.so.0 with build-id fdca0a301667e15db99d726152b57feeb35e4dbe
    Module libcrypto.so.3 with build-id ea50b2486363fd2ce58686de4fe12956a9fa4622
    Module libssl.so.3 with build-id 6a3692862938d5df4111a2474b84f3ee9124f941
    Stack trace of thread 4928:
    #0 0x000055d658f09ba8 n/a (/usr/bin/ovn-nbctl + 0x16ba8)
    ELF object binary architecture: AMD x86-64
```

Steps to reproduce
==================
* Deploy standalone with ml2+ovn enabled

Expected result
===============
* Deployment should succeed without any error

Actual result
=============
* Deployment fails because vip is not started

Environment
===========
* The problem is observed only in master so far

Logs & Configs
==============
See https://zuul.opendev.org/t/openstack/build/4757380fddac4d59a02f778887727c0e

New value:

Description
===========
The puppet-glance-tripleo-standalone job started failing consistently.

Example: https://zuul.opendev.org/t/openstack/build/4757380fddac4d59a02f778887727c0e

Looking at the deployment log, it seems ovn-dbs-bundle resource fails to start and pacemaker does not start the vip resource because of location constraint.

https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_475/846784/8/check/puppet-glance-tripleo-standalone/4757380/logs/undercloud/var/log/extra/pcs.txt

```
Full List of Resources:
  * ip-192.168.24.3 (ocf:heartbeat:IPaddr2): Stopped
  * Container bundle: haproxy-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-haproxy:pcmklatest]:
    * haproxy-bundle-podman-0 (ocf:heartbeat:podman): Started standalone
  * Container bundle: galera-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-mariadb:pcmklatest]:
    * galera-bundle-0 (ocf:heartbeat:galera): Promoted standalone
  * Container bundle: rabbitmq-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-rabbitmq:pcmklatest]:
    * rabbitmq-bundle-0 (ocf:heartbeat:rabbitmq-cluster): Started standalone
  * Container bundle: ovn-dbs-bundle [127.0.0.1:5001/tripleomastercentos9/openstack-ovn-northd:pcmklatest]:
    * ovn-dbs-bundle-0 (ocf:ovn:ovndb-servers): Unpromoted standalone

Failed Resource Actions:
  * ovndb_servers promote on ovn-dbs-bundle-0 could not be executed (Timed Out: Resource agent did not complete within 2m) at Tue Jun 21 06:41:09 2022 after 2m1ms
```

Looking at journal log, it seems ovn-nbctl command crashes with core dump.

```
Jun 21 06:41:08 standalone.localdomain kernel: traps: ovn-nbctl[212704] trap invalid opcode ip:55d658f09ba8 sp:7ffcdc0e3140 error:0 in ovn-nbctl[55d658f05000+5c000]
Jun 21 06:41:08 standalone.localdomain systemd[1]: Started Process Core Dump (PID 212705/UID 0).
Jun 21 06:41:08 standalone.localdomain systemd-coredump[212707]: Process 212704 (ovn-nbctl) of user 0 dumped core.
    Module /usr/bin/ovn-nbctl with build-id 2798d30ce0833d6e0fcabb6d8a0a98cba4da707d
    Module linux-vdso.so.1 with build-id 932e8861e1b4a3fa34f93ff803210fc441bcd188
    Module libnghttp2.so.14 with build-id 7eadbd56a0e5bcd3d8a6b39b9bab2327e380283a
    Module libpython3.9.so.1.0 with build-id bbe909b82db5ae1835b0022275d690951734a378
    Module libevent-2.1.so.7 with build-id af406c254338ff6ceff47360cba92cdcf233cf14
    Module libprotobuf-c.so.1 with build-id 46661ae5d66cbaa2aa82b1b765472bdfa4712a24
    Module ld-linux-x86-64.so.2 with build-id 1d95aae3e4174446d3b885ad234d4f7e573e71db
    Module libz.so.1 with build-id 25486226566596e403da5485fb0ec85deed6b9fa
    Module libc.so.6 with build-id 14830f7e71953d5f0dac317543ac1e3fcdd874f5
    Module libunbound.so.8 with build-id def32d1bb7a7d99c59bf62e00c628af0246afa91
    Module libm.so.6 with build-id 3eb525d2e163793ef2e888d5bb46e104d11a3201
    Module libcap-ng.so.0 with build-id fdca0a301667e15db99d726152b57feeb35e4dbe
    Module libcrypto.so.3 with build-id ea50b2486363fd2ce58686de4fe12956a9fa4622
    Module libssl.so.3 with build-id 6a3692862938d5df4111a2474b84f3ee9124f941
    Stack trace of thread 4928:
    #0 0x000055d658f09ba8 n/a (/usr/bin/ovn-nbctl + 0x16ba8)
    ELF object binary architecture: AMD x86-64
```

Steps to reproduce
==================
* Deploy standalone with ml2+ovn enabled

Expected result
===============
* Deployment should succeed without any error

Actual result
=============
* Deployment fails because vip is not started

Environment
===========
* The problem is observed only in master so far

Logs & Configs
==============
See https://zuul.opendev.org/t/openstack/build/4757380fddac4d59a02f778887727c0e
2022-06-21 09:18:51 yatin bug added subscriber yatin
2022-06-21 11:39:18 Arx Cruz tripleo: status New Triaged
2022-06-21 11:39:20 Arx Cruz tripleo: importance Undecided Critical
2022-06-21 11:39:45 Arx Cruz tags alert ci promotion-blocker
2022-06-23 12:00:30 yatin bug watch added https://bugzilla.redhat.com/show_bug.cgi?id=2100393
2022-06-23 14:42:26 Dr. Jens Harbott bug added subscriber Dr. Jens Harbott
2022-06-27 14:41:39 yatin tripleo: milestone yoga-1
2023-02-14 15:17:24 Alan Pevec tripleo: status Triaged Invalid
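
The kernel `traps:` line quoted in the bug description carries the faulting instruction pointer and the start and size of the memory mapping it fell in. As a rough illustration of how such a line can be picked apart when triaging similar crashes, here is a minimal Python sketch. The regular expression and the `parse_trap` helper are hypothetical names written for this note, not part of any TripleO or OVN tooling, and the pattern is only assumed to match the line format shown in this report.

```python
import re

# Assumed layout, taken from the journal line in this report:
#   traps: <comm>[<pid>] trap <reason> ip:<hex> sp:<hex> error:<hex> in <module>[<start>+<size>]
TRAP_RE = re.compile(
    r"traps: (?P<comm>[^\[\s]+)\[(?P<pid>\d+)\] trap (?P<reason>.+?) "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<error>[0-9a-f]+) "
    r"in (?P<module>[^\[]+)\[(?P<start>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_trap(line):
    """Return the fields of a kernel 'traps:' line, or None if it does not match."""
    m = TRAP_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    start = int(m["start"], 16)
    return {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "reason": m["reason"],
        "module": m["module"],
        # Offset of the faulting instruction inside the reported mapping.
        "offset_in_mapping": ip - start,
    }


line = (
    "Jun 21 06:41:08 standalone.localdomain kernel: traps: ovn-nbctl[212704] "
    "trap invalid opcode ip:55d658f09ba8 sp:7ffcdc0e3140 error:0 "
    "in ovn-nbctl[55d658f05000+5c000]"
)
info = parse_trap(line)
print(info["comm"], info["reason"], hex(info["offset_in_mapping"]))
```

The offset computed here is relative to the mapping the kernel reports, so it need not equal the `+ 0x16ba8` printed by systemd-coredump, which appears to be measured from the load address of the binary.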