Activity log for bug #1963698

Date Who What changed Old value New value Message
2022-03-04 15:33:31 DUFOUR Olivier bug added bug
2022-03-04 15:35:59 DUFOUR Olivier bug added subscriber Canonical Field High
2022-03-04 15:36:18 DUFOUR Olivier removed subscriber Canonical Field High
2022-03-04 15:36:43 DUFOUR Olivier bug added subscriber Canonical Field High
2022-03-04 15:39:34 Alexander Litvinov bug added subscriber Alexander Litvinov
2022-03-04 15:45:27 Launchpad Janitor ovn (Ubuntu): status New Confirmed
2022-03-04 15:45:27 Nobuto Murata affects networking-ovn ovn (Ubuntu)
2022-03-04 15:45:59 Nobuto Murata bug added subscriber Nobuto Murata
2022-03-04 16:12:51 DUFOUR Olivier description updated (the new value appends the bundle link to the original report):

We are deploying Focal Wallaby for a customer.
Neutron package version: 2:18.2.0-0ubuntu1~cloud0
GLIBC: 2.31-0ubuntu9.7

When running rally/tempest tests that create some VMs, the following symptoms appear:
1) A huge increase in the size of, and write load on, /var/lib/openvswitch/conf.db (if ovsdb-server is restarted while the OVS database is a few GB, the unit can fail to start)
2) Very high CPU usage on the following processes:
* neutron-ovn-metadata-agent
* nova-compute
* ovn-controller
* ovsdb-server
3) The Nova compute node may face severe delays and may time out when creating any instance (for Nova or Octavia Amphora) on it.

A temporary way to resolve the issue is to restart the ovn-controller service. It then reproduces again after some time on a different hypervisor. So far it has been reproducible only on a customer deployment with many nova-compute units.

ovn-controller.log on the hypervisor:

2022-03-04T12:54:43.065Z|00479|binding|INFO|Changing chassis for lport cr-lrp-f741e3f2-4708-4091-841d-4a9c05f09b53 from comp04.maas to comp18.maas.
2022-03-04T12:54:43.065Z|00480|binding|INFO|cr-lrp-f741e3f2-4708-4091-841d-4a9c05f09b53: Claiming fa:16:3e:15:1f:a6 10.218.131.106/18
2022-03-04T12:54:43.077Z|00481|binding|INFO|Releasing lport cr-lrp-f741e3f2-4708-4091-841d-4a9c05f09b53 from this chassis.
2022-03-04T12:54:46.798Z|00482|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)
2022-03-04T12:54:46.799Z|00483|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)
2022-03-04T12:54:46.799Z|00484|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)
2022-03-04T12:54:46.799Z|00485|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)

Full log of ovn-controller available here: https://private-fileshare.canonical.com/~alitvinov/random/ovn-controller.txt
Bundle available as well here: https://private-fileshare.canonical.com/~alitvinov/random/bundle-ovn-controller.txt
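The temporary workaround described in the report (restart ovn-controller on the affected hypervisor once conf.db balloons) could be automated roughly as below. This is a minimal sketch: the function name and the 1 GiB threshold are assumptions, not part of the report; only the conf.db path and the ovn-controller service come from the description.

```shell
#!/bin/sh
# Sketch of the report's temporary workaround: if the local ovsdb file
# has grown abnormally large, restart ovn-controller on this hypervisor.
# The threshold (1 GiB) and function name are illustrative assumptions.
maybe_restart_ovn_controller() {
    db="${1:-/var/lib/openvswitch/conf.db}"   # path from the bug report
    limit=$((1024 * 1024 * 1024))             # 1 GiB; tune for your deployment
    size=$(stat -c %s "$db" 2>/dev/null || echo 0)
    if [ "$size" -gt "$limit" ]; then
        echo "conf.db is ${size} bytes; restarting ovn-controller"
        systemctl restart ovn-controller
    else
        echo "conf.db is ${size} bytes; within limits"
    fi
}
```

Run it from cron or a systemd timer on each compute node; note this only papers over the symptom, since the report says the problem recurs later on a different hypervisor.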
2022-03-07 03:19:09 Nobuto Murata bug added subscriber Yoshi Kadokawa
2022-03-07 06:44:35 Dominique Poulain bug added subscriber Dominique Poulain
2022-03-08 11:46:53 DUFOUR Olivier removed subscriber Canonical Field High