Date | Who | What changed | Old value | New value | Message
2022-03-04 15:33:31 | DUFOUR Olivier | bug | | | added bug
2022-03-04 15:35:59 | DUFOUR Olivier | bug | | | added subscriber Canonical Field High
2022-03-04 15:36:18 | DUFOUR Olivier | removed subscriber Canonical Field High | | |
2022-03-04 15:36:43 | DUFOUR Olivier | bug | | | added subscriber Canonical Field High
2022-03-04 15:39:34 | Alexander Litvinov | bug | | | added subscriber Alexander Litvinov
2022-03-04 15:45:27 | Launchpad Janitor | ovn (Ubuntu): status | New | Confirmed |
2022-03-04 15:45:27 | Nobuto Murata | affects | networking-ovn | ovn (Ubuntu) |
2022-03-04 15:45:59 | Nobuto Murata | bug | | | added subscriber Nobuto Murata
2022-03-04 16:12:51 | DUFOUR Olivier | description | (previous text: identical to the new description below, minus the final bundle link) | (new text follows) |

New description:
We are deploying Focal Wallaby for a customer.
Neutron package version: 2:18.2.0-0ubuntu1~cloud0; GLIBC: 2.31-0ubuntu9.7.
When running Rally/Tempest tests that create VMs, the following symptoms appear:
1) A huge increase in the size of, and write load on, /var/lib/openvswitch/conf.db
(if ovsdb-server is restarted while the OVS database has grown to a few GB, the unit can fail to start)
2) Very high CPU usage in the following processes:
* neutron-ovn-metadata-agent
* nova-compute
* ovn-controller
* ovsdb-server
3) Severe delays on the affected Nova compute node, which may time out when creating any instance (Nova VM or Octavia amphora) on it.
A temporary workaround is to restart the ovn-controller service (see the sketch after this entry); the issue then reproduces after some time on a different hypervisor.
It has so far been reproducible only on a customer deployment with many nova-compute units.
ovn-controller.log on the hypervisor:
2022-03-04T12:54:43.065Z|00479|binding|INFO|Changing chassis for lport cr-lrp-f741e3f2-4708-4091-841d-4a9c05f09b53 from comp04.maas to comp18.maas
.
2022-03-04T12:54:43.065Z|00480|binding|INFO|cr-lrp-f741e3f2-4708-4091-841d-4a9c05f09b53: Claiming fa:16:3e:15:1f:a6 10.218.131.106/18
2022-03-04T12:54:43.077Z|00481|binding|INFO|Releasing lport cr-lrp-f741e3f2-4708-4091-841d-4a9c05f09b53 from this chassis.
2022-03-04T12:54:46.798Z|00482|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)
2022-03-04T12:54:46.799Z|00483|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)
2022-03-04T12:54:46.799Z|00484|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)
2022-03-04T12:54:46.799Z|00485|poll_loop|INFO|wakeup due to [POLLIN] on fd 13 (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (64% CPU usage)
Full ovn-controller log available here:
https://private-fileshare.canonical.com/~alitvinov/random/ovn-controller.txt
Bundle also available here:
https://private-fileshare.canonical.com/~alitvinov/random/bundle-ovn-controller.txt
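The workaround quoted in the description above can be scripted. The following is a minimal sketch, not part of the original report: it assumes the stock Ubuntu packaging (ovn-controller systemd unit from ovn-host, conf.db and the ovsdb-server pidfile under the default Open vSwitch directories). The ovsdb-server/compact appctl command is a standard way to shrink an ovsdb file online; whether compaction alone is enough to recover here is an assumption.

# Sketch: inspect and recover one affected hypervisor (assumes Ubuntu ovn-host packaging)
# 1) Check how large the local OVS database has grown:
du -h /var/lib/openvswitch/conf.db
# 2) Compact it online; ovsdb-server rewrites the file without historical churn:
sudo ovs-appctl -t ovsdb-server ovsdb-server/compact
# 3) Apply the temporary fix from the report: restart ovn-controller:
sudo systemctl restart ovn-controller

Per the report, this only clears the symptom on the current hypervisor; the problem later reappears on a different one, so this is triage, not a fix.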
2022-03-07 03:19:09 | Nobuto Murata | bug | | | added subscriber Yoshi Kadokawa
2022-03-07 06:44:35 | Dominique Poulain | bug | | | added subscriber Dominique Poulain
2022-03-08 11:46:53 | DUFOUR Olivier | removed subscriber Canonical Field High | | |