Comment 6 for bug 1964940

yatin (yatinkarel) wrote :

I looked into this and agree with slawek that something is wrong on the Neutron OVN side. My findings are below.

Some data points:
- The issue is random, as the jobs sometimes succeed [1], so it is likely a race or a missed event.
- The issue is not specific to wallaby; I can see similar failures in all releases (train+) [2].
- The issue is not specific to a distro; I see it in C8, RHEL8 and C9 jobs [2].
- The issue has been happening for a long time. I could see failures from a month back; logs from before that are not persisted, so I am adding references to logs from last month [2].
- The issue is also seen in jobs running with 1 controller [3]. I found only a few occurrences, and looked only in wallaby and train.

<< - and OVN reports status UP, but it's way to long after vm was already deleted:
<< 2022-03-15 16:50:31.218 15 INFO neutron.plugins.ml2.drivers.ovn.mech_driver.mech_driver [req-dbbfd0fb-bec7-4a80-83af-c863ca531175 - - - - -] OVN reports status up for port: 6a712e97-bc61-49a0-aee6-66d4fcd7b72d

It seems ^ was triggered by the maintenance task rather than by PortBindingUpdateUpEvent (which was somehow missed):
Maintenance task: Fixing resource 6a712e97-bc61-49a0-aee6-66d4fcd7b72d (type: ports) at create/update check_for_inconsistencies /usr/lib/python3.9/site-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/maintenance.py:358

One thing I noticed in the logs I checked is that "OVN reports status up for port" is not logged after the PortBindingUpdateUpEvent event. I could not work out what causes this, as logging it is the first statement executed when the event is handled [4][5].
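
To check this across more job logs, something like the rough sketch below could classify each "OVN reports status up for port" line by what last preceded it for the same port: the maintenance task or the event. The two message patterns are taken from the log lines quoted above. The function name `classify_status_up` is hypothetical, and it is an assumption that the event's own log line contains both the class name PortBindingUpdateUpEvent and the port UUID, so adjust the matching to the real log format.

```python
import re

UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
UUID_RE = re.compile(UUID)

# Patterns copied from the log lines quoted in this comment.
STATUS_UP_RE = re.compile(r"OVN reports status up for port: (" + UUID + r")")
MAINT_FIX_RE = re.compile(
    r"Maintenance task: Fixing resource (" + UUID + r") \(type: ports\)"
)

# ASSUMPTION: the event handling log line mentions this class name and
# carries the port UUID somewhere on the same line.
EVENT_NAME = "PortBindingUpdateUpEvent"


def classify_status_up(lines):
    """Return (port_id, trigger) for every "status up" line, in log order.

    trigger is "maintenance", "event" or "unknown", depending on which
    kind of line was last seen for that port before the status report.
    """
    last_trigger = {}
    results = []
    for line in lines:
        match = MAINT_FIX_RE.search(line)
        if match:
            last_trigger[match.group(1)] = "maintenance"
            continue
        if EVENT_NAME in line:
            # Every UUID on the line is recorded (request ids included);
            # only ports that later report status up are ever looked up.
            for port_id in UUID_RE.findall(line):
                last_trigger[port_id] = "event"
            continue
        match = STATUS_UP_RE.search(line)
        if match:
            port_id = match.group(1)
            results.append((port_id, last_trigger.pop(port_id, "unknown")))
    return results
```

Feeding it the lines of a neutron server log (for example `classify_status_up(open(path))`) gives one entry per status report; many "maintenance" or "unknown" entries for ports that also had a binding update would support the missed-event theory.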

Considering the related event, [6] looks suspicious; it is backported down to Ussuri. But since the issue is also seen in Train, maybe [6] just made it easier to reproduce, or there is some general issue with event processing. I will involve Luis (author of [6]) and someone with a better understanding of this area.

[1]
https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-wallaby
https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-wallaby
[2]
https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-master/66b141d/logs/undercloud/var/log/tempest/stestr_results.html.gz
https://logserver.rdoproject.org/openstack-periodic-integration-stable1-cs9/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-wallaby/6b04066/logs/undercloud/var/log/tempest/stestr_results.html.gz
https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-wallaby/ee71b17/logs/undercloud/var/log/tempest/stestr_results.html.gz
https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-victoria/ef11bd8/logs/undercloud/var/log/tempest/stestr_results.html.gz
https://logserver.rdoproject.org/openstack-periodic-integration-stable3/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-ussuri/da5136f/logs/undercloud/var/log/tempest/stestr_results.html.gz
https://logserver.rdoproject.org/openstack-periodic-integration-stable4/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-train/1cc8115/stestr_results.html.gz
[3]
https://logserver.rdoproject.org/openstack-periodic-integration-stable4/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-train/cea1b09/stestr_results.html.gz
https://logserver.rdoproject.org/29/37029/24/check/periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-wallaby/99bccb6/stestr_results.html.gz
[4] https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovsdb_monitor.py#L531
[5] https://github.com/openstack/neutron/blob/2d160d9eec6c09b01e4d7ae0507eb2d09527b576/neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py#L1077
[6] https://review.opendev.org/q/Ib071889271f4e4d6acd83b219bf908a9ae80ce5c

<< i wonder if this may be related to https://bugs.launchpad.net/neutron/+bug/1961184 ?
It doesn't look related based on the points above, and that bug targets only virtual ports.