[FT] Failure of "test_floatingip_mac_bindings"

Bug #1955008 reported by Rodolfo Alonso
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
Fix Released
Critical
Slawek Kaplonski
Miguel Lavalle (minsel)
Changed in neutron:
status: New → Confirmed
importance: Undecided → Low
importance: Low → Medium
tags: added: functional-tests gate-failure
tags: added: ovn
Changed in neutron:
importance: Medium → Critical
Revision history for this message
Mamatisa Nurmatov (isabek) wrote :
Changed in neutron:
assignee: nobody → Slawek Kaplonski (slaweq)
Revision history for this message
Slawek Kaplonski (slaweq) wrote :
Download full text (7.8 KiB)

I today compared logs from the failed (https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f76/825521/3/gate/neutron-functional-with-uwsgi/f76cef6/controller/logs/dsvm-functional-logs/neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.ovsdb.test_ovsdb_monitor.TestNBDbMonitorOverTcp.test_floatingip_mac_bindings/) and good (https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f76/825521/3/gate/neutron-functional-with-uwsgi/f76cef6/controller/logs/dsvm-functional-logs/neutron.tests.functional.plugins.ml2.drivers.ovn.mech_driver.ovsdb.test_ovsdb_monitor.TestNBDbMonitor.test_floatingip_mac_bindings/) runs. Here's what I found:

- In the SB logs, there is difference like:

* in good run we have:
2022-01-23T19:28:26.309Z|00140|jsonrpc|DBG|unix#3: send notification, method="update2", params=["a220fbde-7c82-11ec-9a03-4fce359daaf8",{"Datapath_Binding":{"2c98db7e-84ad-4782-9d87-24eafae1700e":{"insert":{"tunnel_key":1,"external_ids":["map",[["logical-switch","5b801987-900b-4b79-aa3b-05c5fdc422eb"],["name","neutron-2d726047-4990-446b-9f1b-9dc6403adb3e"],["name2","network1"]]]}}}}]
2022-01-23T19:28:26.330Z|00141|poll_loop|DBG|wakeup due to [POLLIN] on fd 17 (/tmp/tmp8aro35_w/ovnsb_db.sock<->) at lib/stream-fd.c:157
2022-01-23T19:28:26.330Z|00142|jsonrpc|DBG|unix#3: received request, method="transact", params=["OVN_Southbound",{"uuid-name":"rowdebd1810_12e4_4ae8_9bcd_a18781f1b986","table":"MAC_Binding","op":"insert","row":{"ip":"100.0.0.21","datapath":["uuid","2c98db7e-84ad-4782-9d87-24eafae1700e"]}}], id=1260
2022-01-23T19:28:26.330Z|00143|jsonrpc|DBG|unix#3: send reply, result=[{"uuid":["uuid","e514d19c-7371-4d1b-9958-25ffd612b68b"]}], id=1260
2022-01-23T19:28:26.330Z|00144|jsonrpc|DBG|unix#4: send notification, method="update3", params=[["monid","OVN_Southbound"],"00000000-0000-0000-0000-000000000000",{"MAC_Binding":{"e514d19c-7371-4d1b-9958-25ffd612b68b":{"insert":{"ip":"100.0.0.21","datapath":["uuid","2c98db7e-84ad-4782-9d87-24eafae1700e"]}}}}]
2022-01-23T19:28:26.460Z|00145|poll_loop|DBG|wakeup due to [POLLIN] on fd 18 (/tmp/tmp8aro35_w/ovnsb_db.sock<->) at lib/stream-fd.c:157

but in failed run there is only:

2022-01-23T19:28:46.868Z|00156|jsonrpc|DBG|tcp:127.0.0.1:33296: send notification, method="update2", params=["ae7bd9c6-7c82-11ec-9a03-4fce359daaf8",{"Connection":{"83c29962-a6e2-4f3c-9fc7-888cf46540ed":{"modify":{"is_connected":true,"status":["map",[["bound_port","42597"],["n_connections","2"],["sec_since_connect","0"],["sec_since_disconnect","0"]]]}}},"Datapath_Binding":{"15003f80-2b91-44d6-8b9d-f8d217285fb3":{"insert":{"tunnel_key":1,"external_ids":["map",[["logical-switch","70caa880-3fed-4f9b-9558-f624c0d2c469"],["name","neutron-51ac15e4-24cc-4348-85c3-fe3a51cfcaec"],["name2","network1"]]]}}}}]
2022-01-23T19:28:49.369Z|00157|poll_loop|DBG|wakeup due to 2500-ms timeout at ovsdb/ovsdb-server.c:296 (0% CPU usage)
2022-01-23T19:28:51.872Z|00158|poll_loop|DBG|wakeup due to 2503-ms timeout at ovsdb/ovsdb-server.c:296 (0% CPU usage)
2022-01-23T19:28:54.373Z|00159|poll_loop|DBG|wakeup due to 2501-ms timeout at ovsdb/ovsdb-server.c:296 (0% CPU usage)

and nothi...

Read more...

Revision history for this message
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

The ovsdb error is stopping the test:
2022-01-23 19:28:46.875 54853 ERROR ovsdbapp.backend.ovs_idl.vlog [-] tcp:127.0.0.1:42597: error parsing update: ovsdb error: Modify non-existing row: ovs.db.error.Error: ovsdb error: Modify non-existing row

After this line, the test is just stopping the controllers and the DB servers.

I think we should implement [1] but for all FT tests, that means in "TestOVNFunctionalBase" and not only for the NB but for the SB too. In the DB log I see the connection is updated AFTER the network creation command "network1" has been committed:
2022-01-23T19:28:46.868Z|00156|jsonrpc|DBG|tcp:127.0.0.1:33296: send notification, method="update2", params=["ae7bd9c6-7c82-11ec-9a03-4fce359daaf8",{"Connection":{"83c29962-a6e2-4f3c-9fc7-888cf46540ed":{"modify":{"is_connected":true,"status":["map",[["bound_port","42597"],["n_connections","2"],["sec_since_connect","0"],["sec_since_disconnect","0"]]]}}},"Datapath_Binding":{"15003f80-2b91-44d6-8b9d-f8d217285fb3":{"insert":{"tunnel_key":1,"external_ids":["map",[["logical-switch","70caa880-3fed-4f9b-9558-f624c0d2c469"],["name","neutron-51ac15e4-24cc-4348-85c3-fe3a51cfcaec"],["name2","network1"]]]}}}}]

In any case, what I think is failing is the connection to the OVS DB, not the NB or SB DBs.

[0]https://review.opendev.org/q/Id84bf17953527c415d611bfc198038fb6f811de3
[1]https://review.opendev.org/c/openstack/neutron/+/825530/8/neutron/tests/functional/plugins/ml2/drivers/ovn/mech_driver/ovsdb/test_ovn_db_resources.py#32

Revision history for this message
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Closing this bug because this error is not being reproduce in the CI.

Changed in neutron:
status: Confirmed → Won't Fix
Revision history for this message
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :
Changed in neutron:
status: Won't Fix → Confirmed
summary: - [functional] Failure of "test_floatingip_mac_bindings"
+ [FT] Failure of "test_floatingip_mac_bindings"
Revision history for this message
yatin (yatinkarel) wrote :
Revision history for this message
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

It seems that there is a problem in the "_check_mac_binding_exists" call, when executing the "ovsdb-client" command, as seen in c#5 [1] and c#6 [2]. I'll add a tenacity decorator on this method to retry in case of timeout.

[1]https://paste.opendev.org/show/bFLxtBIVP9akE2h61pw7/
[2]https://paste.opendev.org/show/bPc63NVDvSwAlED3zsnm/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/906474

Changed in neutron:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/neutron/+/907235

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/neutron/+/907236

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/zed)

Fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/neutron/+/907237

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/906474
Committed: https://opendev.org/openstack/neutron/commit/64fddf4f2d18b134b5cc8348049a3c4f10f69a28
Submitter: "Zuul (22348)"
Branch: master

commit 64fddf4f2d18b134b5cc8348049a3c4f10f69a28
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Fri Jan 19 11:41:15 2024 +0000

    [OVN][FT] Retry in case of timeout when executing "ovsdb-client".

    The shell command "ovsdb-client", in the functional tests, is prone to
    timeouts. This patch adds a tenacity decorator and sets the command
    timeout to 3 seconds, that should be more than enough to retrieve one
    single register.

    Closes-Bug: #1955008
    Change-Id: I38626835ca809cc3f2894e5f81fab55cf3f40071

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/907235
Committed: https://opendev.org/openstack/neutron/commit/0526fb868ef16b2725d1a7d0cf737d01fe9d78d0
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit 0526fb868ef16b2725d1a7d0cf737d01fe9d78d0
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Fri Jan 19 11:41:15 2024 +0000

    [OVN][FT] Retry in case of timeout when executing "ovsdb-client".

    The shell command "ovsdb-client", in the functional tests, is prone to
    timeouts. This patch adds a tenacity decorator and sets the command
    timeout to 3 seconds, that should be more than enough to retrieve one
    single register.

    Closes-Bug: #1955008
    Change-Id: I38626835ca809cc3f2894e5f81fab55cf3f40071
    (cherry picked from commit 64fddf4f2d18b134b5cc8348049a3c4f10f69a28)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/907236
Committed: https://opendev.org/openstack/neutron/commit/2f678b980cfd9fbd6fdcda18af1a140916630055
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit 2f678b980cfd9fbd6fdcda18af1a140916630055
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Fri Jan 19 11:41:15 2024 +0000

    [OVN][FT] Retry in case of timeout when executing "ovsdb-client".

    The shell command "ovsdb-client", in the functional tests, is prone to
    timeouts. This patch adds a tenacity decorator and sets the command
    timeout to 3 seconds, that should be more than enough to retrieve one
    single register.

    Closes-Bug: #1955008
    Change-Id: I38626835ca809cc3f2894e5f81fab55cf3f40071
    (cherry picked from commit 64fddf4f2d18b134b5cc8348049a3c4f10f69a28)

tags: added: in-stable-zed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/907237
Committed: https://opendev.org/openstack/neutron/commit/1f5ce18a14bc3b3d3884c3dc50daf1c8062605b6
Submitter: "Zuul (22348)"
Branch: stable/zed

commit 1f5ce18a14bc3b3d3884c3dc50daf1c8062605b6
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Fri Jan 19 11:41:15 2024 +0000

    [OVN][FT] Retry in case of timeout when executing "ovsdb-client".

    The shell command "ovsdb-client", in the functional tests, is prone to
    timeouts. This patch adds a tenacity decorator and sets the command
    timeout to 3 seconds, that should be more than enough to retrieve one
    single register.

    Closes-Bug: #1955008
    Change-Id: I38626835ca809cc3f2894e5f81fab55cf3f40071
    (cherry picked from commit 64fddf4f2d18b134b5cc8348049a3c4f10f69a28)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 24.0.0.0rc1

This issue was fixed in the openstack/neutron 24.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 21.2.1

This issue was fixed in the openstack/neutron 21.2.1 release.

Revision history for this message
yatin (yatinkarel) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 22.2.0

This issue was fixed in the openstack/neutron 22.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 23.2.0

This issue was fixed in the openstack/neutron 23.2.0 release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.