Functional tests failing with ovsdbapp.exceptions.TimeoutException

Bug #1734090 reported by Daniel Alvarez
This bug affects 1 person
Affects          Status    Importance   Assigned to   Milestone
networking-ovn   Invalid   Undecided    Unassigned

Bug Description

Since Nov 17th, we've been seeing timeouts in ovsdbapp in our functional tests [0].
We bumped ovsdbapp to 0.8.0 that same day (see commit a9af75cd3ce6cd6685b6435b325c97cacc83ce0e), so the two could be related.

The error is as follows:

2017-11-22 20:36:52.325716 | primary | 2017-11-22 20:36:52.325 | ovsdbapp.exceptions.TimeoutException: Commands [<networking_ovn.ovsdb.commands.AddAddrSetCommand object at 0x7f1bce860350>, <networking_ovn.ovsdb.commands.UpdateAddrSetCommand object at 0x7f1bce8603d0>, <networking_ovn.ovsdb.commands.DelAddrSetCommand object at 0x7f1bce860ad0>, <networking_ovn.ovsdb.commands.DelAddrSetCommand object at 0x7f1bced443d0>] exceeded timeout 5 seconds

Tests failing:

networking_ovn.tests.functional.test_ovn_db_sync.TestOvnNbSyncOverSsl.test_ovn_nb_sync_repair
networking_ovn.tests.functional.test_ovn_db_sync.TestOvnNbSyncOverTcp.test_ovn_nb_sync_repair

[0] http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22ovsdbapp.exceptions.TimeoutException%5C%22%20AND%20project%3A%5C%22openstack%2Fnetworking-ovn%5C%22

Daniel Alvarez (dalvarezs) wrote :

I sent this patch [0] to test with the release branch of OVS instead of the master branch, and it passed twice. I'll recheck a few more times; if the tests keep passing, we should look through the OVS patches to find the culprit.

[0] https://review.openstack.org/#/c/522574/

Daniel Alvarez (dalvarezs) wrote :

I also sent this patch [0] to collect functional test logs in networking-ovn.
It looks like the connection to ovsdb is lost (some crash?) [1] and we then fail
to reconnect. Since we start ovsdb-server inside the test, we won't recover
if it crashes. I don't yet know why it fails, but I'll dig further to see
whether it crashes at some point and why.

2017-11-23 12:04:39.378 24566 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:127.0.0.1:45997: connection closed by peer
2017-11-23 12:04:39.379 24566 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:127.0.0.1:35722: connection closed by peer
2017-11-23 12:04:39.379 24566 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:127.0.0.1:45997: connection closed by peer
2017-11-23 12:04:39.379 24566 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:127.0.0.1:35722: connection closed by peer
2017-11-23 12:04:39.380 24566 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:127.0.0.1:35722: connection closed by peer
2017-11-23 12:04:39.380 24566 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:127.0.0.1:45997: connection closed by peer

[0] https://review.openstack.org/#/c/522528/
[1] http://logs.openstack.org/28/522528/1/check/networking-ovn-dsvm-functional/ff0ad9a/logs/dsvm-functional-logs/networking_ovn.tests.functional.test_ovn_db_sync.TestOvnNbSyncOverSsl.test_ovn_nb_sync_repair.txt.gz#_2017-11-23_12_04_39_378
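
For anyone triaging a similar log, a minimal sketch for tallying the "connection closed by peer" events per endpoint may help. It assumes the line layout seen in the excerpt above; the helper name count_closed_connections and the SAMPLE constant are made up for illustration and are not part of ovsdbapp or networking-ovn.

```python
import re
from collections import Counter

# Assumed layout, based only on the log excerpt in this comment:
# "<date> <time> <pid> <LEVEL> <logger> [-] <endpoint>: connection closed by peer"
CLOSED_RE = re.compile(r"\[-\] (?P<endpoint>\S+): connection closed by peer\s*$")


def count_closed_connections(lines):
    """Return a Counter mapping endpoint -> number of 'closed by peer' lines."""
    counts = Counter()
    for line in lines:
        match = CLOSED_RE.search(line)
        if match:
            counts[match.group("endpoint")] += 1
    return counts


# The six lines quoted in this comment, used as sample input.
SAMPLE = """\
2017-11-23 12:04:39.378 24566 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:127.0.0.1:45997: connection closed by peer
2017-11-23 12:04:39.379 24566 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:127.0.0.1:35722: connection closed by peer
2017-11-23 12:04:39.379 24566 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:127.0.0.1:45997: connection closed by peer
2017-11-23 12:04:39.379 24566 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:127.0.0.1:35722: connection closed by peer
2017-11-23 12:04:39.380 24566 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:127.0.0.1:35722: connection closed by peer
2017-11-23 12:04:39.380 24566 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:127.0.0.1:45997: connection closed by peer
"""

if __name__ == "__main__":
    for endpoint, n in count_closed_connections(SAMPLE.splitlines()).items():
        print(endpoint, n)
```

On the excerpt above this reports three events for each of the two endpoints. To run it against a full log, pass an open file object instead of SAMPLE.splitlines().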

Daniel Alvarez (dalvarezs) wrote :

Disregard comment #2: the connection is closed by ovsdb-server and is recovered only 6 seconds later.
In the meantime we can see 3030 TRY_AGAIN messages from ovsdbapp!

Lucas Alvares Gomes (lucasagomes) wrote :

Closing this as Invalid: according to the bug reporter, the fix has already been merged in ovsdbapp.

Changed in networking-ovn:
status: New → Invalid