[ovn] neutron_ovn_db_sync_util hangs on sync_routers_and_rports

Bug #1950679 reported by Daniel Speichert
This bug affects 1 person
Affects               Status        Importance  Assigned to       Milestone
Ubuntu Cloud Archive  Invalid       Undecided   Unassigned
  Wallaby             Fix Released  High        Unassigned
neutron               Fix Released  High        Daniel Speichert

Bug Description

neutron-ovn-db-sync-util hangs in certain scenarios while running sync_routers_and_rports.

Specifically, it seems to hang in self.l3_plugin.get_routers(ctx):
-> get_routers(...) in neutron/db/l3_db.py calls model_query.get_collection(...)
-> get_collection(...) in neutron_lib/db/model_query.py runs dict_funcs, which somehow reaches the nb_ovn property accessor in neutron/plugins/ml2/drivers/ovn/mech_driver/mech_driver.py
-> which runs self._post_fork_event.wait()

That event never seems to be set, so wait() blocks further execution; the synchronization may simply not apply to this flow.

It looks like neutron-ovn-db-sync-util might need to always set it, since it mocks other parts of the NB/DB client in a similar fashion to some unit tests.

I'm not yet sure exactly what circumstances lead to that access and that wait(); syncing via the util to an empty OVN NB/DB seems to work. I see the issue more frequently on subsequent runs.
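
A minimal sketch of the suspected hang, assuming _post_fork_event behaves like a threading.Event. Driver, SyncDriver, and access_completes are hypothetical names used only for illustration; this is not neutron code, and the merged fix may look different.

```python
import threading


class Driver:
    """Hypothetical stand-in for a driver whose nb_ovn accessor waits on an event."""

    def __init__(self):
        self._post_fork_event = threading.Event()
        self._nb_ovn = None

    @property
    def nb_ovn(self):
        # Blocks until something calls self._post_fork_event.set().
        self._post_fork_event.wait()
        return self._nb_ovn


class SyncDriver(Driver):
    """Hypothetical offline-sync variant: no post-fork step exists, so set the event up front."""

    def __init__(self, nb_client):
        super().__init__()
        self._nb_ovn = nb_client
        self._post_fork_event.set()


def access_completes(driver, timeout=0.2):
    """Return True if reading driver.nb_ovn finishes within `timeout` seconds."""
    done = threading.Event()

    def worker():
        driver.nb_ovn  # may block forever if the event is never set
        done.set()

    # Daemon thread, so a blocked accessor does not keep the process alive.
    threading.Thread(target=worker, daemon=True).start()
    return done.wait(timeout)


if __name__ == "__main__":
    print("plain driver completes:", access_completes(Driver()))
    print("sync driver completes:", access_completes(SyncDriver(object())))
```

In this sketch the plain driver never returns from the accessor because nothing sets the event, while the variant that sets it during construction returns immediately.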

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/817637

Changed in neutron:
status: New → In Progress
Changed in neutron:
assignee: nobody → Daniel Speichert (dasp)
tags: added: ovn
Changed in neutron:
importance: Undecided → Medium
importance: Medium → High
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/817637
Committed: https://opendev.org/openstack/neutron/commit/7e2f73350ffdc90f7b340788db36edc439f96f6e
Submitter: "Zuul (22348)"
Branch: master

commit 7e2f73350ffdc90f7b340788db36edc439f96f6e
Author: Daniel Speichert <email address hidden>
Date: Thu Nov 11 13:18:49 2021 -0500

    [OVN] Fix deadlock in neutron_ovn_db_sync_util.py

    A feature to synchronize OVN DB connections when handling events
    introduced in 90980f496cfa3cc5df1c93cf834a44f33d3f1f6f is not applicable
    to the offline sync process executed by this utility.

    Closes-bug: #1950679
    Change-Id: Iac4eb364bfc1c44f5d4526bae71967bede29cc36

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/neutron/+/818583

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/818583
Committed: https://opendev.org/openstack/neutron/commit/dd89a0748c15ec3905b1680957bd959d8d822c10
Submitter: "Zuul (22348)"
Branch: stable/xena

commit dd89a0748c15ec3905b1680957bd959d8d822c10
Author: Daniel Speichert <email address hidden>
Date: Thu Nov 11 13:18:49 2021 -0500

    [OVN] Fix deadlock in neutron_ovn_db_sync_util.py

    A feature to synchronize OVN DB connections when handling events
    introduced in 90980f496cfa3cc5df1c93cf834a44f33d3f1f6f is not applicable
    to the offline sync process executed by this utility.

    Closes-bug: #1950679
    Change-Id: Iac4eb364bfc1c44f5d4526bae71967bede29cc36
    (cherry picked from commit 7e2f73350ffdc90f7b340788db36edc439f96f6e)

tags: added: in-stable-xena
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 19.1.0

This issue was fixed in the openstack/neutron 19.1.0 release.

tags: added: neutron-proactive-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 20.0.0.0rc1

This issue was fixed in the openstack/neutron 20.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/neutron/+/843190

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/843190
Committed: https://opendev.org/openstack/neutron/commit/941c822b7ba62217e3f4740b5ca5662895d60ae8
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 941c822b7ba62217e3f4740b5ca5662895d60ae8
Author: Daniel Speichert <email address hidden>
Date: Thu Nov 11 13:18:49 2021 -0500

    [OVN] Fix deadlock in neutron_ovn_db_sync_util.py

    A feature to synchronize OVN DB connections when handling events
    introduced in 90980f496cfa3cc5df1c93cf834a44f33d3f1f6f is not applicable
    to the offline sync process executed by this utility.

    Closes-bug: #1950679
    Change-Id: Iac4eb364bfc1c44f5d4526bae71967bede29cc36
    (cherry picked from commit 7e2f73350ffdc90f7b340788db36edc439f96f6e)
    (cherry picked from commit dd89a0748c15ec3905b1680957bd959d8d822c10)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 18.5.0

This issue was fixed in the openstack/neutron 18.5.0 release.

Changed in cloud-archive:
status: New → Invalid
Corey Bryant (corey.bryant) wrote : Please test proposed package

Hello Daniel, or anyone else affected,

Accepted neutron into wallaby-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:wallaby-proposed
  sudo apt-get update

Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-wallaby-needed to verification-wallaby-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-wallaby-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-wallaby-needed
Liam Young (gnuoy) wrote :

Thank you, Corey. The package in wallaby-proposed fixed the issue for me.

Fixed with 2:18.4.0-0ubuntu1~cloud3

Broken with 2:18.4.0-0ubuntu1~cloud0

tags: added: verification-wallaby-done
removed: verification-wallaby-needed
Corey Bryant (corey.bryant) wrote : Update Released

The verification of the Stable Release Update for neutron has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Corey Bryant (corey.bryant) wrote :

This bug was fixed in the package neutron - 2:18.4.0-0ubuntu1~cloud3
---------------

 neutron (2:18.4.0-0ubuntu1~cloud3) focal-wallaby; urgency=medium
 .
   * d/p/ovn-fix-deadlock-in-neutron-ovn-db-sync-util.patch: Picked from
     upstream to fix a deadlock in neutron-ovn-db-sync-util (LP: #1950679).
