[ovn] overlapping security group rules break neutron-ovn-db-sync-util

Bug #1961112 reported by Daniel Speichert
16
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ubuntu Cloud Archive
Fix Released
Undecided
Unassigned
Ussuri
Fix Released
High
Unassigned
neutron
Fix Released
Critical
Jake Yip
neutron (Ubuntu)
Fix Released
Undecided
Unassigned
Focal
Fix Released
High
Unassigned

Bug Description

Neutron (Xena) is happy to accept equivalent rules with overlapping remote CIDR prefix as long as the notation is different, e.g. 10.0.0.0/8 and 10.0.0.1/8.

However, OVN is smarter, normalizes the prefix and figures out that they both are 10.0.0.0/8.

This does not have any fatal effects in a running OVN deployment (creating and using such rules does not even trigger a warning) but upon running neutron-ovn-db-sync-util, it crashes and won't perform a sync. This is a blocker for upgrades (and other scenarios).

Security group's rules:

$ openstack security group rule list overlap-sgr
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+
| ID | IP Protocol | Ethertype | IP Range | Port Range | Direction | Remote Security Group | Remote Address Group |
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+
| 3c41fa80-1d23-49c9-9ec1-adf581e07e24 | tcp | IPv4 | 10.0.0.1/8 | | ingress | None | None |
| 639d263e-6873-47cb-b2c4-17fc824252db | None | IPv4 | 0.0.0.0/0 | | egress | None | None |
| 96e99039-cbc0-48fe-98fe-ef28d41b9d9b | tcp | IPv4 | 10.0.0.0/8 | | ingress | None | None |
| bf9160a3-fc9b-467e-85d5-c889811fd6ca | None | IPv6 | ::/0 | | egress | None | None |
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+

Log excerpt:
16/Feb/2022:20:55:40.568 527216 INFO neutron.cmd.ovn.neutron_ovn_db_sync_util [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Sync for Northbound db started with mode : repair
16/Feb/2022:20:55:42.105 527216 INFO neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.extensions.qos [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Starting OVNClientQosExtension
16/Feb/2022:20:55:42.380 527216 INFO neutron.db.ovn_revision_numbers_db [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Successfully bumped revision number for resource 49b3249a-7624-4711-b271-3e63c6a27658 (type: ports) to 17
16/Feb/2022:20:55:43.205 527216 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovn_db_sync [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] ACLs-to-be-added 1 ACLs-to-be-removed 0
16/Feb/2022:20:55:43.206 527216 WARNING neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.ovn_db_sync [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] ACL found in Neutron but not in OVN DB for port group pg_e90b68f3_9f8d_4250_9b6a_7531e2249c99
16/Feb/2022:20:55:43.208 527216 ERROR ovsdbapp.backend.ovs_idl.transaction [req-c595a893-db9b-484e-ae8a-bb7dbe8b31f3 - - - - -] Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/connection.py", line 131, in run
    txn.results.put(txn.do_commit())
  File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 93, in do_commit
    command.run_idl(txn)
  File "/usr/lib/python3/dist-packages/ovsdbapp/schema/ovn_northbound/commands.py", line 123, in run_idl
    raise RuntimeError("ACL (%s, %s, %s) already exists" % (
RuntimeError: ACL (to-lport, 1002, outport == @pg_e90b68f3_9f8d_4250_9b6a_7531e2249c99 && ip4 && ip4.src == 10.0.0.0/8 && tcp) already exists

===== Ubuntu SRU Details =====

[Impact]
See bug description.

[Test Case]
Deploy openstack with OVN. Create overlapping security group rules. Run neutron-ovn-db-sync-util and ensure it completes successfully.

[Where problems could occur]
If the logic driven by the may_exist parameter is not correct, the existing bug could still occur. Presumably this is not the case, but that is a theoritical potential for where problems could occur. All of these patches have already landed in the corresponding upstream branches.

tags: added: ovn
Changed in neutron:
importance: Undecided → Critical
Changed in neutron:
status: New → In Progress
Changed in neutron:
assignee: nobody → Jake Yip (waipengyip)
Revision history for this message
Slawek Kaplonski (slaweq) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/801707
Committed: https://opendev.org/openstack/neutron/commit/5a0a2b7847da067817640404f53e0807755e08d7
Submitter: "Zuul (22348)"
Branch: master

commit 5a0a2b7847da067817640404f53e0807755e08d7
Author: Jake Yip <email address hidden>
Date: Tue Jul 20 17:03:08 2021 +1000

    Allow ovn_db_sync to continue on duplicate normalised CIDR

    OVN now uses normalised CIDR when adding a security group rule[1]. It
    uses may_exist=True for adding ACL (secgroup rule), in case there are
    multiple CIDRs in neutron that normalises to the same.

    Do the same in ovn_db_sync, so that the sync don't fail hard on such
    duplicates.

    [1] https://review.opendev.org/c/openstack/neutron/+/736386/

    Change-Id: I9d9c21e460029e4a6a845520bfcc2889ad20429b
    Related-Bug: #1869129
    Closes-Bug: #1961112

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 20.0.0.0rc1

This issue was fixed in the openstack/neutron 20.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/neutron/+/833559

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/neutron/+/833560

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/neutron/+/833561

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/neutron/+/833562

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/833559
Committed: https://opendev.org/openstack/neutron/commit/bf308a12a1f87ead18f7c8478ada93dc292bfcba
Submitter: "Zuul (22348)"
Branch: stable/xena

commit bf308a12a1f87ead18f7c8478ada93dc292bfcba
Author: Jake Yip <email address hidden>
Date: Tue Jul 20 17:03:08 2021 +1000

    Allow ovn_db_sync to continue on duplicate normalised CIDR

    OVN now uses normalised CIDR when adding a security group rule[1]. It
    uses may_exist=True for adding ACL (secgroup rule), in case there are
    multiple CIDRs in neutron that normalises to the same.

    Do the same in ovn_db_sync, so that the sync don't fail hard on such
    duplicates.

    [1] https://review.opendev.org/c/openstack/neutron/+/736386/

    Change-Id: I9d9c21e460029e4a6a845520bfcc2889ad20429b
    Related-Bug: #1869129
    Closes-Bug: #1961112
    (cherry picked from commit 5a0a2b7847da067817640404f53e0807755e08d7)

tags: added: in-stable-xena
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/833560
Committed: https://opendev.org/openstack/neutron/commit/691a7ceeecb020e302e8d9a2ec3ceb67741144a8
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 691a7ceeecb020e302e8d9a2ec3ceb67741144a8
Author: Jake Yip <email address hidden>
Date: Tue Jul 20 17:03:08 2021 +1000

    Allow ovn_db_sync to continue on duplicate normalised CIDR

    OVN now uses normalised CIDR when adding a security group rule[1]. It
    uses may_exist=True for adding ACL (secgroup rule), in case there are
    multiple CIDRs in neutron that normalises to the same.

    Do the same in ovn_db_sync, so that the sync don't fail hard on such
    duplicates.

    [1] https://review.opendev.org/c/openstack/neutron/+/736386/

    Change-Id: I9d9c21e460029e4a6a845520bfcc2889ad20429b
    Related-Bug: #1869129
    Closes-Bug: #1961112
    (cherry picked from commit 5a0a2b7847da067817640404f53e0807755e08d7)

tags: added: in-stable-wallaby
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/833561
Committed: https://opendev.org/openstack/neutron/commit/10c8d590af7fc4e1355e38d455bbd4ecd4bb2a4c
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 10c8d590af7fc4e1355e38d455bbd4ecd4bb2a4c
Author: Jake Yip <email address hidden>
Date: Tue Jul 20 17:03:08 2021 +1000

    Allow ovn_db_sync to continue on duplicate normalised CIDR

    OVN now uses normalised CIDR when adding a security group rule[1]. It
    uses may_exist=True for adding ACL (secgroup rule), in case there are
    multiple CIDRs in neutron that normalises to the same.

    Do the same in ovn_db_sync, so that the sync don't fail hard on such
    duplicates.

    [1] https://review.opendev.org/c/openstack/neutron/+/736386/

    Change-Id: I9d9c21e460029e4a6a845520bfcc2889ad20429b
    Related-Bug: #1869129
    Closes-Bug: #1961112
    (cherry picked from commit 5a0a2b7847da067817640404f53e0807755e08d7)

tags: added: in-stable-victoria
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/833562
Committed: https://opendev.org/openstack/neutron/commit/f1fe5260c7f22415fef7e098a2b66e84f116c649
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit f1fe5260c7f22415fef7e098a2b66e84f116c649
Author: Jake Yip <email address hidden>
Date: Tue Jul 20 17:03:08 2021 +1000

    Allow ovn_db_sync to continue on duplicate normalised CIDR

    OVN now uses normalised CIDR when adding a security group rule[1]. It
    uses may_exist=True for adding ACL (secgroup rule), in case there are
    multiple CIDRs in neutron that normalises to the same.

    Do the same in ovn_db_sync, so that the sync don't fail hard on such
    duplicates.

    [1] https://review.opendev.org/c/openstack/neutron/+/736386/

    Change-Id: I9d9c21e460029e4a6a845520bfcc2889ad20429b
    Related-Bug: #1869129
    Closes-Bug: #1961112
    (cherry picked from commit 5a0a2b7847da067817640404f53e0807755e08d7)

tags: added: in-stable-ussuri
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 17.4.0

This issue was fixed in the openstack/neutron 17.4.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 18.3.0

This issue was fixed in the openstack/neutron 18.3.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 19.2.0

This issue was fixed in the openstack/neutron 19.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-ovn train-eol

This issue was fixed in the openstack/networking-ovn train-eol release.

Changed in neutron (Ubuntu):
status: New → Fix Released
Changed in neutron (Ubuntu Focal):
status: New → Triaged
importance: Undecided → High
Changed in cloud-archive:
status: New → Fix Released
description: updated
description: updated
Revision history for this message
Corey Bryant (corey.bryant) wrote :

A new package version with this fix has been uploaded to the focal unapproved queue.

Revision history for this message
Andreas Hasenack (ahasenack) wrote : Please test proposed package

Hello Daniel, or anyone else affected,

Accepted neutron into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/neutron/2:16.4.2-0ubuntu6.4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in neutron (Ubuntu Focal):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-focal
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron ussuri-eol

This issue was fixed in the openstack/neutron ussuri-eol release.

Revision history for this message
Brian Haley (brian-haley) wrote :

Verified this was fixed on a juju-deployed focal/ussuri cloud, no errors/exceptions with overlapping SGs where with previous 6.3 version there was.

tags: added: verification-done-focal
removed: verification-needed verification-needed-focal
tags: added: verification-done
Revision history for this message
Chris Halse Rogers (raof) wrote : Update Released

The verification of the Stable Release Update for neutron has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package neutron - 2:16.4.2-0ubuntu6.4

---------------
neutron (2:16.4.2-0ubuntu6.4) focal; urgency=medium

  [ Corey Bryant ]
  * d/p/ovn-db-sync-continue-on-duplicate-normalise.patch: Cherry-picked
    from upstream to allow ovn_db_sync to continue on duplicate normalised
    CIDR (LP: #1961112).
  * d/p/ovn-db-sync-check-for-router-port-differences.patch:
    Cherry-picked from upstream to ensure router ports are marked
    for needing updates only if they have changed (LP: #2030773).
  * d/p/ovn-specify-port-type-if-router-port-when-updating.patch:
    Specify port type if it's a router port when updating to avoid
    port flapping (LP: #1955578).
  * d/p/fix-acl-sync-when-default-sg-group-created.patch:
    Cherry-picked form upstream to fix ACL sync when default security
    group is created (LP: #2008943).

  [ Mustafa Kemal GILOR ]
  * d/p/add_uplink_status_propagation.patch: Add the
    'uplink-status-propagation' extension to ML2/OVN (LP: #2032770).

 -- Corey Bryant <email address hidden> Wed, 08 Nov 2023 11:41:21 -0500

Changed in neutron (Ubuntu Focal):
status: Fix Committed → Fix Released
Revision history for this message
James Page (james-page) wrote : Please test proposed package

Hello Daniel, or anyone else affected,

Accepted neutron into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:ussuri-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-ussuri-needed
Revision history for this message
Brian Haley (brian-haley) wrote :

I have tested neutron version 2:16.4.2-0ubuntu6.4~cloud0 from the cloud-archive:ussuri-proposed repository and can verify the code has this change, and the failure does not occur. I followed the steps from the bug description:

Created two security group rules, one for 10.0.0.1/8 and the other for 10.0.0.0/8, so SG rule list was similar to this:

$ openstack security group rule list overlap-sgr
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+
| ID | IP Protocol | Ethertype | IP Range | Port Range | Direction | Remote Security Group | Remote Address Group |
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+
| 3c41fa80-1d23-49c9-9ec1-adf581e07e24 | tcp | IPv4 | 10.0.0.1/8 | | ingress | None | None |
| 639d263e-6873-47cb-b2c4-17fc824252db | None | IPv4 | 0.0.0.0/0 | | egress | None | None |
| 96e99039-cbc0-48fe-98fe-ef28d41b9d9b | tcp | IPv4 | 10.0.0.0/8 | | ingress | None | None |
| bf9160a3-fc9b-467e-85d5-c889811fd6ca | None | IPv6 | ::/0 | | egress | None | None |
+--------------------------------------+-------------+-----------+------------+------------+-----------+-----------------------+----------------------+

Then ran the ovn DB sync utility:

neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --ovn-neutron_sync_mode repair

It completed successfully without the exception as noted in the bug description.

tags: added: verification-ussuri-done
removed: verification-ussuri-needed
Revision history for this message
James Page (james-page) wrote : Update Released

The verification of the Stable Release Update for neutron has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
James Page (james-page) wrote :

This bug was fixed in the package neutron - 2:16.4.2-0ubuntu6.4~cloud0
---------------

 neutron (2:16.4.2-0ubuntu6.4~cloud0) bionic-ussuri; urgency=medium
 .
   * New update for the Ubuntu Cloud Archive.
 .
 neutron (2:16.4.2-0ubuntu6.4) focal; urgency=medium
 .
   [ Corey Bryant ]
   * d/p/ovn-db-sync-continue-on-duplicate-normalise.patch: Cherry-picked
     from upstream to allow ovn_db_sync to continue on duplicate normalised
     CIDR (LP: #1961112).
   * d/p/ovn-db-sync-check-for-router-port-differences.patch:
     Cherry-picked from upstream to ensure router ports are marked
     for needing updates only if they have changed (LP: #2030773).
   * d/p/ovn-specify-port-type-if-router-port-when-updating.patch:
     Specify port type if it's a router port when updating to avoid
     port flapping (LP: #1955578).
   * d/p/fix-acl-sync-when-default-sg-group-created.patch:
     Cherry-picked form upstream to fix ACL sync when default security
     group is created (LP: #2008943).
 .
   [ Mustafa Kemal GILOR ]
   * d/p/add_uplink_status_propagation.patch: Add the
     'uplink-status-propagation' extension to ML2/OVN (LP: #2032770).

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.