OVN DB Sync utility cannot find NB DB Port Group

Bug #2008943 reported by Miro Tomaska
This bug affects 1 person
Affects: neutron
Status: In Progress
Importance: Medium
Assigned to: Miro Tomaska
Milestone: (none)

Bug Description

Runtime exception:

ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Port_Group with name=pg_aa9f203b_ec51_4893_9bda_cfadbff9f800

can occur while performing a database sync between the Neutron DB and the OVN NB DB using neutron-ovn-db-sync-util.
This exception occurs when the `sync_networks_ports_and_dhcp_opts()` function ends up implicitly creating a new default security group for a tenant/project ID. This is normally fine, but the problem is that `sync_port_groups()` has already been called, so the corresponding port group does not exist in the NB DB. When `sync_acls()` is called later, the port group is not found and the exception is raised.

Quick way to reproduce on ML2/OVN:
- openstack project create test_project
- openstack network create --project test_project test_network
- openstack port delete $(openstack port list --network test_network -c ID -f value) # since this is an empty network, only the metadata port should be listed and subsequently deleted
- openstack security group delete test_project

Now that you have a network without a metadata port and no default security group for the project/tenant that the network belongs to, run:

neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --ovn-neutron_sync_mode migrate

The exception should occur.

Here is a more realistic scenario showing how we can run into this during an ML2/OVS to ML2/OVN migration. I am also including why the code runs into it.

1. Start with an ML2/OVS environment that has a network but no default security group for the project/tenant associated with the network.
2. Perform the ML2/OVS to ML2/OVN migration. The migration process runs neutron-ovn-db-sync-util in migrate mode (--ovn-neutron_sync_mode migrate).
3. During the sync, we first sync port groups [1] from the Neutron DB to the OVN NB DB.
4. Then we sync network ports [2]. The process detects that the network in question is not part of the OVN NB DB. It creates that network in the OVN NB DB and, along with it, a metadata port (an OVN network requires a metadata port). The Port_create call implicitly notifies _ensure_default_security_group_handler [3], which does not find a security group for that tenant/project ID and creates one. Now you have a new security group with 4 new default security group rules.
5. When sync_acls [4] runs, it picks up those 4 new rules, but the commit to the NB DB fails because the port group (i.e. the security group) does not exist in the NB DB.

[1] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L104
[2] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L10
[3] https://opendev.org/openstack/neutron/src/branch/master/neutron/db/securitygroups_db.py#L915
[4] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py#L107
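
The ordering problem in steps 3 to 5 can be illustrated with a small, self-contained toy model. This is only a sketch: the `ToySync` class, its attributes, and the `default-<project>` naming are hypothetical and are not Neutron code, and `RowNotFound` here is a local stand-in for the ovsdbapp exception. The `pg_` prefix with dashes replaced by underscores is an assumption based on the port group name shown in the traceback above.

```python
class RowNotFound(Exception):
    """Local stand-in for ovsdbapp.backend.ovs_idl.idlutils.RowNotFound."""


def pg_name(sg_id):
    # Assumed naming scheme, inferred from the port group name in the traceback.
    return "pg_" + sg_id.replace("-", "_")


class ToySync:
    """Hypothetical, simplified model of the sync ordering (not Neutron code)."""

    def __init__(self, security_groups, networks_missing_in_nb):
        self.neutron_sgs = set(security_groups)   # security groups in Neutron DB
        self.nb_port_groups = set()                # port groups in OVN NB DB
        # Maps network name -> owning project for networks not yet in NB DB.
        self.networks_missing_in_nb = dict(networks_missing_in_nb)

    def sync_port_groups(self):
        # Step 3: copy the security groups that exist *right now* into NB DB.
        self.nb_port_groups = {pg_name(sg) for sg in self.neutron_sgs}

    def sync_networks_ports(self):
        # Step 4: creating the metadata port implicitly creates the project's
        # default security group, but only in the Neutron DB.
        for project in self.networks_missing_in_nb.values():
            self.neutron_sgs.add(f"default-{project}")

    def sync_acls(self):
        # Step 5: every security group's rules need a matching port group.
        for sg in sorted(self.neutron_sgs):
            name = pg_name(sg)
            if name not in self.nb_port_groups:
                raise RowNotFound(f"Cannot find Port_Group with name={name}")


def demo():
    """Run the three steps in the problematic order and return the error text."""
    sync = ToySync(set(), {"test_network": "test_project"})
    sync.sync_port_groups()
    sync.sync_networks_ports()
    try:
        sync.sync_acls()
    except RowNotFound as exc:
        return str(exc)
    return None
```

Calling `demo()` returns the error message, because the default security group only comes into existence after the port groups were already synced.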

Miro Tomaska (mtomaska)
Changed in neutron:
assignee: nobody → Miro Tomaska (mtomaska)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/875989

Changed in neutron:
status: New → In Progress
Lajos Katona (lajos-katona) wrote :
tags: added: ovn
Changed in neutron:
importance: Undecided → Medium
Miro Tomaska (mtomaska)
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 20.4.0

This issue was fixed in the openstack/neutron 20.4.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/890118
Committed: https://opendev.org/openstack/neutron/commit/da9401aa7455a88f65953efedb9e0ce797674445
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit da9401aa7455a88f65953efedb9e0ce797674445
Author: Miro Tomaska <email address hidden>
Date: Wed Mar 1 16:32:50 2023 -0600

    Fix ACL sync when default sg group is created

    Port group not being available in NB DB during ACL sync
    is bit of a corner case but possible during the ML2/OVS
    to ML2/OVN migration sync. It can also happen in ML2/OVN
    only enviroment. See my detailed description of both
    scenarios in the linked Bug.
    The easiest fix is to just retry ALL port groups sync
    one more time if ACL sync cant find a port group row. This
    additional resync is really quick.

    Closes-Bug: #2008943
    Change-Id: Iac1472f7f896ea434deacb6d236ab469f4f6ed56
    (cherry picked from commit 33cf2cdc83a8cee9ee075eb371f779c3d356cf48)
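
The retry approach described in the commit message above can be sketched roughly as follows. This is a minimal illustration and not the merged patch: `sync_acls_with_retry` and its two callable parameters are hypothetical names, and `RowNotFound` is a local stand-in for the ovsdbapp exception.

```python
class RowNotFound(Exception):
    """Local stand-in for ovsdbapp.backend.ovs_idl.idlutils.RowNotFound."""


def sync_acls_with_retry(sync_port_groups, sync_acls):
    """Run the ACL sync; on a missing port group, resync port groups once.

    Both arguments are zero-argument callables (hypothetical interface).
    A second RowNotFound is not caught and propagates to the caller.
    """
    try:
        sync_acls()
    except RowNotFound:
        # A security group was created after the first port group sync,
        # so sync ALL port groups again and retry the ACL sync one time.
        sync_port_groups()
        sync_acls()
```

Retrying only once keeps the behaviour bounded: if the port group is still missing after the resync, the error surfaces instead of looping.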

tags: added: in-stable-victoria
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/888651
Committed: https://opendev.org/openstack/neutron/commit/a6f5160b6c20144db5293b615f3c9e35c8fc59c7
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit a6f5160b6c20144db5293b615f3c9e35c8fc59c7
Author: Miro Tomaska <email address hidden>
Date: Wed Mar 1 16:32:50 2023 -0600

    Fix ACL sync when default sg group is created

    Port group not being available in NB DB during ACL sync
    is bit of a corner case but possible during the ML2/OVS
    to ML2/OVN migration sync. It can also happen in ML2/OVN
    only enviroment. See my detailed description of both
    scenarios in the linked Bug.
    The easiest fix is to just retry ALL port groups sync
    one more time if ACL sync cant find a port group row. This
    additional resync is really quick.

    Conflicts:
      neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_db_sync.py

    Closes-Bug: #2008943
    Change-Id: Iac1472f7f896ea434deacb6d236ab469f4f6ed56
    (cherry picked from commit 33cf2cdc83a8cee9ee075eb371f779c3d356cf48)

tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 23.0.0.0b3

This issue was fixed in the openstack/neutron 23.0.0.0b3 development milestone.
