Load balancer creation often failing due to logical switch not found

Bug #1963921 reported by Fernando Royo
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
neutron | Fix Released | Undecided | Fernando Royo |

Bug Description

It appears that a load-balancer creation can be triggered while multiple subnets are being deleted. The creation then fails with a RowNotFound exception because a Logical_Switch can no longer be found, and the load balancer is moved to the ERROR state.

2022-02-10 14:48:49.115 16 ERROR ovsdbapp.backend.ovs_idl.transaction [-] Traceback (most recent call last):
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver [-] Exception occurred during creation of loadbalancer: ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Logical_Switch with name=neutron-6fa06cae-0145-4571-9919-0541f0bea93a
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver Traceback (most recent call last):
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 111, in transaction
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver yield self._nested_txns_map[cur_thread_id]
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver KeyError: 139860903978752
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver During handling of the above exception, another exception occurred:
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver Traceback (most recent call last):
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver File "/usr/lib/python3.6/site-packages/networking_ovn/octavia/ovn_driver.py", line 1033, in lb_create
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver self._execute_commands(commands)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver File "/usr/lib/python3.6/site-packages/networking_ovn/octavia/ovn_driver.py", line 626, in _execute_commands
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver txn.add(command)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver next(self.gen)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver File "/usr/lib/python3.6/site-packages/networking_ovn/ovsdb/impl_idl_ovn.py", line 252, in transaction
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver yield t
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver next(self.gen)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 119, in transaction
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver del self._nested_txns_map[cur_thread_id]
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 69, in __exit__
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver self.result = self.commit()
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 62, in commit
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver raise result.ex
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 128, in run
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver txn.results.put(txn.do_commit())
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 86, in do_commit
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver command.run_idl(txn)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver File "/usr/lib/python3.6/site-packages/ovsdbapp/schema/ovn_northbound/commands.py", line 1159, in run_idl
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver ls = self.api.lookup('Logical_Switch', self.switch)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 172, in lookup
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver return self._lookup(table, record)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 215, in _lookup
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver row = idlutils.row_by_value(self, rl.table, rl.column, record)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 130, in row_by_value
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver raise RowNotFound(table=table, col=column, match=match)
2022-02-10 14:48:49.115 16 ERROR networking_ovn.octavia.ovn_driver ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Logical_Switch with name=neutron-6fa06cae-0145-4571-9919-0541f0bea93a
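
The traceback ends in a RowNotFound raised by the Logical_Switch lookup, which aborts the whole commit. A minimal sketch of that failure mode follows, using a toy in-memory model rather than ovsdbapp. The names FakeNorthbound, run_atomic, and the local RowNotFound class are hypothetical and exist only for illustration.

```python
class RowNotFound(Exception):
    """Hypothetical local stand-in for ovsdbapp's RowNotFound."""


class FakeNorthbound:
    """Toy in-memory model: switch name -> set of load balancer names."""

    def __init__(self, switches):
        self.switches = {name: set() for name in switches}

    def lookup(self, name):
        try:
            return self.switches[name]
        except KeyError:
            raise RowNotFound(
                f"Cannot find Logical_Switch with name={name}") from None

    def run_atomic(self, commands):
        """Apply all (switch, lb) associations, or none of them."""
        snapshot = {name: set(lbs) for name, lbs in self.switches.items()}
        try:
            for switch, lb in commands:
                self.lookup(switch).add(lb)
        except RowNotFound:
            self.switches = snapshot  # roll back the partial work
            raise


nb = FakeNorthbound(["neutron-a", "neutron-b"])
# The command list is built while both switches still exist.
commands = [("neutron-a", "lb1"), ("neutron-b", "lb1")]
# A concurrent subnet deletion removes one switch before the commit.
del nb.switches["neutron-b"]

try:
    nb.run_atomic(commands)
    outcome = "ACTIVE"
except RowNotFound as exc:
    outcome = "ERROR"
    message = str(exc)
```

In this sketch a single missing switch is enough to discard the association with the switch that still exists, which mirrors why the whole load-balancer creation ends in ERROR.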

Changed in neutron:
assignee: nobody → Fernando Royo (froyoredhat)
Changed in neutron:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (master)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/829126
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/1a62902b5601c764da659f13242131828b4078ca
Submitter: "Zuul (22348)"
Branch: master

commit 1a62902b5601c764da659f13242131828b4078ca
Author: Fernando Royo <email address hidden>
Date: Mon Feb 14 19:56:20 2022 +0100

    Retry logical switch associations to load balancers

    When a load balancer is created, or during other operations that
    associate the load balancer with a logical router, all logical
    switches associated with that logical router in the target network
    are also associated with the load balancer. When the topology includes
    multiple subnets, the load-balancer operation may coincide with the
    removal of some of those subnets.

    The creation of the load balancer is an atomic transaction. This
    patch splits that transaction and retries it when the above error
    occurs. If the retries are exhausted and the error persists, the
    commands are evaluated one by one: if the failing command is an
    LsLbAdd or LsLbDel for a logical switch (Ls) associated with the
    logical router (Lr), the error can be ignored and processing
    continues.

    Closes-Bug: #1963921

    Change-Id: I2c7e0c677f0687990bbd71b3f8d511051dbe0359
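
The strategy described in the commit message can be sketched roughly as below. This is not the code from the patch: execute_with_retry, run_batch, run_one, skippable, and the local RowNotFound class are hypothetical names, and the attempt count and delay are assumed values.

```python
import time


class RowNotFound(Exception):
    """Hypothetical local stand-in for ovsdbapp's RowNotFound."""


def execute_with_retry(run_batch, run_one, commands, skippable,
                       attempts=3, delay=0.0):
    """Retry the atomic batch, then fall back to one command at a time.

    run_batch(commands) applies every command in a single transaction.
    run_one(command) applies one command in its own transaction.
    skippable(command) tells whether a RowNotFound from it may be ignored.
    Returns the list of commands that were skipped.
    """
    for _ in range(attempts):
        try:
            run_batch(commands)
            return []
        except RowNotFound:
            time.sleep(delay)  # give the concurrent deletion time to settle

    skipped = []
    for command in commands:
        try:
            run_one(command)
        except RowNotFound:
            if not skippable(command):
                raise  # any other command still fails the operation
            skipped.append(command)
    return skipped


# Toy backend: only "neutron-a" still exists.
existing = {"neutron-a"}
applied = []


def run_one(command):
    _kind, switch = command
    if switch not in existing:
        raise RowNotFound(switch)
    applied.append(command)


def run_batch(commands):
    missing = [c for c in commands if c[1] not in existing]
    if missing:
        raise RowNotFound(missing[0][1])
    applied.extend(commands)


commands = [("LsLbAdd", "neutron-a"), ("LsLbAdd", "neutron-b")]
skipped = execute_with_retry(
    run_batch, run_one, commands,
    skippable=lambda c: c[0] in ("LsLbAdd", "LsLbDel"))
```

Here the association with the surviving switch is applied and the one targeting the deleted switch is skipped, instead of the whole operation failing.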

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/yoga)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/833882

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/833884

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/xena)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovn-octavia-provider (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/833995

OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/833872
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/bafe17a64d8ef0aff97c6035c9d7f5a6e64bc707
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit bafe17a64d8ef0aff97c6035c9d7f5a6e64bc707
Author: Fernando Royo <email address hidden>
Date: Mon Feb 14 19:56:20 2022 +0100

    Retry logical switch associations to load balancers

    Closes-Bug: #1963921

    Change-Id: I2c7e0c677f0687990bbd71b3f8d511051dbe0359
    (cherry picked from commit 1a62902b5601c764da659f13242131828b4078ca)

tags: added: in-stable-yoga
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/833882
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/6c1b8a9ee8411a596a52295c759a240542d9649e
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 6c1b8a9ee8411a596a52295c759a240542d9649e
Author: Fernando Royo <email address hidden>
Date: Mon Feb 14 19:56:20 2022 +0100

    Retry logical switch associations to load balancers

    Depends-On: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/834661
    Closes-Bug: #1963921

    Change-Id: I2c7e0c677f0687990bbd71b3f8d511051dbe0359
    (cherry picked from commit 1a62902b5601c764da659f13242131828b4078ca)

tags: added: in-stable-wallaby
tags: added: in-stable-victoria
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/833884
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/6d62bb4ebe728ecbdb67887a662374bd4b13a46f
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 6d62bb4ebe728ecbdb67887a662374bd4b13a46f
Author: Fernando Royo <email address hidden>
Date: Mon Feb 14 19:56:20 2022 +0100

    Retry logical switch associations to load balancers

    Closes-Bug: #1963921

    Change-Id: I2c7e0c677f0687990bbd71b3f8d511051dbe0359
    (cherry picked from commit 1a62902b5601c764da659f13242131828b4078ca)

tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/833995
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/301d19920122baf058719b3764a0cfc25bb453cf
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit 301d19920122baf058719b3764a0cfc25bb453cf
Author: Fernando Royo <email address hidden>
Date: Wed Mar 16 13:14:28 2022 +0100

    Retry logical switch associations to load balancers

    Closes-Bug: #1963921
    (manually cherry picked from commit
    1a62902b5601c764da659f13242131828b4078ca)

    Change-Id: I2c7e0c677f0687990bbd71b3f8d511051dbe0359

OpenStack Infra (hudson-openstack) wrote : Fix merged to ovn-octavia-provider (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/833885
Committed: https://opendev.org/openstack/ovn-octavia-provider/commit/fb121e59ae33401913b69df476a85c010deb0040
Submitter: "Zuul (22348)"
Branch: stable/xena

commit fb121e59ae33401913b69df476a85c010deb0040
Author: Fernando Royo <email address hidden>
Date: Mon Feb 14 19:56:20 2022 +0100

    Retry logical switch associations to load balancers

    Depends-On: https://review.opendev.org/c/openstack/ovn-octavia-provider/+/834454
    Closes-Bug: #1963921

    Change-Id: I2c7e0c677f0687990bbd71b3f8d511051dbe0359
    (cherry picked from commit 1a62902b5601c764da659f13242131828b4078ca)

tags: added: in-stable-xena
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 1.0.1

This issue was fixed in the openstack/ovn-octavia-provider 1.0.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 3.0.0.0rc1

This issue was fixed in the openstack/ovn-octavia-provider 3.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-ovn train-eol

This issue was fixed in the openstack/networking-ovn train-eol release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 1.2.0

This issue was fixed in the openstack/ovn-octavia-provider 1.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider 2.1.0

This issue was fixed in the openstack/ovn-octavia-provider 2.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider ussuri-eol

This issue was fixed in the openstack/ovn-octavia-provider ussuri-eol release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovn-octavia-provider victoria-eom

This issue was fixed in the openstack/ovn-octavia-provider victoria-eom release.
