Possible race in _wait_for_metadata_provisioned_if_needed()

Bug #1728088 reported by Lucas Alvares Gomes
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
networking-ovn
New
Undecided
Lucas Alvares Gomes

Bug Description

Apparently there's a race in the _wait_for_metadata_provisioned_if_needed() method (networking_ovn/ml2/mech_driver.py module).

Sometimes tempest runs fails with:

Traceback (most recent call last):
   File "/usr/local/lib/python2.7/dist-packages/ovsdbapp/event.py", line 117, in notify_loop
     match.run(event, row, updates)
   File "/opt/stack/new/networking-ovn/networking_ovn/ovsdb/ovsdb_monitor.py", line 113, in run
     self.driver.set_port_status_up(row.name)
   File "/opt/stack/new/networking-ovn/networking_ovn/ml2/mech_driver.py", line 602, in set_port_status_up
     self._wait_for_metadata_provisioned_if_needed(port_id)
   File "/opt/stack/new/networking-ovn/networking_ovn/ml2/mech_driver.py", line 671, in _wait_for_metadata_provisioned_if_needed
     self._sb_ovn.get_logical_port_chassis_and_datapath(port_id))
 'NoneType' object is not iterable

It seems that by the time the _wait_for_metadata_provisioned_if_needed() is called the logical port doesn't exist yet (or maybe the IDLs cache is not up to date?).

Full error logs: http://paste.openstack.org/show/624840/

Here's one run example: http://logs.openstack.org/20/511720/5/gate/networking-ovn-tempest-dsvm-ovs-release/31fc010/logs/screen-q-svc.txt.gz?level=ERROR

description: updated
Changed in networking-ovn:
assignee: nobody → Lucas Alvares Gomes (lucasagomes)
Revision history for this message
Daniel Alvarez (dalvarezs) wrote :
Download full text (4.6 KiB)

The issue is not that it doesn't exist yet but that it was deleted a few seconds before that error. Actually, in those logs
For some reason, after the port was deleted in both Neutron and OVN, the mech driver receives various events from ovsdbapp that the port has transitioned to down, then to up and then back to down:

-> to Down:

Oct 27 14:11:45.194651 ubuntu-xenial-rax-ord-0000524329 neutron-server[5590]: INFO networking_ovn.ml2.mech_driver [None req-9675e1f5-e4ee-4929-8a9d-7b06b20bccdf None None] OVN reports status down for port: cc17cd96-a9b5-4ccf-af2f-aec9601952b4
Oct 27 14:11:45.236309 ubuntu-xenial-rax-ord-0000524329 neutron-server[5590]: DEBUG networking_ovn.ml2.mech_driver [None req-9675e1f5-e4ee-4929-8a9d-7b06b20bccdf None None] Port not found during OVN status down report: cc17cd96-a9b5-4ccf-af2f-aec9601952b4 {{(pid=5634) set_port_status_down /opt/stack/new/networking-ovn/networking_ovn/ml2/mech_driver.py:631}}

-> to Up:

Oct 27 14:11:45.236699 ubuntu-xenial-rax-ord-0000524329 neutron-server[5590]: INFO networking_ovn.ml2.mech_driver [None req-9675e1f5-e4ee-4929-8a9d-7b06b20bccdf None None] OVN reports status up for port: cc17cd96-a9b5-4ccf-af2f-aec9601952b4
Oct 27 14:11:45.238874 ubuntu-xenial-rax-ord-0000524329 neutron-server[5590]: DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn command(idx=0): DbListCommand(if_exists=False, records=None, table=Port_Binding, columns=None, row=True) {{(pid=5634) do_commit /usr/local/lib/python2.7/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:84}}

Oct 27 14:11:45.239120 ubuntu-xenial-rax-ord-0000524329 neutron-server[5590]: DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Transaction caused no change {{(pid=5634) do_commit /usr/local/lib/python2.7/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:110}}

Oct 27 14:11:45.244375 ubuntu-xenial-rax-ord-0000524329 neutron-server[5590]: ERROR ovsdbapp.event [None req-9675e1f5-e4ee-4929-8a9d-7b06b20bccdf None None] Unexpected exception in notify_loop: TypeError: 'NoneType' object is not iterable
Oct 27 14:11:45.244568 ubuntu-xenial-rax-ord-0000524329 neutron-server[5590]: ERROR ovsdbapp.event Traceback (most recent call last):
Oct 27 14:11:45.244742 ubuntu-xenial-rax-ord-0000524329 neutron-server[5590]: ERROR ovsdbapp.event File "/usr/local/lib/python2.7/dist-packages/ovsdbapp/event.py", line 117, in notify_loop
Oct 27 14:11:45.244918 ubuntu-xenial-rax-ord-0000524329 neutron-server[5590]: ERROR ovsdbapp.event match.run(event, row, updates)
Oct 27 14:11:45.245081 ubuntu-xenial-rax-ord-0000524329 neutron-server[5590]: ERROR ovsdbapp.event File "/opt/stack/new/networking-ovn/networking_ovn/ovsdb/ovsdb_monitor.py", line 113, in run
Oct 27 14:11:45.245240 ubuntu-xenial-rax-ord-0000524329 neutron-server[5590]: ERROR ovsdbapp.event self.driver.set_port_status_up(row.name)
Oct 27 14:11:45.245406 ubuntu-xenial-rax-ord-0000524329 neutron-server[5590]: ERROR ovsdbapp.event File "/opt/stack/new/networking-ovn/networking_ovn/ml2/mech_driver.py", line 602, in set_port_status_up
Oct 27 14:11:45.245568 ubuntu-xenial-rax-ord-0000524329 neutron-server[5590]: ERROR ovsdbapp.event self._wait_for_metadata_provisioned_if_needed(port_i...

Read more...

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to networking-ovn (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/516234

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to networking-ovn (master)

Reviewed: https://review.openstack.org/516234
Committed: https://git.openstack.org/cgit/openstack/networking-ovn/commit/?id=0dc42f3002f18c2572fc6949ac0b58b262128e15
Submitter: Zuul
Branch: master

commit 0dc42f3002f18c2572fc6949ac0b58b262128e15
Author: Daniel Alvarez <email address hidden>
Date: Mon Oct 30 11:07:52 2017 +0100

    Log a warning when trying to wait for metadata on a non existent port

    As noticed in bug #1728088, we're receiving an 'up' event from a port
    that does not exist. Even though we may want to prevent this from
    happening, the current result is that we're throwing a confusing
    exception because the chassis and datapath for this logical port can't
    be retrieved.

    This patch detects the above condition and logs a warning instead.

    Related-Bug: #1728088

    Change-Id: If15f457db6a99ba232e23349048fcbe2b1a4f36d
    Signed-off-by: Daniel Alvarez <email address hidden>

tags: added: networking-ovn-proactive-backport-potential
Revision history for this message
Daniel Alvarez (dalvarezs) wrote :

This is still being seen: we receive notifications about port status up/down after the port was deleted. I think we cannot guarantee the order in which we can receive notifications so logging this warning is enough.

Revision history for this message
Daniel Alvarez (dalvarezs) wrote :

The port is deleted:

Feb 20 16:57:09.110919 ubuntu-xenial-rax-ord-0002635913 neutron-server[26021]: DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn command(idx=0): DelLSwitchPortCommand(lport=676c8066-efa8-4bd3-95ec-596cbd58890b, if_exists=True, lswitch=neutron-bc5dc1dd-debe-41fd-bb8f-78296b0c6acc) {{(pid=26119) do_commit /usr/local/lib/python2.7/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:84}}

And later we receive a notification that it transitioned to UP:

Feb 20 16:57:25.002984 ubuntu-xenial-rax-ord-0002635913 neutron-server[26021]: INFO networking_ovn.ml2.mech_driver [None req-446db625-0618-40cc-ba94-a9db03249a29 None None] OVN reports status up for port: 676c8066-efa8-4bd3-95ec-596cbd58890b
Feb 20 16:57:25.004026 ubuntu-xenial-rax-ord-0002635913 neutron-server[26021]: DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn command(idx=0): DbListCommand(if_exists=False, records=None, table=Port_Binding, columns=None, row=True) {{(pid=26121) do_commit /usr/local/lib/python2.7/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:84}}
Feb 20 16:57:25.004745 ubuntu-xenial-rax-ord-0002635913 neutron-server[26021]: DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Transaction caused no change {{(pid=26121) do_commit /usr/local/lib/python2.7/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:114}}
Feb 20 16:57:25.005888 ubuntu-xenial-rax-ord-0002635913 neutron-server[26021]: WARNING networking_ovn.ml2.mech_driver [None req-446db625-0618-40cc-ba94-a9db03249a29 None None] Logical port 676c8066-efa8-4bd3-95ec-596cbd58890b doesn't exist in OVN

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.