neutron-ovn-metadata-agent AttributeError: 'MetadataProxyHandler' object has no attribute 'sb_idl'

Bug #1928031 reported by Hemanth Nakkina
This bug affects 4 people
Affects                Status         Importance   Assigned to       Milestone
Ubuntu Cloud Archive   New            Undecided    Unassigned
  Ussuri               New            Undecided    Unassigned
  Wallaby              New            Undecided    Unassigned
charm-ovn-chassis      Invalid        Undecided    Unassigned
neutron                Fix Released   Undecided    Hemanth Nakkina
neutron (Ubuntu)       Fix Released   High         Unassigned
  Focal                New            Undecided    Unassigned
  Hirsute              Won't Fix      Undecided    Unassigned
openvswitch (Ubuntu)   Fix Released   Undecided    Unassigned
  Focal                New            Undecided    Unassigned
  Hirsute              Won't Fix      Undecided    Unassigned

Bug Description

neutron-ovn-metadata-agent is not able to handle any metadata requests from the instances.

Scenario:
* Initially there were some intermittent connectivity issues, described in LP #1907686:
  https://bugs.launchpad.net/charm-ovn-chassis/+bug/1907686/comments/9

* The fix for the above is available in the python3-openvswitch package in the ussuri-proposed pocket.
  The fix was installed on all neutron-server and compute nodes, and neutron-ovn-metadata-agent was restarted on them one by one.

* After the restart, neutron-ovn-metadata-agent on one of the compute nodes was not able to handle any metadata requests. (Please note that the problem happened with only one ovn-metadata agent; the agents on the other compute nodes are fine, so this looks like a race condition in the IDL.)

The stacktrace shows that both workers (69188 and 69189) timed out on the OVN SB IDL connection, so sb_idl was never initialized.

Stacktrace of Attribute error:
------------------------------
2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server Traceback (most recent call last):
2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server File "/usr/lib/python3/dist-packages/neutron/agent/ovn/metadata/server.py", line 67, in __call__
2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server instance_id, project_id = self._get_instance_and_project_id(req)
2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server File "/usr/lib/python3/dist-packages/neutron/agent/ovn/metadata/server.py", line 84, in _get_instance_and_project_id
2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server ports = self.sb_idl.get_network_port_bindings_by_ip(network_id,
2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server AttributeError: 'MetadataProxyHandler' object has no attribute 'sb_idl'
2021-04-27 08:51:01.340 69188 ERROR neutron.agent.ovn.metadata.server

Stacktrace at the restart of neutron-ovn-metadata-agent process:
----------------------------------------------------------------
2021-04-15 22:27:03.803 69124 INFO neutron.common.config [-] /usr/bin/neutron-ovn-metadata-agent version 16.2.0
2021-04-15 22:27:03.832 69124 INFO ovsdbapp.backend.ovs_idl.vlog [-] tcp:127.0.0.1:6640: connecting...
2021-04-15 22:27:03.833 69124 INFO ovsdbapp.backend.ovs_idl.vlog [-] tcp:127.0.0.1:6640: connected
2021-04-15 22:27:03.949 69124 WARNING neutron.agent.ovn.metadata.agent [-] Can't read ovn-bridge external-id from OVSDB. Using br-int instead.
2021-04-15 22:27:03.950 69124 INFO oslo_service.service [-] Starting 2 workers
2021-04-15 22:27:03.985 69188 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connecting...
2021-04-15 22:27:03.986 69189 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connecting...
2021-04-15 22:27:04.005 69124 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connecting...
2021-04-15 22:27:04.006 69188 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connected
2021-04-15 22:27:04.033 69189 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connected
2021-04-15 22:27:04.061 69124 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connected
2021-04-15 22:27:06.129 69124 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/neutron/neutron.conf', '--config-file', '/etc/neutron/neutron_ovn_metadata_agent.ini', '--privsep_context', 'neutron.privileged.default', '--privsep_sock_path', '/tmp/tmpgncr2rq7/privsep.sock']
2021-04-15 22:27:06.757 69124 INFO oslo.privsep.daemon [-] Spawned new privsep daemon via rootwrap
2021-04-15 22:27:06.676 69211 INFO oslo.privsep.daemon [-] privsep daemon starting
2021-04-15 22:27:06.678 69211 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0
2021-04-15 22:27:06.680 69211 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_NET_ADMIN|CAP_SYS_ADMIN|CAP_SYS_PTRACE/CAP_DAC_OVERRIDE|CAP_DAC_READ_SEARCH|CAP_NET_ADMIN|CAP_SYS_ADMIN|CAP_SYS_PTRACE/none
2021-04-15 22:27:06.680 69211 INFO oslo.privsep.daemon [-] privsep daemon running as pid 69211
2021-04-15 22:30:08.542 69188 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn [-] OVS database connection to OVN_Southbound failed with error: 'Timeout'. Verify that the OVS and OVN services are available and that the 'ovn_nb_connection' and 'ovn_sb_connection' configuration options are correct.: Exception: Timeout
2021-04-15 22:30:08.542 69188 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn Traceback (most recent call last):
2021-04-15 22:30:08.542 69188 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/impl_idl_ovn.py", line 67, in start_connection
2021-04-15 22:30:08.542 69188 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn self.ovsdb_connection.start()
2021-04-15 22:30:08.542 69188 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/connection.py", line 79, in start
2021-04-15 22:30:08.542 69188 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn idlutils.wait_for_change(self.idl, self.timeout)
2021-04-15 22:30:08.542 69188 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 173, in wait_for_change
2021-04-15 22:30:08.542 69188 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn raise Exception("Timeout") # TODO(twilson) use TimeoutException?
2021-04-15 22:30:08.542 69188 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn Exception: Timeout
2021-04-15 22:30:08.542 69188 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager [-] Error during notification for neutron.agent.ovn.metadata.server.MetadataProxyHandler.post_fork_initialize-476074 process, after_init: neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn.OvsdbConnectionUnavailable: OVS database connection to OVN_Southbound failed with error: 'Timeout'. Verify that the OVS and OVN services are available and that the 'ovn_nb_connection' and 'ovn_sb_connection' configuration options are correct.
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager Traceback (most recent call last):
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/impl_idl_ovn.py", line 67, in start_connection
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager self.ovsdb_connection.start()
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/connection.py", line 79, in start
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager idlutils.wait_for_change(self.idl, self.timeout)
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 173, in wait_for_change
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager raise Exception("Timeout") # TODO(twilson) use TimeoutException?
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager Exception: Timeout
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager During handling of the above exception, another exception occurred:
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager Traceback (most recent call last):
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/neutron_lib/callbacks/manager.py", line 197, in _notify_loop
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager callback(resource, event, trigger, **kwargs)
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/neutron/agent/ovn/metadata/server.py", line 60, in post_fork_initialize
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager tables=('Port_Binding', 'Datapath_Binding')).start()
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/neutron/agent/ovn/metadata/ovsdb.py", line 57, in start
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager return impl_idl_ovn.OvsdbSbOvnIdl(conn)
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/impl_idl_ovn.py", line 724, in __init__
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager super(OvsdbSbOvnIdl, self).__init__(connection)
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/ovsdbapp/schema/ovn_southbound/impl_idl.py", line 26, in __init__
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager super(OvnSbApiIdlImpl, self).__init__(connection)
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/impl_idl_ovn.py", line 63, in __init__
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager super(Backend, self).__init__(connection)
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 32, in __init__
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager self.start_connection(connection)
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/impl_idl_ovn.py", line 72, in start_connection
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager raise connection_exception
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn.OvsdbConnectionUnavailable: OVS database connection to OVN_Southbound failed with error: 'Timeout'. Verify that the OVS and OVN services are available and that the 'ovn_nb_connection' and 'ovn_sb_connection' configuration options are correct.
2021-04-15 22:30:08.544 69188 ERROR neutron_lib.callbacks.manager
2021-04-15 22:30:08.546 69188 INFO eventlet.wsgi.server [-] (69188) wsgi starting up on http:/var/lib/neutron/metadata_proxy
2021-04-15 22:30:08.550 69189 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn [-] OVS database connection to OVN_Southbound failed with error: 'Timeout'. Verify that the OVS and OVN services are available and that the 'ovn_nb_connection' and 'ovn_sb_connection' configuration options are correct.: Exception: Timeout
2021-04-15 22:30:08.550 69189 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn Traceback (most recent call last):
2021-04-15 22:30:08.550 69189 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/impl_idl_ovn.py", line 67, in start_connection
2021-04-15 22:30:08.550 69189 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn self.ovsdb_connection.start()
2021-04-15 22:30:08.550 69189 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/connection.py", line 79, in start
2021-04-15 22:30:08.550 69189 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn idlutils.wait_for_change(self.idl, self.timeout)
2021-04-15 22:30:08.550 69189 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 173, in wait_for_change
2021-04-15 22:30:08.550 69189 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn raise Exception("Timeout") # TODO(twilson) use TimeoutException?
2021-04-15 22:30:08.550 69189 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn Exception: Timeout
2021-04-15 22:30:08.550 69189 ERROR neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager [-] Error during notification for neutron.agent.ovn.metadata.server.MetadataProxyHandler.post_fork_initialize-476074 process, after_init: neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn.OvsdbConnectionUnavailable: OVS database connection to OVN_Southbound failed with error: 'Timeout'. Verify that the OVS and OVN services are available and that the 'ovn_nb_connection' and 'ovn_sb_connection' configuration options are correct.
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager Traceback (most recent call last):
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/impl_idl_ovn.py", line 67, in start_connection
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager self.ovsdb_connection.start()
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/connection.py", line 79, in start
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager idlutils.wait_for_change(self.idl, self.timeout)
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 173, in wait_for_change
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager raise Exception("Timeout") # TODO(twilson) use TimeoutException?
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager Exception: Timeout
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager During handling of the above exception, another exception occurred:
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager Traceback (most recent call last):
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/neutron_lib/callbacks/manager.py", line 197, in _notify_loop
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager callback(resource, event, trigger, **kwargs)
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/neutron/agent/ovn/metadata/server.py", line 60, in post_fork_initialize
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager tables=('Port_Binding', 'Datapath_Binding')).start()
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/neutron/agent/ovn/metadata/ovsdb.py", line 57, in start
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager return impl_idl_ovn.OvsdbSbOvnIdl(conn)
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/impl_idl_ovn.py", line 724, in __init__
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager super(OvsdbSbOvnIdl, self).__init__(connection)
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/ovsdbapp/schema/ovn_southbound/impl_idl.py", line 26, in __init__
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager super(OvnSbApiIdlImpl, self).__init__(connection)
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/impl_idl_ovn.py", line 63, in __init__
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager super(Backend, self).__init__(connection)
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 32, in __init__
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager self.start_connection(connection)
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager File "/usr/lib/python3/dist-packages/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/impl_idl_ovn.py", line 72, in start_connection
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager raise connection_exception
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.impl_idl_ovn.OvsdbConnectionUnavailable: OVS database connection to OVN_Southbound failed with error: 'Timeout'. Verify that the OVS and OVN services are available and that the 'ovn_nb_connection' and 'ovn_sb_connection' configuration options are correct.
2021-04-15 22:30:08.552 69189 ERROR neutron_lib.callbacks.manager
2021-04-15 22:30:08.555 69189 INFO eventlet.wsgi.server [-] (69189) wsgi starting up on http:/var/lib/neutron/metadata_proxy

* However, netstat shows that the TCP connection to the SB database is established for both workers, so the problem may lie in how the data is handled in the IDL.

$ grep 10.216.241.118 sosreport-*-2021-04-27-edidnbl/sos_commands/networking/netstat_-W_-neopa
tcp 0 0 10.216.241.244:53284 10.216.241.118:6642 ESTABLISHED 0 407222779 69124/neutron-ovn-m off (0.00/0/0)
tcp 1490260 0 10.216.241.244:53282 10.216.241.118:6642 ESTABLISHED 0 407288937 69189/neutron-ovn-m off (0.00/0/0)
tcp 1490260 0 10.216.241.244:53280 10.216.241.118:6642 ESTABLISHED 0 407156544 69188/neutron-ovn-m off (0.00/0/0)
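
In the netstat output above, the second column is Recv-Q. Both worker PIDs show 1490260 bytes queued while the parent shows 0, which is consistent with (but does not prove) the workers having stopped reading from the socket. A minimal sketch for picking out such sockets from saved netstat output, assuming the field layout shown above; the helper name unread_sockets is hypothetical:

```python
# Sample lines copied from the netstat output in this report.
NETSTAT_LINES = """\
tcp 0 0 10.216.241.244:53284 10.216.241.118:6642 ESTABLISHED 0 407222779 69124/neutron-ovn-m off (0.00/0/0)
tcp 1490260 0 10.216.241.244:53282 10.216.241.118:6642 ESTABLISHED 0 407288937 69189/neutron-ovn-m off (0.00/0/0)
tcp 1490260 0 10.216.241.244:53280 10.216.241.118:6642 ESTABLISHED 0 407156544 69188/neutron-ovn-m off (0.00/0/0)
"""


def unread_sockets(text):
    """Return {pid: recv_q} for TCP sockets whose Recv-Q is non-zero.

    Assumes the column order seen above:
    proto, Recv-Q, Send-Q, local, remote, state, user, inode, pid/program.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 9 or not fields[0].startswith("tcp"):
            continue
        recv_q = int(fields[1])
        pid = int(fields[8].split("/")[0])
        if recv_q > 0:
            result[pid] = recv_q
    return result


print(unread_sockets(NETSTAT_LINES))
```

For the three lines above this reports PIDs 69189 and 69188, the two workers whose post_fork_initialize timed out.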

If sb_idl cannot be initialised, it would be better for the worker to exit, or for neutron-ovn-metadata-agent to stop altogether, so that the operator is notified of the problem as early as possible.
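
The failure mode can be illustrated with a toy model. The logs suggest that the exception raised in post_fork_initialize was logged by the callback manager and the worker then went on to start its WSGI server, so the first request hit a handler without an sb_idl attribute. The sketch below uses hypothetical stand-ins (Handler, connect_sb_idl, notify_swallowing, notify_fail_fast); it is not the neutron or neutron_lib code:

```python
import logging

LOG = logging.getLogger(__name__)


class ConnectionUnavailable(Exception):
    """Hypothetical stand-in for OvsdbConnectionUnavailable."""


def connect_sb_idl():
    """Hypothetical connector that always times out, as in the log."""
    raise ConnectionUnavailable("Timeout")


class Handler:
    """Toy stand-in for MetadataProxyHandler."""

    def post_fork_initialize(self):
        # sb_idl is only assigned if the connection succeeds.
        self.sb_idl = connect_sb_idl()

    def handle_request(self):
        # Raises AttributeError if initialization never completed.
        return self.sb_idl


def notify_swallowing(callback):
    """Log the error and carry on, as the log output suggests happened."""
    try:
        callback()
    except Exception:
        LOG.exception("Error during notification")


def notify_fail_fast(callback):
    """Alternative: stop the worker so the failure is visible at once."""
    try:
        callback()
    except Exception as exc:
        raise SystemExit(f"sb_idl initialization failed: {exc}")


handler = Handler()
notify_swallowing(handler.post_fork_initialize)
try:
    handler.handle_request()
except AttributeError as exc:
    print(exc)
```

With notify_fail_fast the process would exit during startup instead of serving requests that can never succeed, which is the behaviour requested above.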

description: updated
Revision history for this message
Hemanth Nakkina (hemanth-n) wrote :

I see that similar bugs have been raised earlier but were marked as duplicates of https://bugs.launchpad.net/charm-ovn-chassis/+bug/1907686

This might not be a duplicate of LP 1907686 since this happens at SB IDL initialisation time and there is no reconnect at this stage.

https://opendev.org/openstack/ovsdbapp/src/branch/master/ovsdbapp/backend/ovs_idl/connection.py#L81

tags: added: sts
Revision history for this message
Hemanth Nakkina (hemanth-n) wrote :

I think this should fix the above problem.
https://review.opendev.org/c/openstack/neutron/+/788596

Another pair of eyes would be helpful to confirm this, as no reproducer is available.

Revision history for this message
Frode Nordahl (fnordahl) wrote :

Thanks for digging that up, does look relevant for sure. The linked review appears to be backported to Ussuri already, and it is in the neutron 16.3.2 point release (released 10 days ago). So we'll get that in Ubuntu on the next point release update to the neutron package.

Changed in neutron (Ubuntu):
status: New → Triaged
importance: Undecided → High
Changed in charm-ovn-chassis:
status: New → Invalid
Revision history for this message
Giuseppe Petralia (peppepetra) wrote :

I am seeing the same issue after we upgraded to 16.3.2: https://pastebin.canonical.com/p/F6NxGkrBcx/

Revision history for this message
Hemanth Nakkina (hemanth-n) wrote :

The errors mentioned in the bug description are from the MetadataAgentProxy process, but the one mentioned in comment #4 is from the parent MetadataAgent process itself.

ovsdb.MetadataAgentOvnSbIdl().start() seems to have never returned, so sb_idl was not initialised [1].
However, from the logs we can see that the MetadataAgent is connected to the OVSDB server, and the fact that the failing function was triggered indicates that MetadataAgentOvnSbIdl is handling OVSDB SB update events.

This needs to be looked into further in the IDL code.

[1] https://opendev.org/openstack/neutron/src/commit/1e8197fee5031ee7ba384eb537b13f381a837685/neutron/agent/ovn/metadata/agent.py#L241-L248

Revision history for this message
Hemanth Nakkina (hemanth-n) wrote :

After looking into the errors in the bug description and the logs in comment #4, both seem to fail while waiting for the sb_idl object.

AttributeError: 'MetadataProxyHandler' object has no attribute 'sb_idl'
AttributeError: 'MetadataAgent' object has no attribute 'sb_idl'

In one case the failure is in the MetadataAgent process, and in the other it is in the forked MetadataAgentProxy process.

There is no Timeout exception the second time because patch [1] is applied, which retries the connection without a timeout. However, in both cases we can see logs like the one below, which say that the SSL connection to the OVSDB server is successful.
INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.216.241.118:6642: connected

It seems to be stuck at wait_for_change [2].

On the neutron-server side, these scenarios are handled by retry logic when getting the IDL objects and by waiting for post-fork events; see [3], [4].
Similar logic is required for neutron-ovn-metadata-agent as well. I will submit a patch for review shortly.

[1] https://review.opendev.org/c/openstack/neutron/+/788596
[2] https://opendev.org/openstack/neutron/src/commit/87f7abb86cad13c8bc04b4e6165600ee6fd9ef7c/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/impl_idl_ovn.py#L53-L56
[3] https://opendev.org/openstack/neutron/src/commit/87f7abb86cad13c8bc04b4e6165600ee6fd9ef7c/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/impl_idl_ovn.py#L222-L226
[4] https://review.opendev.org/c/openstack/neutron/+/781555
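
As a rough illustration of the idea (retry when obtaining the IDL object, and make request handling wait for the post-fork initialization), here is a minimal sketch using only the standard library. All names (retry, Handler, flaky_connect) are hypothetical, and the actual neutron patch may be implemented differently:

```python
import threading
import time


def retry(func, attempts=5, delay=0.01):
    """Call func until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay)


class Handler:
    """Toy handler; not the neutron MetadataProxyHandler."""

    def __init__(self, connect):
        self._connect = connect
        self._post_fork_event = threading.Event()
        self._sb_idl = None

    @property
    def sb_idl(self):
        # Block callers until post-fork initialization has finished.
        if not self._post_fork_event.wait(timeout=5):
            raise RuntimeError("sb_idl not ready")
        return self._sb_idl

    def post_fork_initialize(self):
        # Retry the connection instead of giving up on the first timeout.
        self._sb_idl = retry(self._connect)
        self._post_fork_event.set()


calls = {"n": 0}


def flaky_connect():
    """Hypothetical connector that times out twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise Exception("Timeout")
    return "sb-idl-object"


handler = Handler(flaky_connect)
handler.post_fork_initialize()
print(handler.sb_idl, calls["n"])
```

The property turns a missing sb_idl from an AttributeError into a bounded wait, and the retry wrapper covers transient timeouts during startup.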

Changed in neutron:
assignee: nobody → Hemanth Nakkina (hemanth-n)
Revision history for this message
Bodo Petermann (bpetermann) wrote :

We saw the AttributeErrors "has no attribute 'sb_idl'" too and as pointed out already, wait_for_change doesn't finish.
It calls ovs's idl.run which is supposed to return True if there was a change, but there are cases where it doesn't.
See [1]: in the run() function messages are read in a loop. If there are 2 messages to be read and the first one is a change, but the 2nd one isn't, the final return value will be False, because self.change_seqno is reset to initial_change_seqno (seqno from before the 1st message).

If run() only reads one message per call, it's fine. But if the loop reads multiple messages, it may not be.

[1] https://github.com/openvswitch/ovs/blob/cca40141a8250562156ae8628f5c73de3621303e/python/ovs/db/idl.py#L277
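
The scenario described above can be reproduced with a toy model. This is not the ovs Idl code; ToyIdl and its fixed flag are hypothetical and only mimic the described behaviour, where a non-change message resets change_seqno to the value it had before the run() call:

```python
class ToyIdl:
    """Toy model of the described run() behaviour."""

    def __init__(self, messages, fixed=False):
        # Each entry is True if the message changes the database contents.
        self.messages = list(messages)
        self.change_seqno = 0
        self.fixed = fixed

    def _process(self, is_change, initial_change_seqno):
        if is_change:
            self.change_seqno += 1
        elif not self.fixed:
            # Buggy path: rewind the counter to its value from before this
            # run() call, which hides a change seen earlier in the loop.
            self.change_seqno = initial_change_seqno

    def run(self):
        """Process all pending messages; return True if anything changed."""
        initial_change_seqno = self.change_seqno
        while self.messages:
            self._process(self.messages.pop(0), initial_change_seqno)
        return initial_change_seqno != self.change_seqno


print(ToyIdl([True]).run())                     # one change message
print(ToyIdl([True, False]).run())              # change, then non-change
print(ToyIdl([True, False], fixed=True).run())  # same, without the rewind
```

In this model the second call returns False even though a change was processed, so a caller that loops until run() returns True would keep waiting. The fixed flag simply skips the rewind; it is one assumed way to avoid the problem, not necessarily what the upstream patch does.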

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/796613

Changed in neutron:
status: New → In Progress
Revision history for this message
Hemanth Nakkina (hemanth-n) wrote :

Bodo Petermann, thanks for pointing out the scenario.

Could you please share the link if you have raised an issue/bug on ovs?

Revision history for this message
Bodo Petermann (bpetermann) wrote :

I sent in a patch "[PATCH] Python: Fix Idl.run change_seqno update"
see https://mail.openvswitch.org/pipermail/ovs-dev/2021-June/384014.html

Revision history for this message
Bodo Petermann (bpetermann) wrote :

This is the link to the patch in Open vSwitch's Patchwork:
https://patchwork.ozlabs.<email address hidden>/

tags: added: ovn
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/796613
Committed: https://opendev.org/openstack/neutron/commit/95d80a2757e23b6e8cdb44576f3913eedb400810
Submitter: "Zuul (22348)"
Branch: master

commit 95d80a2757e23b6e8cdb44576f3913eedb400810
Author: Hemanth Nakkina <email address hidden>
Date: Wed Jun 16 15:00:44 2021 +0530

    [OVN] neutron-ovn-metadat-agent add retry logic for sb_idl

    Add retry logic to retrieve OvsdbSb IDL object.
    Add port fork event in MetadataProxyHandler to wait for the
    OvsdbSb IDL object.

    Closes-Bug: #1928031
    Change-Id: Idce1ec4e160c5a7f8532b57f577b9518a06b0dd0

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/neutron/+/799356

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/neutron/+/799357

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/neutron/+/799406

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/799356
Committed: https://opendev.org/openstack/neutron/commit/2b5c141ec9c5b4edeb4bdb0b988edaa0d8635d0b
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 2b5c141ec9c5b4edeb4bdb0b988edaa0d8635d0b
Author: Hemanth Nakkina <email address hidden>
Date: Wed Jun 16 15:00:44 2021 +0530

    [OVN] neutron-ovn-metadat-agent add retry logic for sb_idl

    Add retry logic to retrieve OvsdbSb IDL object.
    Add port fork event in MetadataProxyHandler to wait for the
    OvsdbSb IDL object.

    Closes-Bug: #1928031
    Change-Id: Idce1ec4e160c5a7f8532b57f577b9518a06b0dd0
    (cherry picked from commit 95d80a2757e23b6e8cdb44576f3913eedb400810)

tags: added: in-stable-wallaby
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/799357
Committed: https://opendev.org/openstack/neutron/commit/033b8e3769c906ee6d3b24a3030017d75781ee7c
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 033b8e3769c906ee6d3b24a3030017d75781ee7c
Author: Hemanth Nakkina <email address hidden>
Date: Wed Jun 16 15:00:44 2021 +0530

    [OVN] neutron-ovn-metadat-agent add retry logic for sb_idl

    Add retry logic to retrieve OvsdbSb IDL object.
    Add port fork event in MetadataProxyHandler to wait for the
    OvsdbSb IDL object.

    Closes-Bug: #1928031
    Change-Id: Idce1ec4e160c5a7f8532b57f577b9518a06b0dd0
    (cherry picked from commit 95d80a2757e23b6e8cdb44576f3913eedb400810)

tags: added: in-stable-victoria
Revision history for this message
Hemanth Nakkina (hemanth-n) wrote :

Just for completeness, the patch on the openvswitch side has been merged back to branch 2.13. Thanks to Bodo Petermann for the ovs patch.

https://patchwork.ozlabs.<email address hidden>/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/799406
Committed: https://opendev.org/openstack/neutron/commit/f06e76d532ded21eb152be6fae805b5688feddfd
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit f06e76d532ded21eb152be6fae805b5688feddfd
Author: Hemanth Nakkina <email address hidden>
Date: Wed Jun 16 15:00:44 2021 +0530

    [OVN] neutron-ovn-metadat-agent add retry logic for sb_idl

    Add retry logic to retrieve OvsdbSb IDL object.
    Add port fork event in MetadataProxyHandler to wait for the
    OvsdbSb IDL object.

    Closes-Bug: #1928031
    Change-Id: Idce1ec4e160c5a7f8532b57f577b9518a06b0dd0
    (cherry picked from commit 95d80a2757e23b6e8cdb44576f3913eedb400810)

tags: added: in-stable-ussuri
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 16.4.0

This issue was fixed in the openstack/neutron 16.4.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 17.2.0

This issue was fixed in the openstack/neutron 17.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 18.1.0

This issue was fixed in the openstack/neutron 18.1.0 release.

tags: added: neutron-proactive-backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 19.0.0.0rc1

This issue was fixed in the openstack/neutron 19.0.0.0rc1 release candidate.

Changed in neutron (Ubuntu):
status: Triaged → Fix Released
Revision history for this message
shanyunfan (shanyunfan) wrote :

How can I access #4?

Revision history for this message
Erlon R. Cruz (sombrafam) wrote :

@Bodo, regarding your fix in openvswitch, I believe I'm seeing this happen in a customer environment; the error is identical to the logs linked in #25.
To test your fix, were you able to reproduce the problem?

Revision history for this message
shanyunfan33 (shanyunfan33) wrote :

@Erlon R. Cruz, you can go to https://review.opendev.org/c/openstack/neutron/+/821927 to see the fix for "'MetadataAgent' object has no attribute 'sb_idl'".

Revision history for this message
Hemanth Nakkina (hemanth-n) wrote :

Added openvswitch (Ubuntu) to the affected projects.

The fix on the openvswitch side mentioned in #11 is available upstream in 2.13.5 and 2.15.2.

Ubuntu Focal/UCA Ussuri is on 2.13.3-0ubuntu0.20.04.2 and UCA Victoria is on 2.15.0-0ubuntu3.1.
An SRU is required for Focal and UCA Ussuri/Victoria.

Revision history for this message
James Page (james-page) wrote :

ussuri/victoria -> hirsute + ussuri/wallaby

ussuri victoria did not ship with a new OVS version.

James Page (james-page)
Changed in openvswitch (Ubuntu):
status: New → Fix Released
Revision history for this message
James Page (james-page) wrote :

focal: bug 1956754
hirsute: bug 1956752

Changed in neutron (Ubuntu Hirsute):
status: New → Won't Fix
Changed in openvswitch (Ubuntu Hirsute):
status: New → Won't Fix