Race condition during init may lead to neutron server malfunction

Bug #1824299 reported by Oleg Bondarev on 2019-04-11
16
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
High
Oleg Bondarev

Bug Description

release: Queens

quite a lot of advanced services enabled:
"service_plugins = neutron.services.l3_router.l3_router_plugin.L3RouterPlugin,metering,lbaasv2,neutron.services.qos.qos_plugin.QoSPlugin,trunk,networking_l2gw.services.l2gateway.plugin.L2GatewayPlugin,bgpvpn"

Neutron server fails to start with repeating sqlalchemy errors "Class 'networking_bgpvpn.neutron.db.bgpvpn_db.BGPVPNPortAssociationRoute' is not mapped" or "Class 'networking_bgpvpn.neutron.db.bgpvpn_db.BGPVPN' is not mapped".

The errors happen on handling state reports from agents. So if stop all neutron agents, start server, wait server initialization, and only then start agents - everything is ok.

Also it appears that if place 'bgpvpn' in "service_plugins" config closer to beginning:
"service_plugins = neutron.services.l3_router.l3_router_plugin.L3RouterPlugin,bgpvpn,metering,lbaasv2,neutron.services.qos.qos_plugin.QoSPlugin,trunk,networking_l2gw.services.l2gateway.plugin.L2GatewayPlugin"
 - no errors happen, even if not stop neutron agents during server restart.

Full log near error trace:

2019-04-09 13:42:27,194.194 15604 INFO sqlalchemy.orm.mapper.Mapper [req-0392737c-e6c8-472f-8685-91190f882862 - - - - -] (BGPVPNPortAssociation|bgpvpn_port_associations) initialize prop routes
2019-04-09 13:42:27,197.197 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPNPortAssociationRoute|bgpvpn_port_association_routes) _configure_property(port_association, RelationshipProperty)
2019-04-09 13:42:27,197.197 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPNPortAssociationRoute|bgpvpn_port_association_routes) _configure_property(bgpvpn, RelationshipProperty)
2019-04-09 13:42:27,198.198 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPNPortAssociationRoute|bgpvpn_port_association_routes) _configure_property(id, Column)
2019-04-09 13:42:27,199.199 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPNPortAssociationRoute|bgpvpn_port_association_routes) _configure_property(port_association_id, Column)
2019-04-09 13:42:27,200.200 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPNPortAssociationRoute|bgpvpn_port_association_routes) _configure_property(type, Column)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server [req-0392737c-e6c8-472f-8685-91190f882862 - - - - -] Exception during message handling: UnmappedClassError: Class 'networking_bgpvpn.neutron.db.bgpvpn_db.BGPVPNPortAssociationRoute' is not mapped
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 160, in _process_incoming
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 213, in dispatch
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _do_dispatch
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 161, in wrapped
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server return method(*args, **kwargs)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 91, in wrapped
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server setattr(e, '_RETRY_EXCEEDED', True)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server self.force_reraise()
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 87, in wrapped
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 147, in wrapper
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server ectxt.value = e.inner_exc
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server self.force_reraise()
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 135, in wrapper
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 126, in wrapped
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server LOG.debug("Retry wrapper got retriable exception: %s", e)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server self.force_reraise()
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 122, in wrapped
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server return f(*dup_args, **dup_kwargs)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/db/agents_db.py", line 439, in report_state
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server context, agent_state)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 161, in wrapped
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server return method(*args, **kwargs)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 91, in wrapped
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server setattr(e, '_RETRY_EXCEEDED', True)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server self.force_reraise()
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 87, in wrapped
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 147, in wrapper
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server ectxt.value = e.inner_exc
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server self.force_reraise()
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 135, in wrapper
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 126, in wrapped
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server LOG.debug("Retry wrapper got retriable exception: %s", e)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server self.force_reraise()
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 122, in wrapped
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server return f(*dup_args, **dup_kwargs)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/db/agents_db.py", line 346, in create_or_update_agent
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server agent.update()
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/objects/base.py", line 314, in decorator
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server res = func(self, *args, **kwargs)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/objects/base.py", line 808, in update
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server self._get_composite_keys()))
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/objects/db/api.py", line 80, in update_object
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server db_obj = _safe_get_object(obj_cls, context, **kwargs)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/objects/db/api.py", line 68, in _safe_get_object
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server db_obj = get_object(obj_cls, context, **kwargs)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/objects/db/api.py", line 33, in get_object
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server return _get_filter_query(obj_cls, context, **kwargs).first()
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/objects/db/api.py", line 28, in _get_filter_query
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server context, obj_cls.db_model, filters)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/db/_model_query.py", line 256, in get_collection_query
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server collection = query_with_hooks(context, model)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/neutron/db/_model_query.py", line 137, in query_with_hooks
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server query = context.session.query(model)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1260, in query
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server return self._query_cls(entities, self, **kwargs)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 110, in __init__
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server self._set_entities(entities)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 120, in _set_entities
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server self._set_entity_selectables(self._entities)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 150, in _set_entity_selectables
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server ent.setup_entity(*d[entity])
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 3446, in setup_entity
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server self._with_polymorphic = ext_info.with_polymorphic_mappers
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/sqlalchemy/util/langhelpers.py", line 754, in __get__
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server obj.__dict__[self.__name__] = result = self.fget(obj)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/mapper.py", line 1893, in _with_polymorphic_mappers
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server configure_mappers()
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/mapper.py", line 2770, in configure_mappers
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server mapper._post_configure_properties()
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/mapper.py", line 1710, in _post_configure_properties
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server prop.init()
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/interfaces.py", line 183, in init
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server self.do_init()
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/relationships.py", line 1628, in do_init
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server self._process_dependent_arguments()
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/relationships.py", line 1685, in _process_dependent_arguments
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server self.target = self.mapper.mapped_table
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/sqlalchemy/util/langhelpers.py", line 754, in __get__
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server obj.__dict__[self.__name__] = result = self.fget(obj)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/relationships.py", line 1607, in mapper
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server configure=False)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/base.py", line 426, in class_mapper
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server raise exc.UnmappedClassError(class_)
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server UnmappedClassError: Class 'networking_bgpvpn.neutron.db.bgpvpn_db.BGPVPNPortAssociationRoute' is not mapped
2019-04-09 13:42:27,201.201 15604 ERROR oslo_messaging.rpc.server
2019-04-09 13:42:27,201.201 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPNPortAssociationRoute|bgpvpn_port_association_routes) _configure_property(local_pref, Column)
2019-04-09 13:42:27,209.209 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPNPortAssociationRoute|bgpvpn_port_association_routes) _configure_property(prefix, Column)
2019-04-09 13:42:27,209.209 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPNPortAssociationRoute|bgpvpn_port_association_routes) _configure_property(bgpvpn_id, Column)
2019-04-09 13:42:27,210.210 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPNPortAssociationRoute|bgpvpn_port_association_routes) Identified primary key columns: ColumnSet([Column('id', String(length=36), table=<bgpvpn_port_association_routes>, primary_key=True, nullable=False, default=ColumnDefault(<function <lambda> at 0x7f5ef1342b90>))])
2019-04-09 13:42:27,211.211 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPNPortAssociationRoute|bgpvpn_port_association_routes) constructed
2019-04-09 13:42:27,215.215 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) _configure_property(network_associations, RelationshipProperty)
2019-04-09 13:42:27,215.215 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) _configure_property(router_associations, RelationshipProperty)
2019-04-09 13:42:27,216.216 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) _configure_property(port_associations, RelationshipProperty)
2019-04-09 13:42:27,217.217 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) _configure_property(tenant_id, SynonymProperty)
2019-04-09 13:42:27,218.218 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) _configure_property(project_id, Column)
2019-04-09 13:42:27,219.219 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) _configure_property(id, Column)
2019-04-09 13:42:27,219.219 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) _configure_property(name, Column)
2019-04-09 13:42:27,220.220 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) _configure_property(type, Column)
2019-04-09 13:42:27,221.221 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) _configure_property(route_targets, Column)
2019-04-09 13:42:27,222.222 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) _configure_property(import_targets, Column)
2019-04-09 13:42:27,223.223 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) _configure_property(export_targets, Column)
2019-04-09 13:42:27,224.224 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) _configure_property(route_distinguishers, Column)
2019-04-09 13:42:27,224.224 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) _configure_property(vni, Column)
2019-04-09 13:42:27,225.225 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) _configure_property(local_pref, Column)
2019-04-09 13:42:27,226.226 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) Identified primary key columns: ColumnSet([Column('id', String(length=36), table=<bgpvpns>, primary_key=True, nullable=False, default=ColumnDefault(<function <lambda> at 0x7f5ef1342b90>))])
2019-04-09 13:42:27,227.227 15604 INFO sqlalchemy.orm.mapper.Mapper [req-917b59e9-2b38-4214-a808-6bf2872d708f - - - - -] (BGPVPN|bgpvpns) constructed

Oleg Bondarev (obondarev) wrote :

In log above messages "(BGPVPNPortAssociationRoute|bgpvpn_port_association_routes) constructed" and "(BGPVPN|bgpvpns) constructed" can be seen after the error.
But if we wait till all BGPVPN objects got constructed and start agents, we're good.

Oleg Bondarev (obondarev) wrote :

So, it appears that the race happens in case relationship is defined before the class it relates to. For example in bgpvpn case:

  class BGPVPNPortAssociationRoute:
    ...
    bgpvpn = orm.relationship("BGPVPN",...)

  class BGPVPN(...)
    ...

in order to fix the race should be changed to:

  class BGPVPN(...)
    ...

  class BGPVPNPortAssociationRoute:
    ...
    bgpvpn = orm.relationship("BGPVPN",...)

and alike.

Fix proposed to branch: master
Review: https://review.opendev.org/659511

Changed in neutron:
assignee: nobody → Oleg Bondarev (obondarev)
status: New → In Progress
Mike Bayer (zzzeek) wrote :

it's not possible for the target class to be imported but not be mapped, since it is using the declarative metaclass that maps the class immediately. The order of relationships has nothing to do with it, and the stack trace here occurs within the configure_mappers() phase which assumes all classes have been imported, and therefore it should not be possible that they exist but are not mapped. I need to look at this much more closely.

Mike Bayer (zzzeek) wrote :

ah yes, this is a bug, probably in Neutron's import system though there may be a way to address it in SQLAlchemy by adding a mutex, it's a race. note that the stack trace is generated on a different request id while the SQLAlchemy ORM logging shows it is still in the process of mapping the BGPVPNPortAssociationRoute class itself.

What I need to understand is how is the bgvpn_db.py module being imported? That is, above the logging given above, I assume there's some kind of dynamic module importing that occurs within a service thread and is not simply when the whole application starts up?

I'd like you to try this patch in SQLAlchemy assuming you can reproduce, if you can tell me the specific SQLAlchemy version that's here I can reformat this patch against that version, this is against latest 1.2, the method in question is the "def mapper()" inside of relationships.py:

diff --git a/lib/sqlalchemy/orm/relationships.py b/lib/sqlalchemy/orm/relationships.py
index 356b673b66..014004f005 100644
--- a/lib/sqlalchemy/orm/relationships.py
+++ b/lib/sqlalchemy/orm/relationships.py
@@ -45,7 +45,7 @@ from ..sql.util import ClauseAdapter
 from ..sql.util import join_condition
 from ..sql.util import selectables_overlap
 from ..sql.util import visit_binary_product
-
+from .mapper import _CONFIGURE_MUTEX

 def remote(expr):
     """Annotate a portion of a primaryjoin expression
@@ -1724,23 +1724,29 @@ class RelationshipProperty(StrategizedProperty):
         This is a lazy-initializing static attribute.

         """
- if util.callable(self.argument) and not isinstance(
- self.argument, (type, mapperlib.Mapper)
- ):
- argument = self.argument()
- else:
- argument = self.argument

- if isinstance(argument, type):
- mapper_ = mapperlib.class_mapper(argument, configure=False)
- elif isinstance(self.argument, mapperlib.Mapper):
- mapper_ = argument
- else:
- raise sa_exc.ArgumentError(
- "relationship '%s' expects "
- "a class or a mapper argument (received: %s)"
- % (self.key, type(argument))
- )
+ _CONFIGURE_MUTEX.acquire()
+ try:
+ if util.callable(self.argument) and not isinstance(
+ self.argument, (type, mapperlib.Mapper)
+ ):
+ argument = self.argument()
+ else:
+ argument = self.argument
+
+ if isinstance(argument, type):
+ mapper_ = mapperlib.class_mapper(argument, configure=False)
+ elif isinstance(self.argument, mapperlib.Mapper):
+ mapper_ = argument
+ else:
+ raise sa_exc.ArgumentError(
+ "relationship '%s' expects "
+ "a class or a mapper argument (received: %s)"
+ % (self.key, type(argument))
+ )
+ finally:
+ _CONFIGURE_MUTEX.release()
+
         return mapper_

     @util.memoized_property

Mike Bayer (zzzeek) wrote :

OK so , the mutex in question would solve this very particular race, but it isn't a generalized solution. I have two test scripts that reproduce the issue, one is fixed by a mutex in mapper() above, but the other is not. The essential problem is that neutron is importing modules while simultaneously trying to build queries against the things being imported. Neutron should have mutexes during its dynamic module importing that prevents it from proceeding with API requests within this process.

Mike Bayer (zzzeek) wrote :

I've created https://github.com/sqlalchemy/sqlalchemy/issues/4686 to think about a possible mitigation here, however I haven't had time to consider both the full span of races that can occur if classes are used while imports happen, as well as if there are certain races that are much more likely and therefore worth mutexing. In particular, the test script I have on my github issue produces this error in a slightly different way, and already shows that the above mutex I have in relationship.mapper() is not enough, depending on where the class setup process stalls.

Mike Bayer (zzzeek) wrote :

Current plan is to fix into the 1.3.x series, the mutex around the declarative process from the point at which the class is present in the registry (where relationship() resolves the string "RelatedClass") up until the class has a mapper() applied. the mutex is shared with that of the configure process itself and is an RLock.

Oleg Bondarev (obondarev) wrote :

@zzzeek so I reverted https://review.opendev.org/659511 and applied your above patch adding mutex into SQLAlchemy: in 1 out of 3 neutron server restarts I can still see "Class 'neutron.db.models.flavor.FlavorServiceProfileBinding' is not mapped" errors.
With just https://review.opendev.org/659511 - I did 5 restarts and no errors seen.
I understand https://review.opendev.org/659511 is not the right fix, but maybe it can help debug the race more. Thanks

Mike Bayer (zzzeek) wrote :

Hi Oleg -

can you please try the proposed patch upstream at https://gerrit.sqlalchemy.org/#/c/sqlalchemy/sqlalchemy/+/1271/. With that patch, it's not possible to get the class from the declarative registry without it being mapped and should resolve this issue.

Mike Bayer (zzzeek) wrote :

and if it doesn't, if you can share the stack trace and surrounding logging assuming level of INFO for "sqlalchemy.orm" logger, thanks

Oleg Bondarev (obondarev) wrote :

Ok, so my sqlalchemy version is 1.0.13 but the patch stays essentially the same. After applying I can see following errors: https://pastebin.com/tVT7N4f4 - so it clearly suggests "adding this relationship() to the <class 'neutron.db.models.flavor.Flavor'> class after both dependent classes have been defined"

Mike Bayer (zzzeek) wrote :

wow neutron really wants to race on this! so yes, that's the best SQLAlchemy can do. you are importing modules while trying to serve requests. that's the key architectural issue. you can push the gerrit that you have which works around it but I personally wouldn't consider the architecrual issue here closed.

Mike Bayer (zzzeek) wrote :

but also, using backref() does have the effect that the other class won't need to see this related class, until this related class comes online and attaches its relationship to the other one. it's very strange this happens all in one file like this. so there is still a chance you can get a Flavor class or object that does not have a .service_profiles attribute yet. since you're managing to access the class in between those two events.

Reviewed: https://review.opendev.org/659511
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=38daf9eaae8538c74933b19c952ad0612751c027
Submitter: Zuul
Branch: master

commit 38daf9eaae8538c74933b19c952ad0612751c027
Author: Oleg Bondarev <email address hidden>
Date: Thu May 16 13:45:56 2019 +0400

    Define orm relationships after db classes

    This is to fix race conditions on neutron server init.
    Please see bug for details.

    Change-Id: I943e6397319b9a4a7fc1a5b3acb721920ddffb02
    Partial-Bug: #1824299

tags: added: neutron-proactive-backport-potential
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.