error in callback mechanism

Bug #2031085 reported by yaoguang
This bug affects 1 person
Affects: neutron
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Version: train
Details:
  When a trunk is created or deleted, the agent does not receive a notification. The neutron server error log shows:
Error during notification for neutron.services.trunk.rpc.backend.ServerSideRpcBackend.process_event-8351463 trunk, after_create: AttributeError: 'ServerSideRpcBackend' object has no attribute '_stub'
Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/neutron_lib/callbacks/manager.py", line 199, in _notify_loop
    callback(resource, event, trigger, **kwargs)
  File "/usr/lib/python3.7/site-packages/neutron/services/trunk/rpc/backend.py", line 58, in process_event
    events.AFTER_CREATE: self._stub.trunk_created,
AttributeError: 'ServerSideRpcBackend' object has no attribute '_stub'

Root cause: ServerSideRpcBackend is decorated with neutron_lib.callbacks.registry.has_registry_receivers, which replaces the class's __new__ method. The replacement __new__ registers the callbacks before __init__ runs, so if __init__ then raises an exception while the ServerSideRpcBackend instance is being created, the callbacks stay registered against a partially initialized object.

Example:

from neutron_lib.callbacks import registry


class BClass(object):
    def __init__(self):
        # Simulate a dependency that fails during construction.
        raise IOError()


@registry.has_registry_receivers
class AClass(object):

    def __init__(self):
        self.b = BClass()  # raises, so the line below never runs
        self.a = 1

    @registry.receives("test", ["after_create"])
    def test(self, resource, event, trigger, **kwargs):
        pass


if __name__ == '__main__':
    try:
        a = AClass()
    except Exception:
        print('error')
    # The callback is still registered even though AClass() failed.
    print(registry._CALLBACK_MANAGER._callbacks)

yaoguang (yaoguang100)
description: updated
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Yaoguang:

The "__new__" method is called once when the class is instantiated. Then the "__init__" method is called during the initialization. This is when the callbacks are registered. At this point, self._stub is instantiated too.

Sorry, but I'm not following your description of the problem and I can't reproduce it either. Can you describe how that is happening? What command are you executing? Is it happening just after a Neutron server restart?

Regards.

Changed in neutron:
status: New → Incomplete
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)
yaoguang (yaoguang100) wrote :

Can you run my example first? An exception clearly occurs when AClass is instantiated, yet the callback method is still registered, which leads to the equivalent of a null pointer error. Similarly, if rabbitmq is restarted and its password is reset while neutron-server is starting, the ServerSideRpcBackend object fails to initialize but its callback method is still registered. As a result, the error traceback above is logged when a trunk is created or a subport is added.
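
The failure mode described in this comment can be sketched without neutron_lib. The following is only an illustration under stated assumptions: `CALLBACKS`, `subscribing_new`, `connect`, and `Backend` are hypothetical stand-ins, not neutron or neutron_lib code. It shows how a bound method registered from `__new__` can later raise AttributeError when the instance never finished `__init__`.

```python
# Stand-in for a callback registry; NOT neutron_lib's implementation.
CALLBACKS = []


def subscribing_new(klass):
    """Hypothetical decorator that subscribes process_event inside __new__."""
    def replacement_new(cls, *args, **kwargs):
        instance = object.__new__(cls)
        # The bound method is registered before __init__ has run.
        CALLBACKS.append(instance.process_event)
        return instance

    klass.__new__ = staticmethod(replacement_new)
    return klass


def connect():
    # Simulates the broker refusing the login during start-up.
    raise ConnectionError("ACCESS_REFUSED")


@subscribing_new
class Backend:
    def __init__(self):
        self._skeleton = connect()  # raises here
        self._stub = object()       # never reached

    def process_event(self, resource, event):
        return self._stub  # fails on a half-initialized instance


try:
    Backend()
except ConnectionError:
    pass  # mirrors a caller that logs the failure and carries on

errors = []
for callback in CALLBACKS:
    try:
        callback("trunk", "after_create")
    except AttributeError as exc:
        errors.append(str(exc))

print(errors)
```

The registry keeps the half-built object alive through the bound method, which is why the error only surfaces later, when an event is published.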

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Yaoguang:

First of all, any service (rabbitmq, SQL server, etc.) can be restarted while the Neutron server is running. It will take some time, but eventually the Neutron server will reconnect. What is neither considered nor supported is a service changing its configuration. If you are resetting the rabbitmq password, then you'll need to reconfigure the Neutron server and restart it.

About the issue during the initialization of "ServerSideRpcBackend": the event registration is done at the end of the "__init__" method. Can you describe how "ServerSideRpcBackend" is failing during the initialization? If that happens, the Neutron server won't start. Please provide more information and a reproducer based on the Neutron server code (the example provided is just cooked-up code that doesn't reflect what happens in the "ServerSideRpcBackend" class).

Regards.

yaoguang (yaoguang100) wrote :

First, the event registration is done before the '__init__' method runs. When neutron-server and rabbitmq are started at the same time, rabbitmq resets the password for a short period and then restores it to the configured password. This behavior may be unique to our deployment. However, the ServerSideRpcBackend object is instantiated inside a callback method. If the _skeleton attribute fails to be initialized, an exception is thrown and caught. Neutron-server does not retry this step.

error log:
[manager.py:209 _notify_loop] Error during notification for neutron.services.trunk.drivers.openvswitch.driver.DriverBase.register-1004450 trunk_plugin, after_init: amqp.exceptions.AccessRefused: (0, 0): (403) ACCESS_REFUSED - Login was refused using authentication mechanism AMQPLAIN. For details see the broker logfile.

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Yaoguang:

Please provide information about the statement "the event registry is done before '__init__' method".

Regards.

yaoguang (yaoguang100) wrote :

python example.py
error
defaultdict(<class 'dict'>, {'test': {'after_create': [(55550000, {'__main__.AClass.test-7895332': <bound method AClass.test of <__main__.AClass object at 0x15415c5450d0>>})]}})

As my example shows, variable a fails to be instantiated, but registry._CALLBACK_MANAGER has a registered callback method.

For details, see
https://docs.python.org/3/reference/datamodel.html#special-method-names

Partial References:
If __new__() is invoked during object construction and it returns an instance of cls, then the new instance’s __init__() method will be invoked like __init__(self[, ...]), where self is the new instance and the remaining arguments are the same as were passed to the object constructor.
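
The ordering in the quoted passage can be checked with plain Python. This is a minimal sketch with hypothetical names (`Widget`, `calls`, `registry`), not neutron code: `__new__` runs first, `__init__` runs second on the instance it returned, and a side effect performed in `__new__` is not undone when `__init__` raises.

```python
calls = []     # records the order in which the hooks run
registry = {}  # stand-in for any global state touched by __new__


class Widget:
    def __new__(cls, *args, **kwargs):
        calls.append("__new__")
        instance = super().__new__(cls)
        registry[id(instance)] = instance  # side effect before __init__
        return instance

    def __init__(self, fail):
        calls.append("__init__")
        if fail:
            raise OSError("init failed")
        self.ready = True


try:
    Widget(fail=True)
except OSError:
    calls.append("caught")

print(calls)          # ['__new__', '__init__', 'caught']
print(len(registry))  # 1
# The registered instance never got its 'ready' attribute.
print(hasattr(next(iter(registry.values())), "ready"))  # False
```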

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Yaoguang:

I'm not talking about your example, I'm talking about the "ServerSideRpcBackend". I would like to know how it is possible that the "ServerSideRpcBackend" initialization method fails but the events are registered.

Your example differs from the "ServerSideRpcBackend" implementation in:
* Your code is indeed raising an exception during the initialization method, but you are catching it and dismissing it. We don't have this in the Neutron code. If the initialization method fails, the Neutron API does not start.
* In your example you are using decorators to create the event subscriptions:
  @registry.receives("test", ["after_create"])
  In the neutron code we are explicitly calling the "registry.subscribe" method in the initialization method.
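
The distinction drawn in the second bullet can be sketched as follows. This is an illustration only: `SUBSCRIPTIONS`, `subscribe`, `connect`, and `ExplicitBackend` are hypothetical stand-ins rather than the real `registry.subscribe` or `ServerSideRpcBackend`, and whether the Train code actually follows this pattern is what this thread disputes. When subscriptions are the last statements of `__init__`, a failure earlier in `__init__` leaves nothing registered.

```python
SUBSCRIPTIONS = []  # stand-in for the callback registry


def subscribe(callback, resource, event):
    """Hypothetical stand-in for an explicit subscribe call."""
    SUBSCRIPTIONS.append((resource, event, callback))


def connect(fail):
    # Simulates the RPC connection set-up, which may be refused.
    if fail:
        raise ConnectionError("ACCESS_REFUSED")
    return object()


class ExplicitBackend:
    def __init__(self, fail=False):
        self._skeleton = connect(fail)  # may raise
        self._stub = object()
        # Subscriptions come last, so they are skipped if anything above fails.
        subscribe(self.process_event, "trunk", "after_create")
        subscribe(self.process_event, "trunk", "after_delete")

    def process_event(self, resource, event, trigger, **kwargs):
        return self._stub


try:
    ExplicitBackend(fail=True)
except ConnectionError:
    pass
after_failure = len(SUBSCRIPTIONS)

ExplicitBackend()
after_success = len(SUBSCRIPTIONS)

print(after_failure, after_success)  # 0 2
```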

Please provide, in the Neutron code, an example of how "ServerSideRpcBackend" initialization method can fail without stopping the Neutron API or a reproducer.

Regards.

yaoguang (yaoguang100) wrote :

Neutron version: train

I'll describe the method invocation process of the problem.

neutron load service plugin
...
neutron.services.trunk.plugin.TrunkPlugin __init__
neutron.services.trunk.drivers.openvswitch.driver.OVSDriver.create()
registry.publish(resources.TRUNK_PLUGIN, events.AFTER_INIT, self)
neutron_lib.callbacks.manager.CallbacksManager._notify_loop
neutron.services.trunk.drivers.base.DriverBase register
ServerSideRpcBackend __init__ self._skeleton = server.TrunkSkeleton()
neutron.services.trunk.rpc.server.TrunkSkeleton __init__ self._connection.consume_in_threads()
raises amqp.exceptions.AccessRefused; neutron_lib.callbacks.manager.CallbacksManager._notify_loop then just logs the exception and ignores it. I've given the log output above.

Although this is rare, the callback registration mechanism is clearly problematic.

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Please provide the subscription error, not the callback error. The ``CallbacksManager.publish`` method not only logs the errors but should also raise a ``CallbackFailure`` exception in case of errors. With these logs we would be able to reproduce the error or at least check why this exception was dismissed.

yaoguang (yaoguang100) wrote :

There is no subscription error; only BEFORE and PRECOMMIT events raise CallbackFailure.
###
neutron.services.trunk.plugin.TrunkPlugin.__init__
    registry.publish(resources.TRUNK_PLUGIN, events.AFTER_INIT, self)

neutron_lib.callbacks.manager.CallbacksManager.notify
    errors = self._notify_loop(resource, event, trigger, **kwargs)
    if errors:
        if event.startswith(events.BEFORE):
            abort_event = event.replace(
                events.BEFORE, events.ABORT)
            self._notify_loop(resource, abort_event, trigger, **kwargs)

            raise exceptions.CallbackFailure(errors=errors)

        if event.startswith(events.PRECOMMIT):
            raise exceptions.CallbackFailure(errors=errors)

###
I've described the problem as completely as possible, and if you still have doubts, please review it with other core maintainers.
Thanks !
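
The behaviour quoted in this comment can be sketched in a few self-contained lines. This is a simplified stand-in, not the neutron_lib source: `notify`, `notify_loop`, and `CallbackFailure` here are local re-creations, the abort-event step for BEFORE events is omitted, and the prefixes "before_" and "precommit_" are assumed values for `events.BEFORE` and `events.PRECOMMIT`.

```python
import logging

LOG = logging.getLogger(__name__)


class CallbackFailure(Exception):
    """Local stand-in for the exception raised on BEFORE/PRECOMMIT errors."""
    def __init__(self, errors):
        super().__init__("callbacks failed: %r" % (errors,))
        self.errors = errors


def notify_loop(callbacks, resource, event):
    """Call every callback; log and collect errors instead of raising."""
    errors = []
    for callback in callbacks:
        try:
            callback(resource, event)
        except Exception as exc:
            LOG.error("Error during notification for %s, %s: %s",
                      resource, event, exc)
            errors.append(exc)
    return errors


def notify(callbacks, resource, event):
    errors = notify_loop(callbacks, resource, event)
    # Only BEFORE and PRECOMMIT events propagate failures (assumed prefixes).
    if errors and event.startswith(("before_", "precommit_")):
        raise CallbackFailure(errors)
    return errors


def failing(resource, event):
    raise ConnectionError("ACCESS_REFUSED")


# An AFTER_INIT failure is logged and returned, not raised.
swallowed = notify([failing], "trunk_plugin", "after_init")

try:
    notify([failing], "trunk", "before_create")
    raised = False
except CallbackFailure:
    raised = True

print(len(swallowed), raised)  # 1 True
```

Under these assumptions, an error raised by a callback for an AFTER_INIT event never reaches the publisher, which matches the log-and-continue behaviour described in the call chain above.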

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Yaoguang:

Thanks for the report. What you are describing is not reproducible. If, during the "ServerSideRpcBackend" initialization, the "TrunkSkeleton" fails in "self._connection.consume_in_threads()", the subscriptions for the AFTER_CREATE and AFTER_DELETE events won't be created and "ServerSideRpcBackend.process_event" won't be called.

Manually raising an "amqp.exceptions.AccessRefused" exception during the "TrunkSkeleton" initialization doesn't stop the server from starting correctly, although an exception is logged. But "ServerSideRpcBackend.process_event" is never called when a trunk is created or deleted.

I'll keep the status of this bug and unassign it.

Regards.

Changed in neutron:
assignee: Rodolfo Alonso (rodolfo-alonso-hernandez) → nobody
yaoguang (yaoguang100) wrote :

As you have said, this is a problem. In earlier versions, the callback method is not registered. In later Train versions, the callback method is registered but its execution fails. As a result, when sub-interfaces are added to the trunk, the agent does not receive the messages and does not apply the configuration.

Launchpad Janitor (janitor) wrote :

[Expired for neutron because there has been no activity for 60 days.]

Changed in neutron:
status: Incomplete → Expired