Mistral fails on Rabbit restart

Bug #1718883 reported by Andras Kovi
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
Mistral   Fix Released   High         Andras Kovi

Bug Description

On RabbitMQ failures, Mistral seems to hang on to stale connections and becomes unresponsive:

ERROR (app) RemoteError: Remote error: MessageDeliveryFailure Unable to connect to AMQP server on 127.0.0.1:5672 after None tries: 'NoneType' object has no attribute '__getitem__'
Traceback (most recent call last):
  File "/home/akovi/openstack/mistral/.tox/venv/local/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
    res = self.dispatcher.dispatch(message)
  File "/home/akovi/openstack/mistral/.tox/venv/local/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 222, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)
  File "/home/akovi/openstack/mistral/.tox/venv/local/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 192, in _do_dispatch
    result = func(ctxt, **new_args)
  File "/home/akovi/openstack/mistral/mistral/engine/engine_server.py", line 135, in start_action
    **params
  File "/home/akovi/openstack/mistral/mistral/engine/action_queue.py", line 74, in decorate
    res = func(*args, **kw)
  File "/home/akovi/openstack/mistral/mistral/engine/default_engine.py", line 78, in start_action
    output = action.run(action_input, target, save=False)
  File "/home/akovi/openstack/mistral/.tox/venv/local/lib/python2.7/site-packages/osprofiler/profiler.py", line 157, in wrapper
    result = f(*args, **kwargs)
  File "/home/akovi/openstack/mistral/mistral/engine/actions.py", line 286, in run
    async_=False
  File "/home/akovi/openstack/mistral/.tox/venv/local/lib/python2.7/site-packages/osprofiler/profiler.py", line 157, in wrapper
    result = f(*args, **kwargs)
  File "/home/akovi/openstack/mistral/mistral/rpc/clients.py", line 331, in run_action
    return rpc_client_method(auth_ctx.ctx(), 'run_action', **rpc_kwargs)
  File "/home/akovi/openstack/mistral/mistral/rpc/oslo/oslo_client.py", line 38, in sync_call
    **kwargs
  File "/home/akovi/openstack/mistral/.tox/venv/local/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 174, in call
    retry=self.retry)
  File "/home/akovi/openstack/mistral/.tox/venv/local/lib/python2.7/site-packages/oslo_messaging/transport.py", line 123, in _send
    timeout=timeout, retry=retry)
  File "/home/akovi/openstack/mistral/.tox/venv/local/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 578, in send
    retry=retry)
  File "/home/akovi/openstack/mistral/.tox/venv/local/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 564, in _send
    msg=msg, timeout=timeout, retry=retry)
  File "/home/akovi/openstack/mistral/.tox/venv/local/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 1271, in topic_send
    retry=retry)
  File "/home/akovi/openstack/mistral/.tox/venv/local/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 1154, in _ensure_publishing
    self.ensure(method, retry=retry, error_callback=_error_callback)
  File "/home/akovi/openstack/mistral/.tox/venv/local/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 825, in ensure
    raise exceptions.MessageDeliveryFailure(msg)
MessageDeliveryFailure: Unable to connect to AMQP server on 127.0.0.1:5672 after None tries: 'NoneType' object has no attribute '__getitem__'

Changed in mistral:
importance: Undecided → High
milestone: none → queens-1
Andras Kovi (akovi)
Changed in mistral:
assignee: nobody → Andras Kovi (akovi)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to mistral (master)

Fix proposed to branch: master
Review: https://review.openstack.org/508733

Changed in mistral:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to mistral (master)

Reviewed: https://review.openstack.org/508733
Committed: https://git.openstack.org/cgit/openstack/mistral/commit/?id=68a44fd724d502352804e756f5bdf6abfc61c469
Submitter: Jenkins
Branch: master

commit 68a44fd724d502352804e756f5bdf6abfc61c469
Author: Andras Kovi <email address hidden>
Date: Sun Oct 1 14:34:29 2017 +0200

    Mistral fails on RabbitMQ restart

    Turns on the 'confirmation' for message publishing in the
    Kombu RPC client.

    Fixes a race condition in the Kombu RPC client between the
    reply queue being declared and the listener being started and
    the reply being sent by the server side.

    Fixes the Kombu RPC server not resetting the sleep timer after
    successful connection to the MQ service.

    Change-Id: I0db1cb4c2de7f2c7415825b28e961076870038bf
    Closes-Bug: 1718883
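
The third point of the commit message (resetting the sleep timer after a successful connection) can be illustrated with a minimal sketch. This is not the Mistral Kombu server code: `serve_forever`, `connect`, `consume`, and the delay parameters are hypothetical names chosen for illustration, and exponential backoff is an assumption about how the sleep timer grows.

```python
import time


def serve_forever(connect, consume, sleep=time.sleep,
                  initial_delay=1.0, max_delay=30.0, max_rounds=None):
    """Reconnect loop whose backoff delay resets after a successful connect.

    connect() returns a connection or raises ConnectionError.
    consume(conn) blocks until the connection drops, then raises
    ConnectionError. Returns the list of delays that were slept.
    """
    delay = initial_delay
    slept = []
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        rounds += 1
        try:
            conn = connect()
            # Connected: forget earlier failures so that the next outage
            # starts again from the shortest delay.
            delay = initial_delay
            consume(conn)
        except ConnectionError:
            sleep(delay)
            slept.append(delay)
            # Grow the delay for the next consecutive failure, up to a cap.
            delay = min(delay * 2, max_delay)
    return slept
```

Without the reset, every broker restart would leave the server waiting longer before its next reconnect attempt, even if the broker had been healthy for a long time in between.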

Changed in mistral:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to mistral (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/510573

OpenStack Infra (hudson-openstack) wrote : Fix merged to mistral (stable/pike)

Reviewed: https://review.openstack.org/510573
Committed: https://git.openstack.org/cgit/openstack/mistral/commit/?id=db997eec8212b43c83bc1f4196200836066d76fb
Submitter: Jenkins
Branch: stable/pike

commit db997eec8212b43c83bc1f4196200836066d76fb
Author: Andras Kovi <email address hidden>
Date: Sun Oct 1 14:34:29 2017 +0200

    Mistral fails on RabbitMQ restart

    Turns on the 'confirmation' for message publishing in the
    Kombu RPC client.

    Fixes a race condition in the Kombu RPC client between the
    reply queue being declared and the listener being started and
    the reply being sent by the server side.

    Fixes the Kombu RPC server not resetting the sleep timer after
    successful connection to the MQ service.

    Change-Id: I0db1cb4c2de7f2c7415825b28e961076870038bf
    Closes-Bug: 1718883
    (cherry picked from commit 68a44fd724d502352804e756f5bdf6abfc61c469)
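
The race described in the commit message (a reply sent by the server before the client's listener is ready) can be sketched in plain Python. This is an illustration of the general ordering problem, not the Kombu RPC client from the fix: `ReplyRouter`, `sync_call`, and `publish` are hypothetical names, and the in-memory queue stands in for the AMQP reply queue.

```python
import queue
import threading
import uuid


class ReplyRouter:
    """Routes replies to waiting callers by correlation id."""

    def __init__(self):
        self._pending = {}
        self._lock = threading.Lock()

    def register(self, correlation_id):
        # One slot per outstanding call; the listener fills it on reply.
        slot = queue.Queue(maxsize=1)
        with self._lock:
            self._pending[correlation_id] = slot
        return slot

    def forget(self, correlation_id):
        with self._lock:
            self._pending.pop(correlation_id, None)

    def deliver(self, correlation_id, payload):
        with self._lock:
            slot = self._pending.get(correlation_id)
        if slot is None:
            return False  # nobody is listening yet: the reply is lost
        slot.put(payload)
        return True


def sync_call(router, publish, payload, timeout=5.0):
    correlation_id = uuid.uuid4().hex
    # Register the listener slot BEFORE publishing, so that even an
    # immediate reply from the server side finds somewhere to land.
    slot = router.register(correlation_id)
    try:
        publish(correlation_id, payload)
        return slot.get(timeout=timeout)
    finally:
        router.forget(correlation_id)
```

If `publish` ran before `register`, a fast server could call `deliver` while no slot exists, and the caller would block until the timeout.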

tags: added: in-stable-pike
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/mistral 6.0.0.0b1

This issue was fixed in the openstack/mistral 6.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/mistral 5.2.0

This issue was fixed in the openstack/mistral 5.2.0 release.
