Allow configuring the reconnection behavior for the RPC servers

Bug #1282639 reported by Mehdi Abaakouk
This bug affects 1 person
Affects: oslo.messaging
Importance: High
Assigned to: Mehdi Abaakouk

Bug Description

Currently, all drivers keep trying to reconnect to the RPC servers forever, and this behavior is not really configurable.

The rabbitmq driver allows configuring a max_retries value, after which it raises a generic rpc_common.RPCException, but this is not really usable.

However, some applications (e.g. ceilometer) just need the call to fail when none of the servers is available, instead of blocking forever and stalling the application (in the ceilometer use case, all swift requests are blocked until the server comes back).

For example, ceilometer already has a hack/workaround for this that uses the max_retries option of the rabbitmq driver; see

https://github.com/openstack/ceilometer/blob/master/ceilometer/publisher/rpc.py#L72 and https://github.com/openstack/ceilometer/blob/master/ceilometer/publisher/rpc.py#L200

oslo.messaging needs an API to configure whether the drivers should retry forever or only once before raising a correctly named exception to the application.

This will allow removing the hack/workaround from ceilometer, which uses internal oslo.messaging code and works only with rabbitmq.
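
To make the request concrete, here is a minimal sketch of the retry semantics being asked for. It does not reproduce oslo.messaging's actual API: `send_with_retry`, `DeliveryFailure`, and the convention that `retry=None` means "retry forever" are all hypothetical names and assumptions chosen for illustration, and `ConnectionError` stands in for whatever error a driver sees when the servers are unreachable.

```python
class DeliveryFailure(Exception):
    """Hypothetical 'correctly named' exception raised once retries run out."""


def send_with_retry(send, retry=None):
    """Call send() until it succeeds or the retry budget is exhausted.

    Assumed convention: retry=None retries forever; retry=N allows N
    retries after the first failed attempt (so retry=0 fails immediately).
    """
    retries_done = 0
    while True:
        try:
            return send()
        except ConnectionError as exc:
            if retry is not None and retries_done >= retry:
                raise DeliveryFailure(
                    "gave up after %d attempt(s)" % (retries_done + 1)
                ) from exc
            retries_done += 1
            # A real driver would sleep or back off here before reconnecting.
```

With something along these lines, a caller such as ceilometer could pass a small retry budget and catch the named exception, rather than relying on a driver-specific option.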

Cheers,

Mehdi Abaakouk (sileht)
Changed in oslo.messaging:
assignee: nobody → Mehdi Abaakouk (sileht)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.messaging (master)

Fix proposed to branch: master
Review: https://review.openstack.org/75365

Changed in oslo.messaging:
status: New → In Progress
Mark McLoughlin (markmc)
Changed in oslo.messaging:
importance: Undecided → High
Mark McLoughlin (markmc)
Changed in oslo.messaging:
milestone: none → juno-1
Openstack Gerrit (openstack-gerrit) wrote : Related fix proposed to oslo.messaging (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/92373

Thierry Carrez (ttx)
Changed in oslo.messaging:
milestone: juno-1 → juno-2
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.messaging (master)

Reviewed: https://review.openstack.org/75365
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=948c05417c7a44f85d3e77bae35e02f5c31834d7
Submitter: Jenkins
Branch: master

commit 948c05417c7a44f85d3e77bae35e02f5c31834d7
Author: Mehdi Abaakouk <email address hidden>
Date: Fri Feb 21 11:50:45 2014 +0100

    Add transport reconnection retries

    When an RPC client tries to make an RPC call and the server is unreachable,
    the RPC call hangs until the server comes back.

    In most cases this is the desired behavior.

    But sometimes we may prefer that the library raise an exception after a
    certain number of retries.

    For example in ceilometer, when publishing a
    storage.objects.incoming.bytes sample from the Swift middleware to an
    AMQP topic, you might not want to block the Swift client if the AMQP broker
    is unavailable - instead, you might have a queueing policy whereby
    if a single reconnection attempt fails we queue the sample in memory and
    try again when another sample is to be published.

    This patch is the oslo.messaging part that allows this.

    Closes bug #1282639
    Co-Authored-By: Ala Rezmerita <email address hidden>

    Change-Id: I32086d0abf141c368343bf225d4b021da496c020
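
The queueing policy described in the commit message above could look roughly like the following on the application side. This is a sketch under assumptions, not ceilometer's code: `QueueingPublisher`, its `send` callable, and the `max_queue` bound are hypothetical, and `ConnectionError` stands in for whatever exception the library raises once its retries are exhausted.

```python
from collections import deque


class QueueingPublisher:
    """Buffer samples in memory when a publish attempt fails."""

    def __init__(self, send, max_queue=1024):
        # Hypothetical callable that sends one sample and raises
        # ConnectionError if the broker cannot be reached.
        self._send = send
        # Bounded backlog: the oldest samples are dropped when it is full.
        self._pending = deque(maxlen=max_queue)

    def publish(self, sample):
        """Queue the sample, then try to flush the backlog in order.

        Returns True if everything was sent, False if the broker is
        still unavailable and samples remain queued.
        """
        self._pending.append(sample)
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                return False  # keep the backlog for the next publish call
            self._pending.popleft()
        return True
```

The caller never blocks on an unavailable broker: each call makes one pass over the backlog and returns as soon as a send fails, so delivery is retried only when the next sample arrives.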

Changed in oslo.messaging:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Related fix merged to oslo.messaging (master)

Reviewed: https://review.openstack.org/92373
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=1ea9c35ab4d7446cc819490b33be802e5b2886ea
Submitter: Jenkins
Branch: master

commit 1ea9c35ab4d7446cc819490b33be802e5b2886ea
Author: Mehdi Abaakouk <email address hidden>
Date: Tue May 6 13:47:12 2014 +0200

    Transport reconnection retries for notification

    This patch adds support for reconnection retries in the
    messaging notifier.

    Related bug #1282639
    Change-Id: Ia30331f8306ff0f6952d83ef42ff8bee6b900427

Changed in oslo.messaging:
status: Fix Committed → Fix Released