Bug #1519851 “Nonoptimal failover strategy can lead to RPC timeo...” : Bugs : oslo.messaging

Dmitry Mescheryakov (dmitrymex) on 2015-11-25

Changed in oslo.messaging:
assignee:	nobody → Dmitry Mescheryakov (dmitrymex)

OpenStack Infra (hudson-openstack) wrote on 2015-11-25: Fix proposed to oslo.messaging (master)

#1

Fix proposed to branch: master
Review: https://review.openstack.org/249849

Changed in oslo.messaging:
status:	New → In Progress

OpenStack Infra (hudson-openstack) wrote on 2015-11-30: Fix merged to oslo.messaging (master)

#2

Reviewed: https://review.openstack.org/249849
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=6ae46796a61fc97467450b5bdd51dc6a0c86f9f4
Submitter: Jenkins
Branch: master

commit 6ae46796a61fc97467450b5bdd51dc6a0c86f9f4
Author: Dmitry Mescheryakov <email address hidden>
Date: Mon Nov 23 17:27:24 2015 +0300

Use round robin failover strategy for Kombu driver

    The shuffle strategy we use right now increases reconnection time
    and provides no benefit. Sometimes it can cause RPC operations to
    time out, because the strategy gives no guarantee on how long the
    reconnection process will take. See the referenced bug for details.

    On the other hand, the round-robin strategy provides the least achievable
    reconnection time. It also guarantees that if K of N RabbitMQ
    hosts are alive, it will take at most N - K + 1 attempts to
    successfully reconnect to the RabbitMQ cluster.

    With shuffle strategy during failover clients connect to random hosts
    and so the load is distributed evenly between alive RabbitMQs.
    But since we shuffle list of hosts before providing it to Kombu, load
    will be distributed evenly with round-robin strategy as well.

    DocImpact
    A new configuration option kombu_failover_strategy for Kombu driver is
    added. It determines how the next RabbitMQ node is chosen in case the
    one we are currently connected to becomes unavailable. It takes effect
    only if more than one RabbitMQ node is provided in config. Available
    options are:

     * round-robin: each RabbitMQ host in the list is tried in cycle until
       oslo.messaging successfully connects. Since oslo.messaging
       shuffles list of RabbitMQ hosts, the order of hosts in the cycle
       will be random and will not depend on order provided in config.

     * shuffle: oslo.messaging selects a random host from the list and
       tries to connect to it. If connection fails, oslo.messaging repeats
       attempt to connect to another random host. Oslo.messaging stops
       once it successfully connects to a host. Note that in each
       iteration a host to connect is selected independently of previous
       iterations, i.e. it might happen that oslo.messaging will try to
       connect to the same host several times in a row.

    The option's default value is round-robin. Before the option was
    introduced, the default strategy was shuffle. For the reasoning,
    see the main body of the commit message and the referenced bug.

Closes-Bug: #1519851
Change-Id: I9a510c86bd5a6ce8b707734385af1a83de82804e
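
The N - K + 1 bound and the difference between the two strategies described in the commit message above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not oslo.messaging or Kombu code: the function names (round_robin_attempts, shuffle_attempts, worst_case) and host names are hypothetical, `hosts` is assumed to be a list, and `alive` is assumed to be a non-empty subset of it.

```python
import itertools
import random


def round_robin_attempts(hosts, alive, rng):
    """Attempts needed when cycling through a host list shuffled once up front."""
    if not alive:
        raise ValueError("at least one host must be alive")
    order = list(hosts)
    rng.shuffle(order)  # one-time shuffle, as the commit message describes
    for attempts, host in enumerate(itertools.cycle(order), start=1):
        if host in alive:
            return attempts


def shuffle_attempts(hosts, alive, rng):
    """Attempts needed when every try picks a host independently at random."""
    if not alive:
        raise ValueError("at least one host must be alive")
    attempts = 0
    while True:
        attempts += 1
        if rng.choice(hosts) in alive:  # may pick the same dead host again
            return attempts


def worst_case(strategy, hosts, alive, trials=1000, seed=0):
    """Largest attempt count observed over a number of simulated failovers."""
    rng = random.Random(seed)
    return max(strategy(hosts, alive, rng) for _ in range(trials))
```

With five hosts of which one is alive, worst_case(round_robin_attempts, ...) can never exceed 5 - 1 + 1 = 5, while worst_case(shuffle_attempts, ...) has no fixed upper bound and tends to grow with the number of trials.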

Changed in oslo.messaging:
status:	In Progress → Fix Committed

Davanum Srinivas (DIMS) (dims-v) on 2015-12-07

Changed in oslo.messaging:
milestone:	none → 3.1.0
status:	Fix Committed → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2015-12-14: Fix proposed to oslo.messaging (feature/pika)

#3

Fix proposed to branch: feature/pika
Review: https://review.openstack.org/257373

OpenStack Infra (hudson-openstack) wrote on 2015-12-15: Fix merged to oslo.messaging (feature/pika)

#4

Reviewed:  https://review.openstack.org/257373
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=cc0f8cc8a9ff25c9fb081cac5366c12a0c06ec53
Submitter: Jenkins
Branch:    feature/pika

commit a5d78891745b6b9e5827271dc305f00acae1392f
Author: OpenStack Proposal Bot <openstack-infra@lists.openstack.org>
Date:   Fri Dec 11 15:24:05 2015 +0000

Updated from global requirements
    
    Change-Id: Ifd78016c067740477a82dbe06d74d5944ba91893

commit 17ccb2306d03a74304c57d31716a54ba2b3b4311
Author: Mehdi Abaakouk <sileht@redhat.com>
Date:   Fri Dec 11 10:59:54 2015 +0100

Move to debug a too verbose log
    
    When a client is gone (died/restarted), some replies cannot be sent because
    the exchange of this client will never come back. We log one message per
    reply every 0.25 seconds for 60 seconds, when the only useful log
    is the one where we decide to drop these replies.
    
    This change moves the less important message to debug level.
    
    Change-Id: I508787c0db4dcec2c0027b89eb4e65c4f98022b9
    Related-bug: #1524418

commit 46daf858144202a072c4bf8580aeafec11d20e13
Author: Davanum Srinivas <davanum@gmail.com>
Date:   Fri Dec 11 11:04:13 2015 +0300

Cleanup parameter docstrings
    
    Change-Id: I301fdd51446bf0c0a6dd0d05b26da0556db8367d

commit 3ee86964fa460882d8fcac8686edd0e6bfb12008
Author: Mehdi Abaakouk <sileht@redhat.com>
Date:   Wed Dec 9 19:37:40 2015 +0100

Revert "default of kombu_missing_consumer_retry_timeout"
    
    This reverts commit 8c03a6db6c0396099e7425834998da5478a1df7c.
    
    Closes-bug: #1524418
    Change-Id: I35538a6c15d6402272e4513bc1beaa537b0dd7b9

commit e72599435c59c09277a9da7686b32aa4f9df7ba4
Author: Mehdi Abaakouk <sileht@redhat.com>
Date:   Wed Dec 9 18:49:19 2015 +0100

Don't trigger error_callback for known exc
    
    When AMQPDestinationNotFound is raised, we must not
    call the error_callback method. The exception is logged
    only if needed in upper layer (amqpdriver.py).
    
    Related-bug: #1524418
    
    Change-Id: Ic1ddec2d13172532dbaa572d04a4c22c97ac4fe7

commit 185693a6ed57e02b2f94b0fb8f14a91471605969
Author: Mehdi Abaakouk <sileht@redhat.com>
Date:   Wed Dec 9 11:23:52 2015 +0100

Improves comment
    
    Change-Id: Idc8002e6d622435aac48304857985c0f82be3e32

commit 148e8380ce1cc4f60716300b95104aaa2cf8c543
Author: Mehdi Abaakouk <sileht@redhat.com>
Date:   Fri Dec 4 14:57:03 2015 +0100

Fix reconnection when heartbeat is missed
    
    When a heartbeat is missed we call ensure_connection(),
    which runs a dummy method to trigger the reconnection
    code in kombu. But that code is triggered only if the
    channel is None.
    
    In the heartbeat thread we didn't reset the channel
    before reconnecting, so the dummy method didn't do anything.
    
    This change sets the channel to None to ensure the connection
    is reestablished before the dummy method is run.
    
    It also replaces the dummy method with a check of the kombu connection
    object, so we are sure the connection is reestablished.
    
    Change-Id: I39f8cd23c5a5498e6f4c1aa3236ed27f3b5d7c9a
    Closes-bug: #1493890

commit 050024f7984397010c38cbfb8626112d33cbec43
Author: Mehdi Abaakouk <sileht@redhat.com>
Date:   Tue Dec 8 08:03:55 2015 +0000

Fix notifier options registration
    
    Change-Id: I37082f6f349e89af6b74e6ec5e5c416902299263

commit 185f94c013442d87edcea3d81b133d26fdf8a945
Author: Mehdi Abaakouk <sileht@redhat.com>
Date:   Tue Dec 1 10:27:23 2015 +0100

notif: Check the driver features in dispatcher
    
    The transport/driver feature check is done in the get listener
    methods.
    So when these methods are not used, the driver feature check is not
    done.
    
    This change moves it into the dispatcher layer to ensure the
    requirements are always checked.
    
    This slightly changes when the check occurs. Before, it happened
    during the listener object initialisation. Now it happens
    when the listener server starts.
    
    Change-Id: I4d81a4e8496f04d62e48317829d5dd8b942d501c

commit 4dd644ac201ee0fe247d648a2f735998416bf2c7
Author: Mehdi Abaakouk <sileht@redhat.com>
Date:   Sun Aug 2 10:26:02 2015 +0200

batch notification listener
    
    Gnocchi performs better if measurements are written in batches.
    When Ceilometer is used with Gnocchi, this is not possible.
    
    This change introduces a new notification listener that allows that.
    
    On the driver side, a default batch implementation is provided.
    It just calls the legacy poll method many times.
    
    Drivers can override it to provide a better implementation.
    For example, Kafka handles batches natively and benefits from this.
    
    Change-Id: I16184da24b8661aff7f4fba6196ecf33165f1a77
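
The default batch behaviour described in the "batch notification listener" commit above (calling the legacy single-message poll repeatedly) could look roughly like the following Python sketch. It is not the oslo.messaging implementation: poll_batch, poll_once, batch_size and batch_timeout are hypothetical names, and poll_once is assumed to return None when it times out without a message.

```python
import time


def poll_batch(poll_once, batch_size, batch_timeout):
    """Collect up to batch_size messages by calling a single-message poll repeatedly."""
    deadline = time.monotonic() + batch_timeout
    batch = []
    while len(batch) < batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # the time budget for this batch is used up
        message = poll_once(timeout=remaining)
        if message is None:  # the single poll timed out with nothing to deliver
            break
        batch.append(message)
    return batch
```

A driver with native batching (the commit message mentions Kafka) would replace this loop with a single call to its own batch fetch.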

commit a1fb6b9776dd635cf53d3bb03867de879cb4ed89
Author: OpenStack Proposal Bot <openstack-infra@lists.openstack.org>
Date:   Tue Dec 8 02:32:50 2015 +0000

Updated from global requirements
    
    Change-Id: Ie3e254b5b37a1d74eeb24ce1ae179ca9b4e84707

commit bdf287e847024368e20f5f806380e97070c9561c
Author: Mehdi Abaakouk <sileht@redhat.com>
Date:   Tue Dec 1 09:34:57 2015 +0100

creates a dispatcher abstraction
    
    This change creates a dispatcher abstraction
    to document the interface of a dispatcher.
    
    It also allows attributes with default values in the future.
    
    Change-Id: I9a7e5e03f89635a3790b3851f492a1a7aab58feb

commit 2a4f915891eec8adabc1caaff3948dbde0ef6bbe
Author: Stanisław Pitucha <stanislaw.pitucha@hp.com>
Date:   Mon Dec 7 15:11:50 2015 +1100

Remove unnecessary quote
    
    Change-Id: I6ec2297495c1a7ce409ea0de9a92a9720b6e2dca

commit 5561a6fd0fbe8dc7defb8fb1f7432fc460954aa4
Author: Stanisław Pitucha <stanislaw.pitucha@hp.com>
Date:   Mon Dec 7 15:09:43 2015 +1100

Fix multiline strings with missing spaces
    
    Change-Id: Ide9999f6bb80f0f87500270a4fc024462bce0dbf

commit 52ccff7cbc26af0738d7a0a7d6e99330421b61d1
Author: Oleksii Zamiatin <ozamiatin@mirantis.com>
Date:   Fri Dec 4 21:47:18 2015 +0200

Properly skip zmq tests without ZeroMQ being installed
    
    In this change import_zmq() doesn't raise ImportError any more
    for the benefit of skipping tests.
    Alarm about zmq unavailability moved to driver's init.
    
    Change-Id: I6e6acc39f42c979333510064d9e845228400d233
    Closes-Bug: #1522920

commit c1d0412e2d5b437b06d8729bbe2cdaea594427be
Author: Mehdi Abaakouk <sileht@redhat.com>
Date:   Wed Dec 2 10:13:18 2015 +0100

kombu: remove compat of folsom reply format
    
    This change removes the codepath where _reply_q is not
    present in the message dict.
    
    This kind of message has been deprecated in grizzly and cannot
    be emitted since havana.
    
    https://github.com/openstack/oslo-incubator/commit/70891c271e011f59792933eaf65c3214493ef14a
    
    Change-Id: I20558d9fae8f56970c967aa0def77cfb2a1ca3ec

commit 6ad70713a3316dd2003ff1f73db573b674a6f20f
Author: Mehdi Abaakouk <sileht@redhat.com>
Date:   Wed Dec 2 10:02:01 2015 +0100

Follow the plan about the single reply message
    
    This change removes the "send_single_reply" option as planned in the bp:
    
    http://specs.openstack.org/openstack/oslo-specs/specs/liberty/oslo.messaging-remove-double-reply.html
    
    Change-Id: Ib88de71cb2008a49a25f302d5e47ed587154d402

commit 8c03a6db6c0396099e7425834998da5478a1df7c
Author: Mehdi Abaakouk <sileht@redhat.com>
Date:   Wed Dec 2 09:39:37 2015 +0100

default of kombu_missing_consumer_retry_timeout
    
    This changes the default of kombu_missing_consumer_retry_timeout.
    
    The initial value of 60 seconds was chosen because the default
    rpc_response_timeout is 60. That means the client doesn't wait for
    its reply after rpc_response_timeout is reached, so we don't need
    to retry sending its reply for longer than rpc_response_timeout.
    
    But the real intent of kombu_missing_consumer_retry_timeout is
    to mitigate the side effects when the rabbitmq server(s) die, fail over or restart.
    
    So the question is rather how long we expect the server(s) to take to come back
    and all the oslo.messaging applications to reconnect.
    
    In that case 60 seconds looks a bit high.
    
    Also, these 60 seconds have a sad side effect when we can't send the reply
    because the rpc client is really gone (like a nova-compute restart):
    the rabbitmq connection used to send the reply is held for 60 seconds.
    
    I propose 5 seconds because, in case of failover or restart, I expect
    everything to become normal in less than 5 seconds.
    
    Change-Id: I2ec174e440eb91e950d9453a9de8b97ed5888968

commit 18d1708711191a8cfee479499ac066828355d47f
Author: Mehdi Abaakouk <sileht@redhat.com>
Date:   Wed Dec 2 09:36:02 2015 +0100

rename kombu_reconnect_timeout option
    
    This change renames kombu_reconnect_timeout to missing_consumer_retry_timeout.
    And improves its documentation.
    
    Change-Id: I961cf96108db2f392b13d159f516baac9ff4e989

commit 822b803fb0d7e260628230c680b4975c1a3e0900
Author: Kenneth Giusti <kgiusti@gmail.com>
Date:   Thu Dec 3 14:22:50 2015 -0500

Skip Cyrus SASL tests if proton does not support Cyrus SASL
    
    Change-Id: I265d17a2c92b97777a5a97683b95427825872d3a
    Closes-Bug: #1508523

commit 74a0ec8b1c4b3ca6c700d110d6bfd77348cc970a
Author: Davanum Srinivas <davanum@gmail.com>
Date:   Wed Dec 2 14:43:32 2015 -0500

setUp/tearDown decorator for set/clear override
    
    Problem with recursion shows up only in full runs
    of Nova for example. So split the code that sets
    up the decorator and add a method to cleanup
    the decorated set_override during teardown.
    
    Also add a decorator for clear_override similar to
    the one for set_override.
    
    Added more tests for all the above.
    
    Change-Id: Ib16af2e770e96d971aef7f5c5d48ffd781477cfe

commit b6ad95e1caa19a755e11077facaa2022b64d0cf0
Author: Davanum Srinivas <davanum@gmail.com>
Date:   Tue Dec 1 17:23:20 2015 -0500

Support older notifications set_override keys
    
    Neutron and Ceilometer use set_override to set
    the older deprecated key. We should support them
    using the ConfFixture
    
    Closes-Bug: #1521776
    Change-Id: I2bd77284f80bc4525f062f313b1ec74f2b54b395

commit daddb82788918296f8b34d6cdeb40d01620fb183
Author: Mehdi Abaakouk <sileht@redhat.com>
Date:   Wed Dec 2 11:38:27 2015 +0100

Don't hold the connection when reply fail
    
    This change moves the reply retry code to upper layer
    to be able to release the connection while we wait between
    two retries.
    
    In the worst scenario, a client waits for more than 30 replies
    and dies/restarts; the server tries to send these 30 replies to
    this client and can wait up to 60s per reply. During this time,
    replies for other clients are just stuck.
    
    This change fixes that.
    
    Related-bug: #1477914
    Closes-bug: #1521958
    
    Change-Id: I0d3c16ea6d2c1da143de4924b3be41d1cea159bd

commit cc97ba2e17d48cc5fa02f8d9c32b2e0ffacab1a6
Author: Mehdi Abaakouk <sileht@redhat.com>
Date:   Tue Dec 1 15:50:50 2015 +0100

doc: explain rpc call/cast expection
    
    This change adds some doc about remote method execution expectation
    when rpc call/cast is used.
    
    Change-Id: Idb26413fc9a6747ebcd6fd32b82f63ea97bfae16

commit 67c63031f5cb1a675686fc2648ce27f6e36ee254
Author: Komei Shimamura <komei.t.f@gmail.com>
Date:   Fri Jun 5 23:05:29 2015 -0700

Add a driver for Apache Kafka
    
    Adding a driver for Apache Kafka connection, supporting
    notification via Kafka. This driver is experimental
    until having functional and integration tests
    
    Change-Id: I7a5d8e3259b21d5e29ed3b795d04952e1d13ad08
    Implements: blueprint adding-kafka-support

commit 33c1010c3281804456a22b769c4bac5ac6a7cca1
Author: Davanum Srinivas <davanum@gmail.com>
Date:   Tue Nov 24 19:56:16 2015 -0500

Option group for notifications
    
    In change Ief6f95ea906bfd95b3218a930c9db5d8a764beb9, we
    decoupled RPC and Notifications a bit. We should take another
    step and separate out the options for notifications into
    its own group.
    
    Change-Id: Ib51e2839f9035d0cc0e3f459939d9f9003a8c810

commit 357dcb75abdfe1fc78e034d1913f478357cde18f
Author: Davanum Srinivas <davanum@gmail.com>
Date:   Sun Nov 29 18:26:32 2015 -0500

Move ConnectionPool and ConnectionContext outside amqp.py
    
    ConnectionPool and ConnectionContext can be used by other
    drivers (like Kafka) and hence should be outside of amqp.py.
    * Moving ConnectionPool to pool.py
    * Moving ConnectionContext to common.py
    * Moving a couple of global variables to common.py
    
    No other logic changes, just refactoring
    
    Change-Id: I85154509a361690426772ef116590d38a965ca8d

commit 6ae46796a61fc97467450b5bdd51dc6a0c86f9f4
Author: Dmitry Mescheryakov <dmescheryakov@mirantis.com>
Date:   Mon Nov 23 17:27:24 2015 +0300

Use round robin failover strategy for Kombu driver
    
    The shuffle strategy we use right now increases reconnection time
    and provides no benefit. Sometimes it can cause RPC operations to
    time out, because the strategy gives no guarantee on how long the
    reconnection process will take. See the referenced bug for details.
    
    On the other hand, the round-robin strategy provides the least achievable
    reconnection time. It also guarantees that if K of N RabbitMQ
    hosts are alive, it will take at most N - K + 1 attempts to
    successfully reconnect to the RabbitMQ cluster.
    
    With shuffle strategy during failover clients connect to random hosts
    and so the load is distributed evenly between alive RabbitMQs.
    But since we shuffle list of hosts before providing it to Kombu, load
    will be distributed evenly with round-robin strategy as well.
    
    DocImpact
    A new configuration option kombu_failover_strategy for Kombu driver is
    added. It determines how the next RabbitMQ node is chosen in case the
    one we are currently connected to becomes unavailable. It takes effect
    only if more than one RabbitMQ node is provided in config. Available
    options are:
    
     * round-robin: each RabbitMQ host in the list is tried in cycle until
       oslo.messaging successfully connects. Since oslo.messaging
       shuffles list of RabbitMQ hosts, the order of hosts in the cycle
       will be random and will not depend on order provided in config.
    
     * shuffle: oslo.messaging selects a random host from the list and
       tries to connect to it. If connection fails, oslo.messaging repeats
       attempt to connect to another random host. Oslo.messaging stops
       once it successfully connects to a host. Note that in each
       iteration a host to connect is selected independently of previous
       iterations, i.e. it might happen that oslo.messaging will try to
       connect to the same host several times in a row.
    
    The option's default value is round-robin. Before the option was
    introduced, the default strategy was shuffle. For the reasoning,
    see the main body of the commit message and the referenced bug.
    
    Closes-Bug: #1519851
    Change-Id: I9a510c86bd5a6ce8b707734385af1a83de82804e

commit 6cd1dcebc0801dc16db5f64c81baf1fe17165c88
Author: Davanum Srinivas (dims) <davanum@gmail.com>
Date:   Sun Nov 29 02:21:46 2015 +0000

Revert "serializer: remove deprecated RequestContextSerializer"
    
    This reverts commit fb2037bcb492137ee7de5488c30ef8941b914e13.
    
    Change-Id: I9b32708340c232369940738ade14cb6cbb02b331

commit 067cac36be9d57fe4d92490cb1f17b683213a81b
Author: OpenStack Proposal Bot <openstack-infra@lists.openstack.org>
Date:   Fri Nov 27 17:47:01 2015 +0000

Updated from global requirements
    
    Change-Id: I6670a181e2e2fe3ed1c96d0a16525c2d6eada436

commit 47baebde1f04b1dbd657250084b64f0990b768c2
Author: Oleksii Zamiatin <ozamiatin@mirantis.com>
Date:   Tue Nov 24 13:21:54 2015 +0200

[zmq] Random failure with ZmqPortRangeExceededException
    
    Don't restrict ports range for all the other unit tests.
    
    Change-Id: I47d566f3488d371cff604dac72c94c775d729487
    Closes-Bug: #1519312

commit eea60cfb36e1f429185da0d801b9818322e5a73b
Author: Oleksii Zamiatin <ozamiatin@mirantis.com>
Date:   Mon Nov 16 15:07:15 2015 +0200

[zmq] Driver optimizations for CALL
    
    New DEALER-based publisher provided for CALL.
    Futures-based reply waiting makes it possible to
    stop using the blocking REQ socket and also to
    reduce the number of opened sockets to a socket-per-target
    instead of socket-per-message as it was for CALL.
    
    Closes-Bug: #1517999
    
    Optimized redis requests - request once instead of
    per each message. This should be elaborated with an
    autonomous nodes discovery mechanism to be correct
    in general case.
    
    Closes-Bug: #1517993
    
    Reduced number of INFO log messages. Most of them switched
    to the DEBUG level.
    
    Closes-Bug: #1517997
    
    Change-Id: Id017e79368cdc68613ddd7adef26411ea422dc8c

commit a811cf3e8bcc1704604082075a9c89d828a0e72d
Author: OpenStack Proposal Bot <openstack-infra@lists.openstack.org>
Date:   Wed Nov 25 22:27:46 2015 +0000

Updated from global requirements
    
    Change-Id: Idc6649634438a2a0f33cc463594ad347a3674bc2

commit 533a0f8f3de497a7c46a97c98df89029703ae173
Author: ZhiQiang Fan <aji.zqfan@gmail.com>
Date:   Thu Oct 22 01:30:33 2015 -0600

Use oslo_config new type PortOpt for port options
    
    The oslo_config library now provides a new type, PortOpt, to validate
    the port range.
    
    Note, rpc_zmq_max_port is a socket port upper limit, so leave it untouched.
    
    Change-Id: Icac28141bcb09e2b894662de1a6497766351919d
    Closes-Bug: #1518256
    ref: https://github.com/openstack/oslo.config/blob/2.6.0/oslo_config/cfg.py#L1114

commit fb2037bcb492137ee7de5488c30ef8941b914e13
Author: Julien Danjou <julien@danjou.info>
Date:   Thu Nov 12 17:25:36 2015 +0100

serializer: remove deprecated RequestContextSerializer
    
    This also allows us to drop the oslo.context dependency!
    
    Change-Id: I1434caf6323fb417ff99ceff865a0d43799e89b2

commit a33c761e6261347864f0284f59022c1d340611d1
Author: Kui Shi <skuicloud@gmail.com>
Date:   Wed Nov 25 09:02:45 2015 +0000

Add log info for AMQP client
    
    If librabbitmq is installed and can be imported correctly, it will
    be used as AMQP client to connect RabbitMQ. Mark it in the log file.
    
    Change-Id: I33807df67dd2fffa13c675324c8cc7ae716a210e
    Signed-off-by: Kui Shi <skuicloud@gmail.com>

commit 4ea583b718e87a875567ae33690bc1f87df3eed9
Author: OpenStack Proposal Bot <openstack-infra@lists.openstack.org>
Date:   Tue Nov 24 14:45:11 2015 +0000

Updated from global requirements
    
    Change-Id: I6bc59397619d4cf06b8e18b8f48e27b2e0c8c862

commit 6dba2ed591c357a722076db498a6c29657f8760d
Author: Davanum Srinivas <davanum@gmail.com>
Date:   Sat Nov 21 09:50:35 2015 -0500

Add Warning when we cannot notify
    
    Calling Notifier.audit() won't work with LogDriver as the
    logger does not have an audit level anymore, so let's at
    least not fail silently and let the operators know that
    we are dropping stuff on the floor.
    
    Closes-Bug: #1518170
    Change-Id: I74002c72e6763ea8b5df7d97d722619bd4a1950b

commit 16f956d6f336bee259f194db2480821777a8058b
Author: ZhiQiang Fan <aji.zqfan@gmail.com>
Date:   Mon Nov 23 18:32:35 2015 -0700

ignore .eggs directory
    
    This directory contains eggs that were downloaded by setuptools
    to build, test, and run plug-ins.
    
    This directory caches those eggs to prevent repeated downloads.
    
    We need to ignore it.
    
    Change-Id: Idd164c7c8952c70487253e5691ba2da33345059a

commit 196980dace199c412dfaec34568c2a2d66b95a45
Author: Julien Danjou <julien@danjou.info>
Date:   Thu Nov 12 12:02:56 2015 +0100

serializer: deprecate RequestContextSerializer
    
    We plan to remove it to avoid dependency on oslo.context.
    
    Change-Id: I21d76abf1c30b9e89cb8a6c20d1c0cc79cd83a3f

commit 47a906aff3cee8bc2f3e8102bf9f30802cf7974d
Author: Julien Danjou <julien@danjou.info>
Date:   Thu Nov 12 17:24:55 2015 +0100

middleware: remove oslo.context usage
    
    There's nothing interesting in the fake context we're passing. Let's
    stop depending on oslo.context altogether here.
    
    Change-Id: I7784bc3b818e67118e03905857c39eac66765fad

commit 925eb734a9d3cb46bb89a89ec1a78281d2d7afe9
Author: Flavio Percoco <flaper87@gmail.com>
Date:   Tue Nov 3 22:34:22 2015 -0200

Remove qpidd's driver from the tree
    
    Back in liberty we marked this driver as deprecated. This patch removes
    it from the tree. The patch also removes tests, options and other
    references in the documentation. Note that one script is being kept
    because it's required by the amqp driver.
    
    Depends-On: If4b1773334e424d1f4a4e112bd1f10aca62682a9
    Change-Id: I4a9cba314c4a2f24307504fa7b5427424268b114

commit 15cd99050c6d2714b90059e0faad9f9e3409eaaa
Author: Matt Riedemann <mriedem@us.ibm.com>
Date:   Thu Nov 19 10:42:34 2015 -0800

Provide alias to oslo_messaging.notify._impl_messaging
    
    Ifb96c2ae9868426cac2700bf4917c27c02c90b15 moved the _impl_messaging
    module to oslo_messaging.notify.messaging which breaks neutron.
    
    Neutron is fixed on master for mitaka but neutron on stable/liberty
    is broken, and changing neutron on stable/liberty to use the new
    path would require a global-requirements minimum version bump for
    oslo.messaging to 2.6.0, which we want to avoid for people that have
    already shipped liberty.
    
    So provide an alias to the moved module so neutron in stable/liberty
    continues to work. We deprecate the module so consumers know they
    need to upgrade and move off this. We may need to cap oslo.messaging
    in global-requirements on stable/liberty at some point when we remove
    the deprecated alias module.
    
    Change-Id: I29453e0fbf30b0a571c2b1afc7cc81d1a11535f0
    Closes-Bug: #1513630

commit 9843641b862fa7a2ce07673af3d2e0ff39eb032f
Author: Sean Dague <sean@dague.net>
Date:   Thu Nov 19 14:09:38 2015 -0500

make pep8 faster
    
    This builds a stripped down tox target for pep8 that doesn't need a
    giant venv with all the things. Works fast and lean, and makes julien
    fries.
    
    Change-Id: Id5b7671fb7f2b8cbf88745fd12f9238b3c0bb2dd

commit 52be09af4da2264358f6ec4e413ea83a7a919f31
Author: OpenStack Proposal Bot <openstack-infra@lists.openstack.org>
Date:   Thu Nov 19 16:00:26 2015 +0000

Updated from global requirements
    
    Change-Id: Ib8cbf18578b4bfd1be8d5f529d646645bf3e8c71

commit 00d07f5205c758757bb854372c5576e62a5f57d6
Author: Matthew Booth <mbooth@redhat.com>
Date:   Mon Oct 19 14:11:23 2015 +0100

Robustify locking in MessageHandlingServer
    
    This change formalises locking in MessageHandlingServer, which closes
    several bugs:
    
    * It adds locking for internal state when using the blocking executor,
      which closes a number of races.
    * It does not hold a lock while executing server functions,
      which removes a potential cause of deadlock if the server does its
      own locking.
    * It fixes a regression introduced in change
      I3cfbe1bf02d451e379b1dcc23dacb0139c03be76. If multiple threads
      called wait() simultaneously, only 1 of them would wait and the
      others would return immediately, despite message handling not having
      completed.  With this change only 1 will call the underlying wait,
      but all will wait on its completion.
    
    Additionally, it introduces some new functionality:
    
    * It allows the user to make calls in any order and it will ensure,
      with locking, that these will be reordered appropriately.
    * The caller can pass a `timeout` argument to any server method, which
      will cause it to raise an exception if it waits too long.
    * The caller can pass a `log_after` argument to any server method,
      which will cause it to log a message if it waits too long. It
      can also be used to disable logging when waiting is intentional.
    
    We remove DummyCondition as it no longer has any users.
    
    This change was originally committed as change
    I9d516b208446963dcd80b75e2d5a2cecb1187efa, but was reverted as it
    caused a hang in a Nova test. This was caused by the locking behaviour
    for handling restarting a previously stopped server. The original
    patch caused the state to 'wrap' immediately after the user called
    wait(). This caused a hang in tests which redundantly called stop()
    and wait() multiple times. This new patch only wraps when the user
    calls start() again. Callers who do not restart a server will
    therefore not be affected by the wrapping behaviour. Callers who do
    restart a server will be no worse than before. We add a deprecation
    warning on restart, as this operation is inherently racy with this api
    and there is a simple, safe alternative.
    
    This new version has been successfully tested against the unit and
    functional tests of nova, cinder, glance, and ceilometer.
    
    Change-Id: Ic79f87e7b069c1f62d6121486fd6cafd732fdde7
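
The wait() behaviour described in the commit above (only one caller performs the underlying wait, every caller blocks until it completes, and an optional `timeout` raises an exception) can be illustrated with a minimal sketch. This is not the oslo.messaging implementation: the `WaitOnce` class, its attributes, and the use of the built-in `TimeoutError` are assumptions made for illustration only.

```python
import threading
import time


class WaitOnce:
    """Hypothetical helper: many callers, one underlying wait."""

    def __init__(self, underlying_wait):
        self._underlying_wait = underlying_wait
        self._lock = threading.Lock()
        self._started = False
        self._done = threading.Event()
        self.underlying_calls = 0

    def wait(self, timeout=None):
        # Decide under the lock which caller performs the real wait.
        with self._lock:
            first = not self._started
            self._started = True
        if first:
            # The lock is released before the slow call, so no lock is
            # held while the underlying wait runs.
            self.underlying_calls += 1
            try:
                self._underlying_wait()
            finally:
                self._done.set()
        elif not self._done.wait(timeout):
            # Stand-in for the library's own timeout exception.
            raise TimeoutError("wait() did not complete in time")


finished = []
server = WaitOnce(lambda: time.sleep(0.2))


def caller():
    server.wait(timeout=5)
    finished.append(time.monotonic())


start = time.monotonic()
threads = [threading.Thread(target=caller) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(server.underlying_calls, len(finished))
```

All four callers return only after the single underlying wait has finished, which is the property the regression fix restores.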

commit bfa6f5af038ce9e86521dcfefb7f1536b9cf5b32
Author: OpenStack Proposal Bot <openstack-infra@lists.openstack.org>
Date:   Tue Nov 17 02:30:04 2015 +0000

Updated from global requirements
    
    Change-Id: Ic5436cf41d5e6c10e4080bc6c42694d2542c970c

commit a2ff73cb801344737032475820a0976e356f47ad
Author: Davanum Srinivas <davanum@gmail.com>
Date:   Sat Nov 14 23:06:55 2015 -0500

cleanup tox.ini
    
    * Remove requirements.txt from deps, as this is
      already added automatically
    
    
    Change-Id: I696dd69ba1d59ab026180e8e3cb864fe37442e10

commit 4a3ddce05bac17903fb768a47b042d4bc17fd0d9
Author: Oleksii Zamiatin <ozamiatin@mirantis.com>
Date:   Wed Nov 11 14:12:20 2015 +0200

[zmq] Add config options to specify dynamic ports range
    
    Operators may need to restrict the port range used by a given service
    in order to distinguish ports used for zmq messaging from all other
    ports on a system.
    
    Change-Id: Ibe5b02c1211b16859ff58bc02a59d96e1d2fa660
    Closes-Bug: #1511181

commit 018dfcd6415258c59df70d8012d8c7dc73a0752b
Author: Oleksii Zamiatin <ozamiatin@mirantis.com>
Date:   Thu Nov 12 15:14:01 2015 +0200

[zmq] Make bind address configurable
    
    Makes use of the existing 'rpc_zmq_bind_address' option in
    order to make the binding address configurable.
    
    Change-Id: Ia46fa03e54b0e92d3504d9a0ebd65171a283e073
    Closes-Bug: #1515267

commit 517ae12b17afd6fdfa4358bf17cc4f4e109b4036
Author: Oleksii Zamiatin <ozamiatin@mirantis.com>
Date:   Wed Nov 11 12:13:44 2015 +0200

[zmq][matchmaker] Distinguish targets by listener types
    
    To be able to pass messages via different pipelines (not only over
    DEALER/ROUTER), the name service needs information about the socket
    type assigned to the target.
    
    Change-Id: I7cdba6c2c91af7f63ecca30c94faecef2c2eff8b
    Closes-Bug: #1497326

commit d571b6642513974be84be115663f0bdaf441b3fe
Author: Oleksii Zamiatin <ozamiatin@mirantis.com>
Date:   Wed Nov 11 16:25:24 2015 +0200

[zmq] Update zmq-deployment guide according to the new driver
    
    The new driver introduced some new options and changed its architecture.
    This is the first update of the deployment guide since the driver was
    reimplemented. Subsequent driver updates should be reflected in the
    guide as well.
    
    Change-Id: Id8629907560e335dfcff688082fe943b3657568c
    Closes-Bug: #1497278

commit 26d9362e5d8075af8feebb0f108b2c18a2c6a08a
Author: Dmitry Tantsur <dtantsur@redhat.com>
Date:   Wed Nov 11 10:39:06 2015 +0100

Make "Connect(ing|ed) to AMQP server" log messages DEBUG level
    
    According to our guidelines, INFO level is for unit-of-work messages
    valuable for an operator. This rules out the "connecting" message.
    As for "connected", while it might fall under the guidelines, it seems
    to flood logs without much value, see for example:
    http://logs.openstack.org/98/219298/9/check/gate-tempest-dsvm-ironic-pxe_ipa/53784fb/logs/screen-ir-api.txt.gz?level=INFO
    
    Change-Id: I65e0f19590c42d25e5551d45af37416a01a7d638

commit 8130e833a4863d3945dffcb0818ae1cbab42ba45
Author: OpenStack Proposal Bot <openstack-infra@lists.openstack.org>
Date:   Wed Nov 11 17:23:28 2015 +0000

Updated from global requirements
    
    Change-Id: I3fe04f3751e79f87517d4078faea03344fe8f68e

commit 6621b9010e6c46cfff72be4ce0f7f5c2cad81b45
Author: Davanum Srinivas <davanum@gmail.com>
Date:   Fri Oct 9 15:44:35 2015 -0700

Decouple transport for RPC and Notification
    
    Add a new configuration option for setting up
    an alternate notification_transport_url that
    can be used for notifications. This allows
    operators to separate the transport mechanisms
    used for RPC and Notifications.
    
    DocImpact
    
    Closes-Bug: #1504622
    Change-Id: Ief6f95ea906bfd95b3218a930c9db5d8a764beb9

commit 33243c26aca82474a719c74a0e8f29fab644a3f5
Author: BANASHANKAR KALEBELAGUNDI VEERA <bkalebe@us.ibm.com>
Date:   Mon Nov 9 12:15:53 2015 -0800

Fixing the server example code
    Added server.stop() before server.wait()
    
    Change-Id: I9764c77e0aa076b6a7b9bb9715e2ead89b12126f

commit e5b15ce642775c00bc163d89f6961b505911715f
Author: Sean Dague <sean@dague.net>
Date:   Tue Nov 10 15:33:14 2015 +0000

Revert "Robustify locking in MessageHandlingServer"
    
    This reverts commit d700c382791b6352bb80a0dc455589085881669f.
    
    This commit is causing a timeout/lock wait condition when using the
    in-memory RPC bus. It is exposed by the Nova unit / functional tests,
    which use this bus extensively.
    
    Change-Id: I9610a5533383955f926dbbb78ab679f45cd7bcdb
    Closes-Bug: #1514876

commit 64518fa170f3c36529453782a82d855c3e98f8af
Author: Clint Byrum <clint@fewbar.com>
Date:   Fri Oct 30 14:09:37 2015 -0700

Move supported messaging drivers in-tree
    
    Up until now the list of supported messaging drivers has only been
    available in the OpenStack spec, but it is a living document and I
    believe we can maintain it in oslo.messaging's tree.
    
    Change-Id: I7bb9e5f02004f857d8f75909fcc0d05f2882a77d

commit 2786a9ded35fdd427e22133abd050fc7bbebdcab
Author: Cyril Roelandt <cyril@redhat.com>
Date:   Mon Oct 26 11:01:11 2015 +0100

Add a "bandit" target to tox.ini
    
    This will allow us to find potential security issues, such as those fixed by
    52e624891fc500c8ab9f3f10ef45258ce740916a and
    c4a7ac0b653543e8a3ba10060cabdb114fb6672b .
    
    Change-Id: I21aa0ca79232784069e55da46920eb43250d8939

commit f5d189723eac0215758766a8750233d8a26b38f3
Author: OpenStack Proposal Bot <openstack-infra@lists.openstack.org>
Date:   Sat Oct 24 00:26:49 2015 +0000

Updated from global requirements
    
    Change-Id: If7fdb576d25c0742ce47824209ee70cba5d78d33

commit 9e5fb5697d3f7259f01e3416af0582090d20859a
Author: Cyril Roelandt <cyril@redhat.com>
Date:   Fri Oct 23 20:07:39 2015 +0200

Remove a useless statement
    
    This statement is useless since both 'username' and 'password' are set to None
    in the for loop and are not used outside of the loop.
    
    Removing this line also helps us get rid of a false positive thrown by
    bandit.
    
    Change-Id: I2aa1a16f30928b77aa40c5a900e35b7bf752658a

commit d700c382791b6352bb80a0dc455589085881669f
Author: Matthew Booth <mbooth@redhat.com>
Date:   Mon Oct 19 14:11:23 2015 +0100

Robustify locking in MessageHandlingServer
    
    This change formalises locking in MessageHandlingServer. It allows the
    user to make calls in any order and it will ensure, with locking, that
    these will be reordered appropriately. It also adds locking for
    internal state when using the blocking executor, which closes a number
    of races.
    
    It fixes a regression introduced in change
    I3cfbe1bf02d451e379b1dcc23dacb0139c03be76. If multiple threads called
    wait() simultaneously, only 1 of them would wait and the others would
    return immediately, despite message handling not having completed.
    With this change only 1 will call the underlying wait, but all will
    wait on its completion.
    
    We add a common logging mechanism when waiting too long. Specifically,
    we now log a single message when waiting on any lock for longer than
    30 seconds.
    
    We remove DummyCondition as it no longer has any users.
    
    Change-Id: I9d516b208446963dcd80b75e2d5a2cecb1187efa

commit 52e624891fc500c8ab9f3f10ef45258ce740916a
Author: Cyril Roelandt <cyril@redhat.com>
Date:   Fri Oct 23 15:01:33 2015 +0200

Use "secret=True" for password-related options
    
    This makes sure the value of the option is not leaked in the logs.
    
    Found using bandit.
    
    Change-Id: I6db2eea1d3f1ad3cacb749dbb9766c5d32cf047f

commit a162d65a23814c5dc8d7bbc27a5db65dab3e7a33
Author: OpenStack Proposal Bot <openstack-infra@lists.openstack.org>
Date:   Fri Oct 23 06:27:02 2015 +0000

Imported Translations from Zanata
    
    For more information about this automatic import see:
    https://wiki.openstack.org/wiki/Translations/Infrastructure
    
    Change-Id: I260a5e3ae5d394c8c01dc80751f979296650af18

commit d05278f762dd44d8e328ba11888e524e42a42c3f
Author: Dina Belova <dbelova@mirantis.com>
Date:   Fri Oct 16 17:32:55 2015 +0300

Modify simulator.py tool
    
    Introduce a mechanism for generating realistic messages in the tool
    using the information gathered during Rally testing. This change
    allows generating messages of a specific length according to the
    distribution observed in a real environment.
    
    messages_length.txt file contains lengths of string JSON objects
    that were later sent through the MQ layer during deployment and
    deletion of 50 VMs.
    
    simulator.py was modified to use this data as a baseline to generate
    random string messages of the required length with the needed
    probability.
    
    Change-Id: Iae21f90b5ca202bf0e83f1149baef8b42c64eb55

commit ea106e9a090aee65879ef6c676c248f54d08a46f
Author: Oleksii Zamiatin <ozamiatin@mirantis.com>
Date:   Thu Oct 8 22:26:01 2015 +0300

Fix target resolution mismatch in neutron, nova, heat
    
    Some tempest tests were failing because of NoSuchMethod,
    UnsupportedVersion and other missing-endpoint errors.
    
    This fix provides a new listener for each target and
    more straightforward matchmaker target resolution logic.
    
    Change-Id: I4bfb42048630a0eab075e462ad1e22ebe9a45820
    Closes-Bug: #1501682

commit c4a7ac0b653543e8a3ba10060cabdb114fb6672b
Author: Cyril Roelandt <cyril@redhat.com>
Date:   Wed Oct 21 17:08:12 2015 +0200

Use yaml.safe_load instead of yaml.load
    
    We currently use yaml.load to read a user-written config file. This can
    lead to malicious code execution, so we should use yaml.safe_load
    instead.
    
    Found using bandit.
    
    Change-Id: I27792f0435bc3cb9b9d31846d07a8d47a1e7679d

commit a6da2a98c367de2c1c023acf914c3f4ad5a41783
Author: Matthew Booth <mbooth@redhat.com>
Date:   Fri Oct 16 16:16:23 2015 +0100

Trivial locking cleanup in test_listener
    
    ListenerSetupMixin.ThreadTracker was reading self._received_msgs
    unlocked and sleep/looping until the desired value was reached.
    Replaced this pattern with a threading.Condition.
    
    Change-Id: Id4731caee2104bdb231e78e7b460905a0aaf84bf
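
The pattern change described in the commit above (replacing an unlocked read plus sleep/loop with a `threading.Condition`) can be sketched as follows. This is an illustrative stand-in, not the code from test_listener: the method names `info` and `wait_for_messages` are assumptions.

```python
import threading


class ThreadTracker:
    """Hypothetical stand-in for the test helper named in the commit."""

    def __init__(self):
        self._received_msgs = 0
        self._cond = threading.Condition()

    def info(self, msg):
        # Called by the listener thread for every message it receives.
        with self._cond:
            self._received_msgs += 1
            self._cond.notify_all()

    def wait_for_messages(self, expected, timeout=None):
        # Block until at least `expected` messages have arrived.
        # wait_for() re-checks the predicate under the lock after
        # every notification, so no polling sleep is needed.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._received_msgs >= expected, timeout=timeout)


tracker = ThreadTracker()
producer = threading.Thread(
    target=lambda: [tracker.info("msg") for _ in range(3)])
producer.start()
reached = tracker.wait_for_messages(3, timeout=5)
producer.join()
print(reached)
```

The counter is only read and written while the condition's lock is held, which removes the unlocked read the commit message mentions.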

commit f9b14c03377cde4abbd6280b60478eeb91634c29
Author: Matthew Booth <mbooth@redhat.com>
Date:   Wed Oct 21 11:36:28 2015 +0100

Remove unused event in ServerThreadHelper
    
    Change-Id: Ib9ebe363f29cf9a0034550ad852882c2cde8bb49

commit 3f3c489aafc1461835b6c266c8c1e742d88d725b
Author: Matthew Booth <mbooth@redhat.com>
Date:   Mon Oct 19 13:04:37 2015 +0100

Fix a race calling blocking MessageHandlingServer.start()
    
    This fixes a race due to the quirkiness of the blocking executor. The
    blocking executor does not create a separate thread, but is instead
    explicitly executed in the calling thread. Other threads will,
    however, continue to interact with it.
    
    In the non-blocking case, the executor will have done certain
    initialisation in start() before starting a worker thread and
    returning control to the caller. That is, the caller can be sure that
    this initialisation has occurred when control is returned. However, in
    the blocking case, control is never returned. We currently work round
    this by setting self._running to True before executing executor.start,
    and by not doing any locking whatsoever in MessageHandlingServer.
    However, this currently means there is a race whereby executor.stop()
    can run before executor.start(). This is fragile and extremely
    difficult to reason about robustly, if not currently broken.
    
    The solution is to split the initialisation from the execution in the
    blocking case. executor.start() is no longer a blocking operation for
    the blocking executor. As for the non-blocking case, executor.start()
    returns as soon as initialisation is complete, indicating that it is
    safe to subsequently call stop(). Actual execution is done explicitly
    via the new execute() method, which blocks.
    
    In doing this, we also make FakeBlockingThread a more complete
    implementation of threading.Thread. This fixes a related issue in
    that, previously, calling server.wait() on a blocking executor from
    another thread would not wait for the completion of the executor. This
    has a knock-on effect in test_server's ServerSetupMixin. This mixin
    created an endpoint with a stop method which called server.stop().
    However, as this is executed by the executor, and also joins the
    executor thread, which is now blocking, this results in a deadlock. I
    am satisfied that, in general, this is not a sane thing to do.
    However, it is useful for these tests. We fix the tests by making the
    stop method non-blocking, and do the actual stop and wait calls from
    the main thread.
    
    Change-Id: I0d332f74c06c22b44179319432153e15b69f2f45
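
The split between initialisation and execution described in the commit above can be illustrated with a small sketch. The class below is hypothetical and only mirrors the method names from the commit message (start(), stop(), execute()); the queue-based message loop and the `submit` method are assumptions, not the oslo.messaging executor.

```python
import queue
import threading


class BlockingExecutorSketch:
    """Hypothetical blocking executor with init split from execution."""

    _STOP = object()

    def __init__(self):
        self._queue = None

    def start(self):
        # Initialisation only: returns immediately, so a later stop()
        # from any thread is guaranteed to find initialised state.
        self._queue = queue.Queue()

    def submit(self, item):
        self._queue.put(item)

    def stop(self):
        self._queue.put(self._STOP)

    def execute(self):
        # Runs in the calling thread and blocks until stop() is seen.
        handled = []
        while True:
            item = self._queue.get()
            if item is self._STOP:
                return handled
            handled.append(item)


executor = BlockingExecutorSketch()
executor.start()
executor.submit("msg-1")
stopper = threading.Timer(0.05, executor.stop)
stopper.start()
handled = executor.execute()
stopper.join()
print(handled)
```

Because start() has already returned before execute() blocks, stop() can no longer run against uninitialised state, whichever thread calls it.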

commit 9d74ee40c6080a8ee656e0783fc72287e0d55e4d
Author: Matthew Booth <mbooth@redhat.com>
Date:   Mon Oct 19 10:40:15 2015 +0100

Fix assumptions in test_server_wait_method
    
    test_server_wait_method was calling server.wait without having
    previously called server.start and server.stop. This happened to work
    because it also injected server._executor_obj. This is problematic,
    though, as it assumes internal details of the server and does not
    represent the calling contract of server.wait, which is that it must
    follow server.stop (which must itself also follow server.start).
    
    This change makes the necessary changes to call server.wait in the
    correct sequence.
    
    Change-Id: I205683ac6e0f2d64606bb06d08d3d1419f7645f4

commit aec50602d557a8b1b4130f9886991f18db596e54
Author: Matthew Booth <mbooth@redhat.com>
Date:   Mon Oct 19 10:31:50 2015 +0100

Rename MessageHandlingServer._executor for readability
    
    MessageHandlingServer has both MessageHandlingServer.executor, which
    is the name of an executor type, and MessageHandlingServer._executor,
    which is an instance of that type. Ideally we would rename
    MessageHandlingServer.executor, but as this is referenced from outside
    the class we change _executor instead to _executor_obj.
    
    Change-Id: Id69ba7a0729cc66d266327dac2fd4eab50f2814c

commit a3fa8ffec975bc56c306e92b763cdf902dd45613
Author: OpenStack Proposal Bot <openstack-infra@lists.openstack.org>
Date:   Mon Oct 19 23:32:00 2015 +0000

Updated from global requirements
    
    Change-Id: I882a029c98255087bee77ac54c0bf0538f53db33

commit e7873550d0dfa229b5171f776826948d0b87e9c7
Author: Joshua Harlow <harlowja@yahoo-inc.com>
Date:   Fri Oct 16 10:44:16 2015 -0700

Some executors are not async so update docstring to reflect that
    
    Change-Id: I84db5adf5af0372d521e05f7c4277e1fb570f881

tags:

added: in-feature-pika

OpenStack Infra (hudson-openstack) wrote on 2016-02-10: Fix proposed to oslo.messaging (stable/liberty)

#5

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/278462

OpenStack Infra (hudson-openstack) wrote on 2016-02-11: Fix merged to oslo.messaging (stable/liberty)

#6

Reviewed: https://review.openstack.org/278462
Committed: https://git.openstack.org/cgit/openstack/oslo.messaging/commit/?id=26c85209af04c73246e1aa695a79bb45793fe6b4
Submitter: Jenkins
Branch: stable/liberty

commit 26c85209af04c73246e1aa695a79bb45793fe6b4
Author: Dmitry Mescheryakov <email address hidden>
Date: Mon Nov 23 17:27:24 2015 +0300

Use round robin failover strategy for Kombu driver

    Shuffle strategy we use right now leads to increased reconnection time
    and provides no benefit. Sometimes it might lead to RPC operations
    timeout because the strategy provides no guarantee on how long the
    reconnection process will take. See the referenced bug for details.

    On the other side, round-robin strategy provides least achievable
    reconnection time. It also provides guarantee that if K of N RabbitMQ
    hosts are alive, it will take at most N - K + 1 attempts to
    successfully reconnect to RabbitMQ cluster.

    With shuffle strategy during failover clients connect to random hosts
    and so the load is distributed evenly between alive RabbitMQs.
    But since we shuffle list of hosts before providing it to Kombu, load
    will be distributed evenly with round-robin strategy as well.

    DocImpact
    A new configuration option kombu_failover_strategy for Kombu driver is
    added. It determines how the next RabbitMQ node is chosen in case the
    one we are currently connected to becomes unavailable. It takes effect
    only if more than one RabbitMQ node is provided in config. Available
    options are:

     * round-robin: each RabbitMQ host in the list is tried in cycle until
       oslo.messaging successfully connects. Since oslo.messaging
       shuffles list of RabbitMQ hosts, the order of hosts in the cycle
       will be random and will not depend on order provided in config.

     * shuffle: oslo.messaging selects a random host from the list and
       tries to connect to it. If connection fails, oslo.messaging repeats
       attempt to connect to another random host. Oslo.messaging stops
       once it successfully connects to a host. Note that in each
       iteration a host to connect is selected independently of previous
       iterations, i.e. it might happen that oslo.messaging will try to
       connect to the same host several times in a row.

    The option's default value is round-robin. Before the option was
    introduced, the default strategy was shuffle. For the reasoning,
    see the main body of the commit message and the referenced bug.

    Closes-Bug: #1519851
    Change-Id: I9a510c86bd5a6ce8b707734385af1a83de82804e
    (cherry picked from commit 6ae46796a61fc97467450b5bdd51dc6a0c86f9f4)
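
The N - K + 1 bound from the commit message above can be checked with a small simulation. This is a sketch under stated assumptions, not the Kombu or oslo.messaging code: the host names, the function names, and the model of "shuffle" as an independent random pick per attempt (as the commit message describes it) are all illustrative.

```python
import itertools
import random


def round_robin_attempts(hosts, alive, start=0):
    """Count attempts when cycling through hosts in order from `start`."""
    if not alive.intersection(hosts):
        raise ValueError("no listed host is alive")
    ordered = itertools.islice(itertools.cycle(hosts), start, None)
    for attempt, host in enumerate(ordered, start=1):
        if host in alive:
            return attempt


def shuffle_attempts(hosts, alive, rng):
    """Count attempts when every try picks a host independently at random."""
    if not alive.intersection(hosts):
        raise ValueError("no listed host is alive")
    attempt = 0
    while True:
        attempt += 1
        if rng.choice(hosts) in alive:
            return attempt


hosts = ["rabbit1", "rabbit2", "rabbit3", "rabbit4", "rabbit5"]
alive = {"rabbit4", "rabbit5"}          # K = 2 of N = 5 hosts are alive
bound = len(hosts) - len(alive) + 1     # N - K + 1

worst_round_robin = max(
    round_robin_attempts(hosts, alive, start) for start in range(len(hosts)))
rng = random.Random(0)
worst_shuffle = max(shuffle_attempts(hosts, alive, rng) for _ in range(10000))
print(worst_round_robin, bound, worst_shuffle)
```

Round-robin never exceeds the bound because at most N - K dead hosts can precede an alive one in the cycle, whereas the independent random pick has no upper limit on the number of attempts.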

tags:

added: in-stable-liberty
