HA for RPC queues leads to triple load on RabbitMQ without significant benefit

Bug #1550303 reported by Dmitry Mescheryakov on 2016-02-26
This bug affects 1 person
Affects              Importance   Assigned to
Fuel for OpenStack   High         Dmitry Mescheryakov
8.0.x                High         Dmitry Mescheryakov

Bug Description

Right now we have queue mirroring enabled for all RabbitMQ queues. On a cluster of 3 nodes that triples the load on the RabbitMQ cluster while providing only a limited benefit, and only during failover. Right now even a relatively small OpenStack cluster (100-200 nodes) experiences problems with RabbitMQ under considerable load. It is preferable to keep the OpenStack cluster operational during normal work rather than to provide a little more resilience during rare incidents. Hence we should disable HA for RPC queues.
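
As a rough illustration of the "triple load" figure, here is a minimal sketch. It assumes every mirrored queue is replicated to all cluster nodes and counts only the number of copies of each message the cluster has to hold; the function name is hypothetical and this is not a measurement.

```python
def queue_copies(nodes, mirrored):
    """Copies of each message held by the cluster.

    Assumption: a mirrored queue is replicated to every node,
    while a non-mirrored queue lives on a single node.
    """
    return nodes if mirrored else 1


if __name__ == "__main__":
    # 3-node cluster: mirrored vs. non-mirrored RPC queues
    print(queue_copies(3, mirrored=True) / queue_copies(3, mirrored=False))  # 3.0
```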

Changed in fuel:
status: Confirmed → In Progress

Reviewed: https://review.openstack.org/277948
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=37daf71031b54b53d53d4bca29bc572e13f7f250
Submitter: Jenkins
Branch: master

commit 37daf71031b54b53d53d4bca29bc572e13f7f250
Author: Dmitry Mescheryakov <email address hidden>
Date: Tue Feb 9 19:37:26 2016 +0300

    Disable HA for RPC queues by default

    RPC without HA was tested on a big scale and we found that it greatly
    reduces load OpenStack puts on RabbitMQ. Hence it is valuable to
    disable it by default.

    DocImpact

    It should be noted in the release notes that starting from 8.0 queue
    mirroring is disabled by default for RPC queues, but it is still
    enabled for Ceilometer ones.

    Also, the change should be reflected in our reference architecture
    guide. Specifically, there is a sentence here starting with
    "RabbitMQ provides active/active high availability ...", which will
    be incorrect after the given change is merged.

    Users are still provided with means to enable mirroring. Details
    could be found in description of that commit:
    https://review.openstack.org/#/c/249180/

    Closes-Bug: #1550303
    Change-Id: Iffa4173c2e6bb54e411defc7bdc44254669be5fd

Changed in fuel:
status: In Progress → Fix Committed
tags: added: rabbitmq
Alexey Galkin (agalkin) on 2016-04-21
tags: added: on-verification
Alexey Galkin (agalkin) wrote :

Verified as fixed in 9.0-220

tags: removed: on-verification
Changed in fuel:
status: Fix Committed → Fix Released

Reviewed: https://review.openstack.org/277964
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=a36844699885478a8024d1f54df26a6c294d42c7
Submitter: Jenkins
Branch: stable/8.0

commit a36844699885478a8024d1f54df26a6c294d42c7
Author: Dmitry Mescheryakov <email address hidden>
Date: Tue Feb 9 19:37:26 2016 +0300

    Disable HA for RPC queues by default

    RPC without HA was tested on a big scale and we found that it greatly
    reduces load OpenStack puts on RabbitMQ. Hence it is valuable to
    disable it by default.

    Change-Id: Iffa4173c2e6bb54e411defc7bdc44254669be5fd
    Closes-Bug: #1550303

tags: added: on-verification
Tatyana Gladysheva (tgladysheva) wrote :

Verified on 8.0 + mu4 updates.

Steps to verify:
1. Deploy a cluster with 3 controllers and Ceilometer
2. SSH to a controller
3. Run 'pcs resource show p_rabbitmq-server'
4. Run 'rabbitmqctl list_queues slave_pids name'

Before the fix:
3. The rabbitmq resource has the attributes 'enable_rpc_ha=true' and 'enable_notifications_ha=true'.
4. All queues have slave replicas.
Please see http://paste.openstack.org/show/605738/

After the fix:
3. The rabbitmq resource has the attributes 'enable_rpc_ha=false' and 'enable_notifications_ha=true'.
4. No queue has slave replicas except the Ceilometer queues (names starting with 'event.', 'metering.' and 'notifications.').
Please see http://paste.openstack.org/show/605736/
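
The check in step 4 can also be scripted. The sketch below is only an illustration and makes several assumptions: that the 'rabbitmqctl list_queues slave_pids name' output is tab-separated with the queue name in the last column, that a non-mirrored queue shows an empty or '[]' slave_pids column, and that banner lines start with "Listing" or "...". The helper names, node names and sample queues are hypothetical.

```python
# Prefixes of the Ceilometer queues that keep mirroring after the fix.
CEILOMETER_PREFIXES = ("event.", "metering.", "notifications.")


def parse_list_queues(output):
    """Return {queue_name: has_slaves} from list_queues output.

    Assumption: columns are tab-separated and the name is the last column.
    """
    queues = {}
    for line in output.splitlines():
        # Skip blank lines and banner lines such as "Listing queues ..."
        if not line.strip() or line.startswith(("Listing", "...")):
            continue
        pids, _, name = line.rpartition("\t")
        queues[name.strip()] = pids.strip() not in ("", "[]")
    return queues


def unexpected_mirroring(queues):
    """Queues whose mirroring state differs from the 'after the fix' state."""
    return sorted(
        name for name, mirrored in queues.items()
        if mirrored != name.startswith(CEILOMETER_PREFIXES)
    )


# Hypothetical sample; real output comes from the command in step 4.
SAMPLE = (
    "Listing queues ...\n"
    "[<rabbit@node-2.1.100.0>, <rabbit@node-3.1.101.0>]\tnotifications.info\n"
    "\tconductor\n"
    "[<rabbit@node-2.1.102.0>]\tscheduler\n"
)

if __name__ == "__main__":
    print(unexpected_mirroring(parse_list_queues(SAMPLE)))
```

An empty result would mean the queue layout matches the "after the fix" description; in the sample, the hypothetical 'scheduler' queue is reported because it still has a slave replica.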

tags: removed: on-verification