HA for RPC queues leads to triple load on RabbitMQ without significant benefit

Bug #1550303 reported by Dmitry Mescheryakov
This bug affects 1 person
Affects             | Status       | Importance | Assigned to         | Milestone
Fuel for OpenStack  | Fix Released | High       | Dmitry Mescheryakov |
8.0.x               | Fix Released | High       | Dmitry Mescheryakov |

Bug Description

Right now we have queue mirroring enabled for all RabbitMQ queues. On a cluster of 3 nodes, that triples the load on the RabbitMQ cluster while providing only a limited benefit, and only during failover. Even a relatively small OpenStack cluster (100-200 nodes) currently experiences problems with RabbitMQ under considerable load. It is preferable to keep the OpenStack cluster operational during normal work rather than to provide a little more resilience during rare incidents. Hence we should disable HA for RPC queues.
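The "triple load" figure can be illustrated with a back-of-the-envelope sketch. It assumes a deliberately simplified model in which a mirrored queue keeps one copy of every message on each cluster node, and it ignores all other overhead such as inter-node synchronisation traffic. The function names are hypothetical and exist only for this illustration.

```python
def copies_per_message(cluster_nodes, mirrored):
    """Number of queue replicas that handle each published message.

    Simplified assumption: a mirrored queue is replicated to every node,
    while a non-mirrored queue lives on exactly one node.
    """
    return cluster_nodes if mirrored else 1


def total_queue_operations(messages, cluster_nodes, mirrored):
    """Total per-replica message operations for a given message count."""
    return messages * copies_per_message(cluster_nodes, mirrored)


if __name__ == "__main__":
    with_ha = total_queue_operations(1000, cluster_nodes=3, mirrored=True)
    without_ha = total_queue_operations(1000, cluster_nodes=3, mirrored=False)
    print(with_ha, without_ha, with_ha // without_ha)
```

Under these assumptions, the ratio between the two totals on a 3-node cluster is 3, which matches the load increase described above.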

Tags: rabbitmq
Changed in fuel:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/277948
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=37daf71031b54b53d53d4bca29bc572e13f7f250
Submitter: Jenkins
Branch: master

commit 37daf71031b54b53d53d4bca29bc572e13f7f250
Author: Dmitry Mescheryakov <email address hidden>
Date: Tue Feb 9 19:37:26 2016 +0300

    Disable HA for RPC queues by default

    RPC without HA was tested on a big scale and we found that it greatly
    reduces load OpenStack puts on RabbitMQ. Hence it is valuable to
    disable it by default.

    DocImpact

    It should be noted in the release notes that starting from 8.0 queue
    mirroring is disabled by default for RPC queues, but it is still
    enabled for Ceilometer ones.

    Also, the change should be reflected in our reference architecture
    guide. Specifically, there is a sentence here starting with
    "RabbitMQ provides active/active high availability ...", which will
    be incorrect after the given change is merged.

    Users are still provided with means to enable mirroring. Details
    could be found in description of that commit:
    https://review.openstack.org/#/c/249180/

    Closes-Bug: #1550303
    Change-Id: Iffa4173c2e6bb54e411defc7bdc44254669be5fd

Changed in fuel:
status: In Progress → Fix Committed
tags: added: rabbitmq
Alexey Galkin (agalkin)
tags: added: on-verification
Alexey Galkin (agalkin) wrote :

Verified as fixed in 9.0-220

tags: removed: on-verification
Changed in fuel:
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/8.0)

Reviewed: https://review.openstack.org/277964
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=a36844699885478a8024d1f54df26a6c294d42c7
Submitter: Jenkins
Branch: stable/8.0

commit a36844699885478a8024d1f54df26a6c294d42c7
Author: Dmitry Mescheryakov <email address hidden>
Date: Tue Feb 9 19:37:26 2016 +0300

    Disable HA for RPC queues by default

    RPC without HA was tested on a big scale and we found that it greatly
    reduces load OpenStack puts on RabbitMQ. Hence it is valuable to
    disable it by default.

    Change-Id: Iffa4173c2e6bb54e411defc7bdc44254669be5fd
    Closes-Bug: #1550303

tags: added: on-verification
TatyanaGladysheva (tgladysheva) wrote :

Verified on 8.0 + mu4 updates.

Steps to verify:
1. Deploy a cluster with 3 controllers and with Ceilometer.
2. SSH to a controller.
3. Run 'pcs resource show p_rabbitmq-server'.
4. Run 'rabbitmqctl list_queues slave_pids name'.

Before the fix:
3. The rabbitmq resource has the attributes 'enable_rpc_ha=true' and 'enable_notifications_ha=true'.
4. All queues have slave replicas.
Please see http://paste.openstack.org/show/605738/

After the fix:
3. The rabbitmq resource has the attributes 'enable_rpc_ha=false' and 'enable_notifications_ha=true'.
4. No queues have slave replicas except the Ceilometer queues (those whose names start with 'event.', 'metering.' or 'notifications.').
Please see http://paste.openstack.org/show/605736/
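The check in step 4 could be automated roughly as follows. This is a minimal sketch, not part of the fix: it assumes the output of 'rabbitmqctl list_queues slave_pids name' is tab-separated with the queue name in the last column, that an empty or '[]' first column means no slave replicas, and that banner lines such as "Listing queues ..." may be present. The function names and the sample node names are made up for illustration.

```python
# Queue name prefixes that are expected to stay mirrored after the fix,
# taken from the verification notes above.
CEILOMETER_PREFIXES = ("event.", "metering.", "notifications.")


def parse_list_queues(output):
    """Map each queue name to True if it has slave replicas.

    Assumes tab-separated 'slave_pids<TAB>name' lines; the exact output
    format may differ between RabbitMQ versions.
    """
    queues = {}
    for line in output.splitlines():
        line = line.rstrip()
        # Skip blank lines and banner lines printed by some versions.
        if not line or line.startswith(("Listing queues", "...done")):
            continue
        slave_pids, _, name = line.rpartition("\t")
        queues[name] = slave_pids.strip() not in ("", "[]")
    return queues


def unexpected_mirroring(queues):
    """Return queue names whose mirroring state differs from the expected one.

    Expected after the fix: only Ceilometer queues have slave replicas.
    """
    return [
        name
        for name, has_slaves in sorted(queues.items())
        if has_slaves != name.startswith(CEILOMETER_PREFIXES)
    ]


if __name__ == "__main__":
    # Made-up sample output; node names and pids are illustrative only.
    sample = (
        "Listing queues ...\n"
        "[<rabbit@node-2.1.100.0>, <rabbit@node-3.1.101.0>]\tnotifications.info\n"
        "\tconductor\n"
        "[]\tscheduler\n"
    )
    print(unexpected_mirroring(parse_list_queues(sample)))  # prints []
```

An empty result means every queue matches the "after the fix" state; any listed name is either an RPC queue that is still mirrored or a Ceilometer queue that has lost its replicas.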

tags: removed: on-verification