sweeping all amqp related units on every hook event does not scale

Bug #1698340 reported by Edward Hope-Morley
This bug affects 1 person
Affects: OpenStack RabbitMQ Server Charm
Status: Fix Released
Importance: Medium
Assigned to: Edward Hope-Morley

Bug Description

Currently, on every hook fired, we call update_clients(), which calls amqp_changed(), which (amongst other things, and on the leader only) calls configure_amqp(), which runs rabbitmqctl commands (that are slow to execute) for every unit related to the charm on the amqp relation. This of course includes the update-status hook, which fires every 15 minutes. In larger deployments, which could easily have hundreds if not thousands of related units (e.g. nova-compute), this call will take a very long time to complete.

I understand the motivation for this is to ensure that the correct settings are applied at all times, and that a run with nothing to change should have no effect. But since executing the commands themselves, even if idempotent, is an expensive operation, I believe there are some obvious optimisations that could be implemented to mitigate the cost of these actions.
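One possible optimisation, sketched below in Python, is to remember a digest of the settings last applied for each related unit and skip the slow step when nothing has changed. This is only an illustration under stated assumptions, not the charm's actual code: `AmqpConfigurator`, `sweep` and `apply_fn` are hypothetical names, and `apply_fn` merely stands in for the rabbitmqctl-backed work done by configure_amqp().

```python
import hashlib
import json


class AmqpConfigurator:
    """Hypothetical wrapper that skips the slow call when nothing changed."""

    def __init__(self, apply_fn):
        # apply_fn is a stand-in for the expensive rabbitmqctl-backed step.
        self._apply_fn = apply_fn
        # Maps unit name -> digest of the settings last applied for it.
        self._seen = {}

    @staticmethod
    def _digest(settings):
        # Sort keys so that equal settings always produce the same digest.
        payload = json.dumps(settings, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def configure(self, unit, settings):
        """Apply settings for one unit; return True only if work was done."""
        digest = self._digest(settings)
        if self._seen.get(unit) == digest:
            return False  # unchanged since the last run, skip the slow call
        self._apply_fn(unit, settings)
        self._seen[unit] = digest
        return True


def sweep(configurator, related_units):
    """Visit every related unit; return how many needed reconfiguring."""
    return sum(
        configurator.configure(unit, settings)
        for unit, settings in related_units.items()
    )
```

With this shape, a second sweep over unchanged units makes no slow calls at all. Since each hook runs as a separate process, a real implementation would have to keep the digests in some store that persists between hook invocations rather than in memory as shown here.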

Tags: openstack sts
Alex Kavanagh (ajkavanagh) wrote :

Sorry to nitpick, but I think update-status is every 5 minutes, which makes the problem worse. I would agree that you could drop it from the 'update-status' hook, and probably from every other hook except for leader changed and amqp-relation-changed.
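A rough sketch of that idea in Python: decide from the hook name whether the full sweep should run at all. The names `SWEEP_HOOKS`, `current_hook` and `should_sweep` are hypothetical, and mapping "leader changed" onto the Juju hooks leader-elected and leader-settings-changed is my assumption rather than anything taken from the charm.

```python
import os

# Hooks on which a full sweep of amqp units still seems warranted.
# Assumption: "leader changed" is read here as these two Juju leadership hooks.
SWEEP_HOOKS = frozenset({
    "amqp-relation-changed",
    "leader-elected",
    "leader-settings-changed",
})


def current_hook(argv0):
    """Derive the hook name from the path the hook script was invoked as."""
    return os.path.basename(argv0)


def should_sweep(hook_name):
    """Return True only for hooks that can actually change amqp settings."""
    return hook_name in SWEEP_HOOKS
```

For example, `should_sweep(current_hook("hooks/update-status"))` is False, so update-status would no longer trigger any rabbitmqctl calls.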

I expect that the update_clients() function, however, is a workaround for race hazards or other async-like problems with rabbitmq, which we've had trouble with in the past, and that the update-status hook 'fixes' them after the fact. Not sure how you would progress this.

Edward Hope-Morley (hopem) wrote :

@ajkavanagh sorry, yes, you're absolutely correct. Basically, amqp_changed() is a bit of a legacy patchwork. I am having a go at making it less noisy.

Edward Hope-Morley (hopem) wrote :

Fix proposed to branch: master

Review: https://review.openstack.org/474958

Changed in charm-rabbitmq-server:
status: Triaged → In Progress
assignee: nobody → Edward Hope-Morley (hopem)
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-rabbitmq-server (master)

Reviewed: https://review.openstack.org/474958
Committed: https://git.openstack.org/cgit/openstack/charm-rabbitmq-server/commit/?id=da3a7063bbcf0c262957149d5c6c2fa3d0525599
Submitter: Jenkins
Branch: master

commit da3a7063bbcf0c262957149d5c6c2fa3d0525599
Author: Edward Hope-Morley <email address hidden>
Date: Fri Jun 16 13:37:20 2017 +0100

    Refactor amqp_changed to make it less noisy

    Currently each and every hook executed results in a sweep of all
    amqp relations and their related units to ensure config consistency
    both in rabbit and for remote units. This scales very poorly since
    rabbitmqctl commands take a while to complete and the majority of
    the time they are not needed. This patch aims to reduce the impact
    of performing this set of operations and the amount of time it takes
    to do them.

    Change-Id: Ia060ce34052cd543a63d045944b35e4188279f05
    Closes-Bug: 1698340

Changed in charm-rabbitmq-server:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-rabbitmq-server (stable/17.02)

Fix proposed to branch: stable/17.02
Review: https://review.openstack.org/479339

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-rabbitmq-server (stable/17.02)

Reviewed: https://review.openstack.org/479339
Committed: https://git.openstack.org/cgit/openstack/charm-rabbitmq-server/commit/?id=364d517600b7bbf8a5dd87d142dce0ddc5be467d
Submitter: Jenkins
Branch: stable/17.02

commit 364d517600b7bbf8a5dd87d142dce0ddc5be467d
Author: Edward Hope-Morley <email address hidden>
Date: Fri Jun 16 13:37:20 2017 +0100

    Refactor amqp_changed to make it less noisy

    Currently each and every hook executed results in a sweep of all
    amqp relations and their related units to ensure config consistency
    both in rabbit and for remote units. This scales very poorly since
    rabbitmqctl commands take a while to complete and the majority of
    the time they are not needed. This patch aims to reduce the impact
    of performing this set of operations and the amount of time it takes
    to do them.

    Change-Id: Ia060ce34052cd543a63d045944b35e4188279f05
    Closes-Bug: 1698340
    (cherry picked from commit da3a7063bbcf0c262957149d5c6c2fa3d0525599)

James Page (james-page)
Changed in charm-rabbitmq-server:
status: Fix Committed → Fix Released