os-collect-config polling can end up synchronized across multiple servers causing performance problems

Bug #1677314 reported by Alex Schultz
Affects            Status        Importance  Assigned to   Milestone
os-collect-config  Fix Released  Undecided   Alex Schultz
tripleo            Fix Released  Medium      Alex Schultz

Bug Description

Given the nature of the design, os-collect-config polling can end up synchronized if os-collect-config is started at the same time on multiple nodes. With the 30-second poll period, CPU utilization goes to 100% every 30 seconds, for several seconds of each poll period, resulting in high utilization on the source (if configured to use a remote source).
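
To illustrate the synchronization described above, here is a minimal simulation sketch. It is not os-collect-config code: the function name peak_polls_per_second, its parameters, and the assumption that every node starts at t=0 and polls on a fixed interval are all hypothetical, chosen only to show how identical start times stack the polls into the same second and how a random initial offset spreads them out.

```python
import random
from collections import Counter


def peak_polls_per_second(num_nodes, interval=30, duration=300, splay=0, seed=0):
    """Return the largest number of nodes that poll within the same second.

    Hypothetical model: every node starts at t=0 and polls every `interval`
    seconds. If `splay` > 0, each node first waits a random delay in
    [0, splay] seconds (an assumption about how an offset could be applied).
    """
    rng = random.Random(seed)
    buckets = Counter()
    for _ in range(num_nodes):
        # One random initial delay per node; zero when splay is disabled.
        t = rng.uniform(0, splay) if splay > 0 else 0.0
        while t < duration:
            buckets[int(t)] += 1  # count the poll in its one-second bucket
            t += interval
    return max(buckets.values())


if __name__ == "__main__":
    print("no splay:", peak_polls_per_second(100))
    print("splay=30:", peak_polls_per_second(100, splay=30))
```

Without an offset, every node lands in the same one-second bucket, so the peak equals the node count. With an offset, the same number of polls is spread across the poll period.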

OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-collect-config (master)

Fix proposed to branch: master
Review: https://review.openstack.org/451478

Changed in os-collect-config:
assignee: nobody → Alex Schultz (alex-schultz)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to os-collect-config (master)

Reviewed: https://review.openstack.org/451478
Committed: https://git.openstack.org/cgit/openstack/os-collect-config/commit/?id=0762fcb0b5872f4fbbf6ce8e5086140c65edf185
Submitter: Jenkins
Branch: master

commit 0762fcb0b5872f4fbbf6ce8e5086140c65edf185
Author: Alex Schultz <email address hidden>
Date: Wed Mar 29 10:25:13 2017 -0600

    Add splay option to offset polling intervals

    When the os-collect-config process is started on multiple systems at the
    same time, the polling intervals can line up to cause performance
    problems against the configuration source. To reduce the impact, this
    change adds a splay option to allow the operator to configure a random
    delay prior to the polling to attempt to offset the polling
    synchronization.

    Change-Id: I1a8be3345d783da9014eca7ea26da19d57e767c0
    Closes-Bug: #1677314
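
The splay idea in the commit message above can be sketched roughly as follows. This is an illustrative sketch, not the code from the commit: the names splay_delay and poll_forever, the injectable sleep argument, the max_polls limit, and the choice to apply the delay once before the first poll are all assumptions made for the example.

```python
import random
import time


def splay_delay(max_splay, rng=random):
    """Pick a random delay in [0, max_splay] seconds; 0 or less disables splay."""
    if max_splay <= 0:
        return 0.0
    return rng.uniform(0, max_splay)


def poll_forever(collect, interval=30, max_splay=0, sleep=time.sleep, max_polls=None):
    """Call `collect` every `interval` seconds after one random initial delay.

    `sleep` and `max_polls` are hypothetical hooks so the loop can be
    exercised without actually waiting or running forever.
    """
    # Assumed behaviour: offset this node once, before polling starts.
    sleep(splay_delay(max_splay))
    polls = 0
    while max_polls is None or polls < max_polls:
        collect()
        polls += 1
        sleep(interval)
```

Because each node draws its own delay, nodes started at the same moment no longer reach the configuration source at the same moment, while the polling period itself is unchanged.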

Changed in os-collect-config:
status: In Progress → Fix Released
Changed in tripleo:
status: New → Triaged
importance: Undecided → Medium
milestone: none → pike-2
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.openstack.org/454358

Changed in tripleo:
assignee: nobody → Alex Schultz (alex-schultz)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-collect-config 7.0.0.0b1

This issue was fixed in the openstack/os-collect-config 7.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/454358
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=0d59488780da6269dc71072581f075f6859606ea
Submitter: Jenkins
Branch: master

commit 0d59488780da6269dc71072581f075f6859606ea
Author: Alex Schultz <email address hidden>
Date: Thu Apr 6 15:36:51 2017 -0600

    Enable splay for os-collect-config

    At scale, having the os-collect-config instances all check in at the
    same time can cause performance problems. This change enables splay and
    sets it to a default maximum random sleep of 30 seconds prior to the
    os-collect-config polling.

    Change-Id: Iab8b51f4e5fb4727b8aa7e081f5cbfcbf11f7fcb
    Depends-On: I88f623c9e8db9ed4a186918206a63faec8f7f673
    Closes-Bug: #1677314

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 7.0.0.0b2

This issue was fixed in the openstack/tripleo-heat-templates 7.0.0.0b2 development milestone.
