jitter doesn't work

Bug #1734898 reported by gordon chung
This bug affects 1 person
Affects: Ceilometer
Status: Fix Released
Importance: Medium
Assigned to: Julien Danjou
Milestone: (none)

Bug Description

a long time ago, we added jitter[1] so that the polling agent would not hit all endpoints at once. this would be great, but i'm pretty sure it doesn't do anything.

iiuc, all pollsters are grouped by polling interval, so if the polling agent is configured to get all possible meters every 15s, there will be one polling task which has all the pollsters in it.

the jitter, as originally added, delays a polling task as a whole, not the individual pollsters inside it. so in the above scenario it just delays the single 15s polling task.

it really seems like the jitter only solves a very rare scenario: multiple agents that somehow were started at the exact same time and are polling the exact same endpoints. in that scenario, the jitter will attempt to unsynchronise them.
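
A minimal sketch of the behaviour described above, not Ceilometer's actual code. The function names (`group_by_interval`, `task_level_jitter`), the `max_jitter` value, and the meter names are all hypothetical and chosen only for illustration. It shows that when every pollster shares one interval, there is a single task, and a per-task delay gives every pollster the same offset:

```python
import random
from collections import defaultdict


def group_by_interval(pollsters):
    """Group (name, interval) pairs into one polling task per interval."""
    tasks = defaultdict(list)
    for name, interval in pollsters:
        tasks[interval].append(name)
    return dict(tasks)


def task_level_jitter(tasks, max_jitter, rng):
    """Draw one random delay per task; each pollster inherits its task's delay."""
    offsets = {}
    for names in tasks.values():
        delay = rng.uniform(0, max_jitter)  # one draw for the whole task
        for name in names:
            offsets[name] = delay
    return offsets


# Hypothetical configuration: every meter is polled every 15s.
pollsters = [("cpu", 15), ("memory.usage", 15), ("disk.read.bytes", 15)]
tasks = group_by_interval(pollsters)
offsets = task_level_jitter(tasks, max_jitter=10, rng=random.Random())

print(len(tasks))                  # a single task for the 15s interval
print(len(set(offsets.values())))  # a single distinct offset shared by all pollsters
```

Under these assumptions the delay only shifts the whole batch; the pollsters inside the task still fire together.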

[1] I12e3f104fc92fe15adc05e2b981627f31ee5bfaa

gordon chung (chungg)
Changed in ceilometer:
status: New → Triaged
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ceilometer (master)

Fix proposed to branch: master
Review: https://review.openstack.org/523892

Changed in ceilometer:
assignee: nobody → Julien Danjou (jdanjou)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix merged to ceilometer (master)

Reviewed: https://review.openstack.org/522503
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=faac031a9b6893963375674f031e28a8c486c2a8
Submitter: Zuul
Branch: master

commit faac031a9b6893963375674f031e28a8c486c2a8
Author: Julien Danjou <email address hidden>
Date: Thu Nov 23 10:41:03 2017 +0100

    Remove shuffle_time_before_polling_task option

    The problem that shuffle_time_before_polling_task tries to solve is the startup
    of a horde of Ceilometer instances that would start polling the same thing at
    the same time.

    It's actually unlikely they would all start at the same right second, and the
    correct fix would be to do that each time.

    Related-Bug: #1734898

    Change-Id: If8141f6b48657c06e8e782eeef9b209dabb2097c

OpenStack Infra (hudson-openstack) wrote : Fix merged to ceilometer (master)

Reviewed: https://review.openstack.org/523892
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=1630d30a922c745a43f26824efe8d8b4b91eda7f
Submitter: Zuul
Branch: master

commit 1630d30a922c745a43f26824efe8d8b4b91eda7f
Author: Julien Danjou <email address hidden>
Date: Wed Nov 29 15:38:50 2017 +0100

    polling: iter randomly over sources and pollsters when polling

    By polling in a random order sources and pollsters, it is more likely that
    different Ceilometer agents will not hit the same e.g. API endpoint at the same
    time.

    Change-Id: I754a67a8adfb97f8950c666f9aab3bc3d435e2ac
    Closes-Bug: #1734898
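
The idea in the commit message above can be sketched as follows. This is an illustration under stated assumptions, not the code from the merged patch: the helper names (`iter_random`, `poll_order`), the source names, and the meter names are hypothetical.

```python
import random


def iter_random(iterable, rng=random):
    """Return an iterator over the items in random order, leaving the input untouched."""
    items = list(iterable)
    rng.shuffle(items)
    return iter(items)


def poll_order(sources, rng=random):
    """Yield (source, pollster) pairs, shuffling the sources and each source's pollsters."""
    for source in iter_random(sources, rng):
        for pollster in iter_random(sources[source], rng):
            yield source, pollster


# Hypothetical sources, each listing the pollsters it would run.
sources = {
    "compute_source": ["cpu", "memory.usage"],
    "network_source": ["network.incoming.bytes", "network.outgoing.bytes"],
}

# Two agents with independent random state will usually visit the
# pollsters in different orders on any given cycle.
agent_a = list(poll_order(sources, random.Random()))
agent_b = list(poll_order(sources, random.Random()))
print(agent_a)
print(agent_b)
```

Because the order is redrawn on every cycle, agents that happen to start in step are less likely to stay in step on the same endpoint, which is the scenario the original per-task jitter did not address.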

Changed in ceilometer:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ceilometer 10.0.0

This issue was fixed in the openstack/ceilometer 10.0.0 release.
