ingestion latency of metricd is not constant

Bug #1543121 reported by Mehdi Abaakouk
Affects   Status         Importance   Assigned to    Milestone
Gnocchi   Fix Released   Low          gordon chung

Bug Description

Hi,

The current metricd implementation for Carbonara-based drivers does not ensure that metricd handles resources in the same order in which the Gnocchi API wrote them.

As a result, the latency between the moment the Gnocchi API puts the datapoints of a metric in the storage and the moment metricd handles these datapoints is unpredictable.

In the worst case, if metricd always has data to ingest, some datapoints may never be processed.

Cheers,

Julien Danjou (jdanjou)
Changed in gnocchi:
status: New → Triaged
importance: Undecided → Low
Changed in gnocchi:
assignee: nobody → gordon chung (chungg)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to gnocchi (master)

Reviewed: https://review.openstack.org/279659
Committed: https://git.openstack.org/cgit/openstack/gnocchi/commit/?id=6474be2318f472b9ae4f2d0fae1174d9a5abf113
Submitter: Jenkins
Branch: master

commit 6474be2318f472b9ae4f2d0fae1174d9a5abf113
Author: gordon chung <email address hidden>
Date: Fri Feb 12 11:28:58 2016 -0500

    partition unprocessed measures across workers

    this patch assigned each worker to work on a specific partition of
    measures backlog. it uses the metric reporting worker to dynamically
    set the size of each partition.

    by default each processing worker will handle a specific block of 128
    metrics. the reporting worker will periodically check backlog size
    and broadcast to all processing workers whether to grab larger chunks
    (if backlog is big) or smaller chunks (when backlog is small).

    this minimises overlap between workers. it may result in no-op
    workers if backlog is very small. overlap may occur as the workers
    query backlog at different times and may possibly have different block
    sizes. overlap will occur the most in ceph driver because there is no
    way to grab metrics as a whole but it should still minimise the overlap
    to an extent.

    a minimum block size is set at 16 metrics and a maximum of 256.
    in theory if 8 workers are set per service, it will grab roughly 128
    to 2048 unique metrics at a time.

    Closes-Bug: #1543121
    Change-Id: Iad7b6b9a56fefcbe85e343e38c0738c2b33efb92
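
The partitioning scheme described in the commit message can be sketched roughly as follows. This is an illustration only, not the code merged in the commit: the function names, the doubling/halving policy and the thresholds used to decide when the backlog is "big" or "small" are all assumptions. Only the block sizes (default 128, minimum 16, maximum 256) come from the commit message.

```python
# Illustrative sketch only: every name here is hypothetical, not Gnocchi's API.
MIN_BLOCK = 16       # minimum block size stated in the commit message
MAX_BLOCK = 256      # maximum block size stated in the commit message
DEFAULT_BLOCK = 128  # default block size stated in the commit message


def next_block_size(current, backlog, workers):
    """Pick the block size the reporting worker would broadcast.

    Assumed policy: double the block when the backlog exceeds what all
    workers cover in one pass, halve it when the backlog is under half
    of that, otherwise leave it unchanged. The result is clamped to
    the range [MIN_BLOCK, MAX_BLOCK].
    """
    covered = current * workers
    if backlog > covered:
        current *= 2
    elif backlog < covered // 2:
        current //= 2
    return max(MIN_BLOCK, min(MAX_BLOCK, current))


def partition(backlog, worker_index, block_size):
    """Return the slice of an ordered backlog owned by one worker.

    Worker i takes items [i * block_size, (i + 1) * block_size). A
    worker whose slice starts past the end of the backlog gets an
    empty list, which corresponds to the "no-op worker" case.
    """
    start = worker_index * block_size
    return backlog[start:start + block_size]


if __name__ == "__main__":
    workers = 8
    backlog = ["metric-%04d" % i for i in range(1000)]
    for i in range(workers):
        block = partition(backlog, i, DEFAULT_BLOCK)
        print("worker %d handles %d metrics" % (i, len(block)))
    # Range covered by one pass of 8 workers at the min and max sizes.
    print(workers * MIN_BLOCK, "to", workers * MAX_BLOCK)
```

With 8 workers, the last line reproduces the "128 to 2048" figure from the commit message (8 × 16 and 8 × 256). In this sketch the slices never overlap because every worker sees the same list and the same block size; as the commit message notes, real workers query the backlog at different times and may hold different block sizes, so some overlap remains possible.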

Changed in gnocchi:
status: In Progress → Fix Committed
Julien Danjou (jdanjou)
Changed in gnocchi:
milestone: none → 2.0.0
status: Fix Committed → Fix Released