data corruption errors when getting unaggregated

Bug #1676519 reported by gordon chung
This bug affects 1 person
Affects   Status         Importance   Assigned to    Milestone
Gnocchi   Fix Released   Undecided    gordon chung

Bug Description

i am sending individual timestamp+value posts across 20K metrics over an hour with minute granularity.
if i stop and start a metricd agent, i get quite a few data corruption errors when it starts back up. this only seems to happen after a restart. still debugging why.

2017-03-27 13:10:52.890 21207 ERROR gnocchi.storage._carbonara [-] Data corruption detected for 56106987-85b6-4db3-8224-86872d6e62f0 unaggregated timeserie

gordon chung (chungg) wrote :

seems related to the new lz4 release. prior to 0.9.0, compressing and decompressing an empty value was valid. for example, lz4.loads(lz4.dumps(b'')) works in 0.8.2, but in 0.9.0 it throws an error instead.

in theory, it actually is an error for the unaggregated object to be empty... i imagine it happens because metricd is killed in the middle of the initial aggregation, before the unaggregated data is saved.
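
The guard this suggests can be sketched roughly as follows: treat an empty stored payload as "no data yet" rather than handing it to the decompressor. This is only an illustration under stated assumptions. `read_unaggregated` and its `decompress` parameter are hypothetical names, not Gnocchi's API, and zlib from the standard library stands in for lz4 purely because it also refuses empty input.

```python
import zlib


def read_unaggregated(payload, decompress=zlib.decompress):
    """Return the decompressed bytes, or None if the stored object is empty.

    Hypothetical helper: the name and signature are illustrative only.
    zlib is a stand-in for lz4 (assumption, not what Gnocchi uses).
    """
    if not payload:
        # Empty or missing object: nothing was saved before the agent was
        # stopped, so report "no data" instead of raising.
        return None
    return decompress(payload)


stored = zlib.compress(b"2017-03-27T13:10:00,42.0")
print(read_unaggregated(stored))  # the original bytes
print(read_unaggregated(b""))     # None, no exception raised
```

Without the emptiness check, the empty payload would reach the decompressor and surface as a corruption-style error.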

OpenStack Infra (hudson-openstack) wrote : Fix proposed to gnocchi (master)

Fix proposed to branch: master
Review: https://review.openstack.org/450439

Changed in gnocchi:
assignee: nobody → gordon chung (chungg)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to gnocchi (master)

Reviewed: https://review.openstack.org/450439
Committed: https://git.openstack.org/cgit/openstack/gnocchi/commit/?id=4ac9d53383b0db8fd07f8073df63d61334e22cd6
Submitter: Jenkins
Branch: master

commit 4ac9d53383b0db8fd07f8073df63d61334e22cd6
Author: gord chung <email address hidden>
Date: Mon Mar 27 19:54:57 2017 +0000

    don't raise error if unaggregated empty

    new lz4 library doesn't like handling empty binary. if we kill
    agent during computation of aggregates, the unaggregated object
    might have been created (in ceph/redis case) but it may not have saved
    unaggregated measures leaving the object blank.

    this patch returns None and lets workflow proceed as if new if
    object is empty since in scenario above, the original raw measures
    will not have been cleared from unprocessed so they will still be
    processed again.

    also, fixes redis issue where passing in None makes the redis actually
    store 'None'.

    Change-Id: I358e50ccadff721348630688c47544db6553e96b
    Closes-Bug: #1676519
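
The two behaviours described in the commit message can be illustrated with a small sketch. It is not the actual patch: `FakeStore`, `create_unaggregated` and `get_unaggregated` are hypothetical names, and the in-memory store only mimics a backend that turns non-bytes values into text, which is assumed here to be how a literal 'None' ended up stored.

```python
class FakeStore:
    """In-memory stand-in for a key-value backend (hypothetical)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # Assumed behaviour: non-bytes values are stringified, so None
        # is stored as the text 'None' rather than as an empty value.
        if not isinstance(value, bytes):
            value = str(value).encode()
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def create_unaggregated(store, key, data=None):
    # Never pass None to the backend; store empty bytes instead.
    store.set(key, data if data is not None else b"")


def get_unaggregated(store, key):
    raw = store.get(key)
    # Empty or missing object: return None so the caller proceeds as if
    # the metric were new. The raw measures are still queued, so they
    # get processed again.
    return raw or None


store = FakeStore()
store.set("naive", None)
print(store.get("naive"))               # b'None' was stored

create_unaggregated(store, "metric")
print(get_unaggregated(store, "metric"))  # None: treated as new
```

Returning None instead of raising is safe in this scenario only because, as the commit message notes, the original raw measures have not been cleared from the unprocessed queue.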

Changed in gnocchi:
status: In Progress → Fix Released