The last received value for a datapoint should be written into whisper file

Bug #973420 reported by Anton Tolchanov on 2012-04-04
This bug affects 1 person
Affects: carbon
Status: Fix Committed
Importance: Medium
Assigned to: Unassigned
Milestone: none

Bug Description

I have multiple clients sending metrics to one carbon-aggregator, which does some simple aggregation and pushes the metrics to a single carbon-cache. All client metrics are generated each minute and carry the same timestamp, but metrics from different servers do not reach carbon-aggregator simultaneously: there can be a delay of several seconds. For example, the 'server1.event' metric for 08:00:00 is received by carbon-aggregator at 08:00:02, while the 'server2.event' metric reaches carbon-aggregator at 08:00:25. Carbon-aggregator generates a 'total' aggregated metric with a 60-second aggregation interval:

total.events (60) = sum server*.event

While debugging the metric flow, I see that carbon-aggregator, as it receives metrics from clients, sends the same aggregated metric to carbon-cache several times. For example:

23/12/2011 13:39:22 :: total.events 1324643940.0 48136.0
23/12/2011 13:40:31 :: total.events 1324643940.0 251980.0

Obviously, the last sent value (251980 in this case) is the correct one.
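To illustrate why the same timestamp is emitted twice with a growing value, here is a minimal sketch of a per-interval sum buffer. It is not carbon-aggregator's actual code: the `IntervalBuffer` class is hypothetical, and the second server's value (203844) is an assumption, derived as the difference between the two logged totals on the premise that only two servers contributed.

```python
from collections import defaultdict


class IntervalBuffer:
    """Hypothetical stand-in for an aggregation buffer (not carbon's real class)."""

    def __init__(self, frequency=60):
        self.frequency = frequency
        # Maps the start of each aggregation interval to the values received for it.
        self.values = defaultdict(list)

    def add(self, timestamp, value):
        # Align the timestamp down to the start of its aggregation interval.
        interval = int(timestamp) - int(timestamp) % self.frequency
        self.values[interval].append(value)

    def flush(self):
        # Emit the running sum for every interval seen so far.
        return [(interval, float(sum(vals)))
                for interval, vals in sorted(self.values.items())]


buf = IntervalBuffer(frequency=60)

buf.add(1324643940, 48136)    # server1.event arrives first
first_flush = buf.flush()     # partial sum is sent to carbon-cache

buf.add(1324643940, 203844)   # server2.event arrives later (assumed value)
second_flush = buf.flush()    # complete sum is sent with the same timestamp

print(first_flush)
print(second_flush)
```

Both flushes carry the timestamp 1324643940, so whatever stores them has to decide which of the two values to keep.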

When carbon-cache gets several values for the same metric and timestamp, it seems to store all of them internally. When the Graphite web interface gets those metrics from the carbon-cache RAM cache, the correct (last) value is displayed. However, when carbon-cache writes those metrics to the whisper file, the first received value (48136 in this case) is written, which is incorrect. As a result, "fresh" metrics (not yet written to whisper files) are graphed correctly, while older values (fetched from whisper files rather than from carbon-cache) are incorrect.
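The difference between the two behaviors can be sketched as follows. This is only an illustration of "first value wins" versus "last value wins" when several points share a timestamp; the function names are hypothetical and this is not the code in whisper.py.

```python
def dedupe_first_wins(points):
    """Keep the first value received for each timestamp (the behavior observed on disk)."""
    kept = {}
    for timestamp, value in points:
        kept.setdefault(timestamp, value)  # later duplicates are ignored
    return sorted(kept.items())


def dedupe_last_wins(points):
    """Keep the last value received for each timestamp (the expected behavior)."""
    kept = {}
    for timestamp, value in points:
        kept[timestamp] = value  # later duplicates overwrite earlier ones
    return sorted(kept.items())


# Points in the order carbon-cache received them, taken from the log above.
received = [(1324643940, 48136.0), (1324643940, 251980.0)]

print(dedupe_first_wins(received))
print(dedupe_last_wins(received))
```

With the two points from the log, the first function keeps 48136.0 and the second keeps 251980.0, which is the value that should end up in the whisper file.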

I was able to fix this with a simple one-line patch to whisper.py (see attached).

Thanks,
Anton.

Anton Tolchanov (knyar) wrote :
Michael Leinartas (mleinartas) wrote :

Indeed, the documented behavior is that the last point sent should win. Thanks for the good find.

Michael Leinartas (mleinartas) wrote :

Committed in r736

Changed in graphite:
importance: Undecided → Medium
milestone: none → 0.9.10
status: New → Fix Committed
Sidnei da Silva (sidnei) on 2012-05-08
affects: graphite → carbon
Changed in carbon:
milestone: 0.9.10 → none