Whisper aggregation method cannot be specified in storage schema

Bug #853955 reported by Jeremy Thurgood
Affects: Graphite
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

We have a variety of metrics for which averaging to get the value for the next retention level is not appropriate. Whisper seems to support several aggregation methods, but these must be set manually (from the command line or through the cache management protocol), which is problematic for us.

I would like to add a parameter to the storage schema definition that sets the aggregation method for newly-created whisper files. Would I be stepping on any toes if I submitted a patch that does this?

Jeremy Thurgood (jerith) wrote :

While I'm at it, I'll add a config option for xFilesFactor as well.

chrismd (chrismd) wrote :

Initially we considered making the aggregation method a parameter of the storage schema for this very reason, but the problem is that there is (in general) little correspondence between the way storage schemas are defined and the way aggregation methods are defined. For example, users often configure their storage schemas based on how frequently they collect data, which varies by data source, not by data type (i.e. you might collect all your apache metrics minutely, but for request and latency metrics you'd want different aggregation methods). So the solution we decided on (but haven't yet implemented) is to have a separate configuration file that defines what aggregation methods should be applied to metrics matching certain patterns. That way your storage-schemas.conf can define storage schemas for "applications.apache.*" while your aggregation-rules.conf (or whatever) might define sum aggregation for '.*requests' and average aggregation for '.*latency', etc.
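
To make the idea concrete, here is a minimal Python sketch of pattern-based rule lookup. It is not the Graphite implementation: the rule table, the function name, the first-match-wins ordering, and the fallback to "average" are all assumptions made for illustration. Only the two patterns come from the comment above.

```python
import re

# Hypothetical rule table: (compiled pattern, aggregation method).
# The two patterns are the examples from the comment; everything else is assumed.
AGGREGATION_RULES = [
    (re.compile(r".*requests$"), "sum"),
    (re.compile(r".*latency$"), "average"),
]

# Assumed fallback when no rule matches a metric name.
DEFAULT_METHOD = "average"


def aggregation_method_for(metric_name):
    """Return the method of the first rule whose pattern matches the name."""
    for pattern, method in AGGREGATION_RULES:
        if pattern.search(metric_name):
            return method
    return DEFAULT_METHOD
```

With this table, a metric such as "applications.apache.web01.requests" would get sum aggregation regardless of which storage schema its retention settings come from, which is the decoupling the comment describes.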

Jeremy Thurgood (jerith) wrote :

That actually makes more sense. Should I go ahead and implement this instead?

We need it fairly urgently, so I'm going to have to implement something for our own use, but I'd prefer it to be something suitable for general usage rather than just our own immediate needs.

Jeremy Thurgood (jerith) wrote :

Please see the attached branch. It adds the extra config file and some documentation to go with it.

chrismd (chrismd) wrote :

Branch merged, thanks Jeremy

Changed in graphite:
status: New → Fix Committed
Bruce Lsyik (blysik) wrote :

How do I use this in 0.9.9? I have a counter, where I'd like to see the MAX, I think, instead of AVERAGE, for each interval. How do I configure this?

Pablo de Leon belloc (pablolb) wrote :

Hi,

We are using Carbon/Graphite and Logster.

Logster parses our error and access logs and produces the output which is fed to carbon, but if it does NOT find any lines, it will not output any values.

This means that most of the WSP is filled up with "None" values.

And when it aggregates, the default xFilesFactor of 0.5 is so high that all other higher intervals are None values.

My question is: if we use xFilesFactor = 0, are the None values treated as 0? E.g. will avg(None, None, 3) be treated as avg(0, 0, 3) = 1 or as avg(3) = 3?

Thanks,
Pablo

Michael Leinartas (mleinartas) wrote :

Released in 0.9.9

Changed in graphite:
status: Fix Committed → Fix Released
Michael Leinartas (mleinartas) wrote :

Bruce: check out the docs for storage-aggregation.conf here: http://readthedocs.org/docs/graphite/en/latest/config-carbon.html#storage-aggregation-conf
For existing metrics, you'll need to use the utility included in whisper called 'set-aggregation-method.py'
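
For the MAX case, a rough sketch of what such a rule could look like, and how it might be read, follows. The option names (pattern, xFilesFactor, aggregationMethod) are as I understand the linked docs, so please verify them there. The section names, the `\.count$` pattern, and the helper functions are hypothetical, and the parsing uses Python's standard configparser rather than carbon's own loader.

```python
import configparser
import re

# Hypothetical storage-aggregation.conf contents. Section names and the
# "\.count$" pattern are made up for illustration; check the option names
# against the linked documentation before relying on them.
EXAMPLE_CONF = r"""
[counter_max]
pattern = \.count$
xFilesFactor = 0.5
aggregationMethod = max

[default_average]
pattern = .*
xFilesFactor = 0.5
aggregationMethod = average
"""


def load_rules(text):
    """Parse the config text into (pattern, xFilesFactor, method) tuples, in file order."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    rules = []
    for name in parser.sections():
        section = parser[name]
        rules.append((
            re.compile(section["pattern"]),
            section.getfloat("xFilesFactor"),
            section["aggregationMethod"],
        ))
    return rules


def settings_for(metric_name, rules):
    """Return (xFilesFactor, method) from the first matching rule, or None."""
    for pattern, x_files_factor, method in rules:
        if pattern.search(metric_name):
            return x_files_factor, method
    return None
```

In this sketch the catch-all section comes last because rules are checked in file order and the first match is used. As noted above, rules like these only affect newly created whisper files.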

Pablo: if you set xFilesFactor to 0, the None values will be ignored while calculating the average (None, None, and 3 will average to 3), but values will then always propagate to the next retention rather than being dropped when less than 50% of the period has values.
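
The behaviour described above can be sketched in a few lines of Python. This is a simplified model for illustration, not whisper's actual code: the function name and signature are made up, and it assumes a slot is only aggregated when the fraction of known (non-None) points is at least xFilesFactor.

```python
def aggregate_average(values, x_files_factor):
    """Average the known points, or return None if too few of them are known.

    Simplified model: None entries are ignored, never counted as 0.
    """
    known = [v for v in values if v is not None]
    if not known:
        # Nothing to aggregate, whatever the xFilesFactor is.
        return None
    if len(known) / len(values) < x_files_factor:
        # Too sparse: the aggregated slot stays empty.
        return None
    return sum(known) / len(known)
```

Under this model, `aggregate_average([None, None, 3], 0.0)` gives 3, while the same points with the default of 0.5 give None because only one third of the period has values.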
