SampleCount statistic tied to the rolling sum
Bug #1069840 reported by Eoghan Glynn
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Synaps | New | Undecided | Eoghan Glynn |
Bug Description
The SampleCount statistic is tied to the rolling sum maintained by pandas, as opposed to a rolling count of the observations received within the window. e.g.:
https:/
https:/
As a result, it appears that this statistic will generally be incorrect, except in the special case where the average of the datapoint stream is constantly one.
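The concern can be illustrated with a small plain-Python sketch. It assumes the rolling sum is taken over the raw datapoint values; the helper names `rolling_sum` and `rolling_count` are hypothetical stand-ins written for this example, not the pandas functions themselves.

```python
def rolling_sum(values, window):
    """Sum of the last `window` values at each position (shorter at the start)."""
    return [sum(values[max(0, i - window + 1): i + 1]) for i in range(len(values))]


def rolling_count(values, window):
    """Number of observations in the last `window` positions."""
    return [len(values[max(0, i - window + 1): i + 1]) for i in range(len(values))]


ones = [1.0, 1.0, 1.0, 1.0]       # every datapoint is exactly one
mixed = [30.0, 10.0, 11.5, 14.2]  # arbitrary datapoint values

# The two statistics agree only for the all-ones stream.
print(rolling_sum(ones, 2) == rolling_count(ones, 2))    # True
print(rolling_sum(mixed, 2) == rolling_count(mixed, 2))  # False
```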
Changed in synaps:
assignee: nobody → Eoghan Glynn (eglynn)
Let's assume that we put the following metric data.
Timestamp / Metric Value
2012-10-23 00:00:02 / 30.0
2012-10-23 00:00:59 / 10.0
2012-10-23 00:02:01 / 11.5
2012-10-23 00:03:03 / 14.2
Currently, Synaps aggregates this raw metric data into a dataframe with 1-minute resolution.
Timestamp / SampleCount / Average / Min / Max / Sum
2012-10-23 00:00:00 / 2 / 20.0 / 10.0 / 30.0 / 40.0
2012-10-23 00:01:00 / NaN / NaN / NaN / NaN / NaN
2012-10-23 00:02:00 / 1 / 11.5 / 11.5 / 11.5 / 11.5
2012-10-23 00:03:00 / 1 / 14.2 / 14.2 / 14.2 / 14.2
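A minimal sketch of this aggregation step, written in plain Python rather than with pandas, so it is not Synaps's actual code. The function name `aggregate_per_minute` is hypothetical, and `None` stands in for NaN in minutes that received no datapoints.

```python
from collections import defaultdict
from datetime import datetime, timedelta

samples = [
    ("2012-10-23 00:00:02", 30.0),
    ("2012-10-23 00:00:59", 10.0),
    ("2012-10-23 00:02:01", 11.5),
    ("2012-10-23 00:03:03", 14.2),
]


def aggregate_per_minute(samples):
    """Return (minute, SampleCount, Average, Min, Max, Sum) rows, one per minute."""
    buckets = defaultdict(list)
    for ts, value in samples:
        t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        # Truncate the timestamp to the start of its minute.
        buckets[t.replace(second=0, microsecond=0)].append(value)

    rows = []
    t, end = min(buckets), max(buckets)
    while t <= end:
        vals = buckets.get(t)
        if vals:
            rows.append((t, len(vals), sum(vals) / len(vals),
                         min(vals), max(vals), sum(vals)))
        else:
            # Empty minute: all statistics are missing.
            rows.append((t, None, None, None, None, None))
        t += timedelta(minutes=1)
    return rows


rows = aggregate_per_minute(samples)
for row in rows:
    print(row)
```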
Synaps evaluates alarms based on the dataframe above, applying the rolling functions provided by pandas.
When it rolls up the SampleCount data using 'rolling_count' with a 2-minute window, the result is as follows.
I call it 'Rolling Sample Count'.
result of rolling count (window: 2min)
2012-10-23 00:00:00 / 1
2012-10-23 00:01:00 / 1
2012-10-23 00:02:00 / 1
2012-10-23 00:03:00 / 2
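The rolling count above can be reproduced with a short plain-Python sketch. It assumes the count covers only non-missing entries in each window, with `None` standing in for NaN; the helper `rolling_count` is a hypothetical stand-in for the pandas function.

```python
# Per-minute SampleCount column from the aggregated dataframe (None = NaN).
sample_counts = [2, None, 1, 1]


def rolling_count(values, window):
    """Count non-missing entries in the last `window` positions."""
    out = []
    for i in range(len(values)):
        win = values[max(0, i - window + 1): i + 1]
        out.append(sum(1 for v in win if v is not None))
    return out


rolling_sample_count = rolling_count(sample_counts, 2)
print(rolling_sample_count)  # [1, 1, 1, 2]
```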
If it instead rolls them up using 'rolling_sum' with a 2-minute window, the result is as follows.
Before rolling, I filled NaN with 0.
I call it 'Total Sample Count'.
result of rolling sum (window: 2min)
2012-10-23 00:00:00 / 2
2012-10-23 00:01:00 / 2
2012-10-23 00:02:00 / 1
2012-10-23 00:03:00 / 2
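The rolling sum can be sketched the same way in plain Python. It assumes missing minutes are filled with 0 before summing, as described above; the helper `rolling_sum` is a hypothetical stand-in for the pandas function.

```python
# Per-minute SampleCount column from the aggregated dataframe (None = NaN).
sample_counts = [2, None, 1, 1]

# Fill missing minutes with 0 before rolling.
filled = [0 if v is None else v for v in sample_counts]


def rolling_sum(values, window):
    """Sum of the last `window` values at each position (shorter at the start)."""
    return [sum(values[max(0, i - window + 1): i + 1]) for i in range(len(values))]


total_sample_count = rolling_sum(filled, 2)
print(total_sample_count)  # [2, 2, 1, 2]
```

The two results differ whenever a one-minute bucket holds more than one raw datapoint, as the first bucket does here.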
I think both kinds of sample count are valuable.
'Total Sample Count' gives the number of raw metric datapoints received within the window.
'Rolling Sample Count' gives the number of aggregated (1-minute) datapoints within the window.
Currently, Synaps provides 'Total Sample Count' for SampleCount.
That's why it uses the rolling sum function for SampleCount.