Graphite needs a scatter graph mode

Bug #886411 reported by Nicholas Leskiw
This bug affects 5 people
Affects: Graphite
Status: Won't Fix
Importance: Undecided
Assigned to: Nicholas Leskiw
Milestone: (none listed)

Bug Description

"The way one gets around this in gnuplot is to simply plot all data as dots rather than lines - normally I plot lines but sometimes I use dots. Then if you have a couple of outliers during the same interval you still see both of them. I totally understand what graphite is doing and most users are probably very satisfied with what it does - it sure looks good to me for showing trends in the data." -Mark Seger

IF GNU-PLOT CAN, SO CAN WE.

Changed in graphite:
assignee: nobody → Nicholas Leskiw (nleskiw)
Revision history for this message
Geoff Flarity (geoff-flarity) wrote :

This would be great, but there may be an easier way to make people happy.

The problem I have is that outliers don't get plotted. I'm assuming there's some quantization going on using averages? If we could specify max/min instead of average for graph quantization this would solve my issues. Though scatter graph would be awesome as well.

Mark Seger (mjseger) wrote :

When I first reported this problem to rrd, as this was the first place I saw this happening, it was recommended I do min/max. While at first thought this sounds like a reasonable solution, it's still not great. Consider an interval with 1 normal point and 99 spikes. You'd see 2 points, a min and a max. Now consider 99 normal points and 1 spike. You'd see the identical data. Now if you scatter everything, identical highs or lows would overlay each other, but experience has shown data points are not identical, and so the scatter plot allows you to see just how many outliers there are.
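A minimal sketch of the point above, using made-up values (10 for a normal reading, 500 for a spike; neither comes from the report). It shows that a min/max rollup cannot tell the two intervals apart, while the raw points can:

```python
def min_max(points):
    """Reduce one interval's raw points to the (min, max) pair a min/max rollup keeps."""
    return min(points), max(points)


# Hypothetical values chosen only for illustration.
NORMAL, SPIKE = 10, 500

mostly_spikes = [NORMAL] + [SPIKE] * 99   # 1 normal point, 99 spikes
mostly_normal = [NORMAL] * 99 + [SPIKE]   # 99 normal points, 1 spike

# Both intervals collapse to the same pair, so the rollup looks identical.
print(min_max(mostly_spikes))
print(min_max(mostly_normal))

# The raw points still distinguish them, which is what a scatter plot would show.
print(mostly_spikes.count(SPIKE), mostly_normal.count(SPIKE))
```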

btw - with gnuplot I still use connected plots and still see the outliers and normal data w/o the need to scatter them.

-mark

Nicholas Leskiw (nleskiw) wrote :

I actually got a scatter graph working, but it turns out that the data aggregation happens in the carbon API. So I was left with the same problem. This is a very non-trivial change, and I don't know when or even if it can be implemented. All kinds of functions, especially those that operate on two series at the same time, will break because the lists are different lengths. That's the real reason that the aggregation happens, so that all these neat math functions will work.

Example: If one metric is collected once a minute and another once every 10 seconds, you can't sum() them directly: the second has six data points for every one in the first during the same period.
Lots of functions assume that the data lists are the same length.
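The length mismatch can be sketched as follows. This is an illustration only, not Graphite's code: the helper `consolidate` and the sample values are hypothetical. The 10-second series is averaged six points at a time so that it lines up with the per-minute series before the two are added:

```python
def average(bucket):
    return sum(bucket) / len(bucket)


def consolidate(values, points_per_bucket, func):
    """Collapse each run of points_per_bucket values into one value using func."""
    return [
        func(values[i:i + points_per_bucket])
        for i in range(0, len(values), points_per_bucket)
    ]


# Two minutes of made-up data: one series per minute, one per 10 seconds.
per_minute = [60.0, 120.0]
per_10s = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]

# The raw lists have different lengths, so a point-by-point sum is undefined.
print(len(per_minute), len(per_10s))

# After averaging six points per bucket the lists line up and can be summed.
aligned = consolidate(per_10s, 6, average)
summed = [a + b for a, b in zip(per_minute, aligned)]
print(aligned, summed)
```

The cost is the one discussed in this report: once the six points are averaged, any outlier among them is no longer visible.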

I'm not sure how to handle this. My initial thought is that a new function must be added to carbon to give all the data over sans aggregation, and a new webapp function must be made that will prevent multiple series with different length lists from being used in functions together.

What does everyone think? This has been a pain point for many people and I'd like to do something to fix it.

Francois Mikus (fmikus) wrote :

Forward decaying priority sampling: apply a statistically representative algorithm with a recency bias.

This would be a method to represent the underlying trend that is visually represented by a scatted graph.

Francois Mikus (fmikus) wrote :

scatted = scatter

Michael Leinartas (mleinartas) wrote :

I'll note for those watching this that a way to get some of the behavior wanted here is to play with the minXStep parameter which is now documented here: http://readthedocs.org/docs/graphite/en/0.9.x/render_api.html#minxstep

If you reduce this to zero, *every* point will be drawn - the disadvantage is that they will be drawn so close together that the graph lines will be smooshed (for lack of a better word) together. This can be somewhat compensated for by reducing the lineWidth.
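For illustration, a render URL with these two parameters might be built like this. The host `graphite.example.com`, the target `foo.bar`, the `-1d` time range, and the lineWidth value of 0.5 are all placeholders, and `render_url` is a hypothetical helper:

```python
from urllib.parse import urlencode


def render_url(base, target, params):
    """Build a Graphite render URL from a base address, a target, and extra parameters."""
    query = urlencode({"target": target, **params})
    return f"{base}/render?{query}"


# Placeholder host and metric name; minXStep=0 asks for every point to be drawn,
# and a smaller lineWidth keeps the crowded lines somewhat readable.
url = render_url(
    "http://graphite.example.com",
    "foo.bar",
    {"from": "-1d", "minXStep": 0, "lineWidth": 0.5},
)
print(url)
```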

This doesn't solve the issue Nick refers to, though, which is caused by the behavior of Whisper: when there are multiple retentions, it returns only the first retention that completely covers the requested time range. That is, if you have 1-second data for a day and minutely data afterwards, a 1-week graph will only get the minutely data returned.
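A rough sketch of the retention selection described above, not Whisper's actual code. The function `select_archive` is hypothetical, and the 30-day length of the minutely archive is an assumption, since the comment only says "minutely afterwards":

```python
DAY = 86400  # seconds


def select_archive(archives, requested_span):
    """Return the first archive whose retention covers requested_span.

    archives is a list of (seconds_per_point, retention_seconds) pairs,
    ordered from highest to lowest precision.
    """
    for seconds_per_point, retention in archives:
        if retention >= requested_span:
            return seconds_per_point, retention
    # Nothing covers the whole span: fall back to the longest archive.
    return archives[-1]


# 1-second data for a day, then minutely data (30 days is an assumed length).
archives = [(1, DAY), (60, 30 * DAY)]

print(select_archive(archives, 3600))      # a 1-hour graph gets 1-second data
print(select_archive(archives, 7 * DAY))   # a 1-week graph gets only minutely data
```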

Nicholas Leskiw (nleskiw) wrote :

So there's an update.

cumulative() can show min and max now in the master branch.
It doesn't completely solve the issue, but if you apply cumulative(foo.bar,"max") and you have your storage-aggregation.conf set to max, you'll keep your peaks.

The min would work in the opposite direction as well.

Not a 100% fix, but better than nothing...
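To see why a max rollup keeps peaks that an average flattens, here is a small sketch. It is not Graphite's implementation: `rollup` is a hypothetical helper and the data is made up, with a single spike of 95 among readings of 5:

```python
def rollup(values, points_per_bucket, method):
    """Roll values up into buckets using 'average', 'max', or 'min'."""
    funcs = {
        "average": lambda bucket: sum(bucket) / len(bucket),
        "max": max,
        "min": min,
    }
    func = funcs[method]
    return [
        func(values[i:i + points_per_bucket])
        for i in range(0, len(values), points_per_bucket)
    ]


# Made-up series: twelve readings with one spike in the first bucket.
raw = [5, 5, 5, 5, 5, 95, 5, 5, 5, 5, 5, 5]

print(rollup(raw, 6, "average"))  # the spike is smeared across its bucket
print(rollup(raw, 6, "max"))      # the spike survives
print(rollup(raw, 6, "min"))      # dips would survive the same way
```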

Nicholas Leskiw (nleskiw) wrote :

Example:
Max, min, and avg for the same data, 14-day graph.

http://i.imgur.com/W264e.png

Francois Mikus (fmikus) wrote :

To get scatter-graph-type information, I will reiterate the suggestion from above:

Forward decaying priority sampling.
Example code here, git://gist.github.com/904980.git, can also be found in codahale/metrics.
It is a more representative method of rollup than the typical RRD (min, max, avg) and now Graphite (min, max, avg).
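As a rough idea of what such a sampler does, here is a minimal sketch written for this thread, not taken from the gist or from codahale/metrics. Each point gets a weight that grows exponentially with its timestamp, the weight is divided by a uniform random number to form a priority, and only the k highest-priority points are kept. The function name, the `alpha` decay rate, and the sample data are all assumptions:

```python
import heapq
import math
import random


def forward_decay_sample(points, k, alpha, landmark, rng):
    """Keep up to k (timestamp, value) points, biased toward recent timestamps.

    alpha controls how strongly recent points are favored; landmark is the
    reference time that weights are measured from.
    """
    heap = []  # min-heap of (priority, timestamp, value)
    for timestamp, value in points:
        weight = math.exp(alpha * (timestamp - landmark))
        u = 1.0 - rng.random()  # uniform in (0, 1], so never divides by zero
        item = (weight / u, timestamp, value)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item[0] > heap[0][0]:
            # Evict the lowest-priority point in favor of the new one.
            heapq.heapreplace(heap, item)
    return sorted((t, v) for _, t, v in heap)


# Made-up data: 1000 one-second points, rolled up to a 50-point sample.
points = [(t, float(t % 7)) for t in range(1000)]
sample = forward_decay_sample(points, k=50, alpha=0.01, landmark=0, rng=random.Random(42))
print(len(sample), sample[:3])
```

Unlike min, max, or avg, the sample keeps real data points, so a scatter plot of it still shows roughly how many outliers there were.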

Straight-Line-Interpolative aggregation is also a good way to be faithful to a time-series within +/- max_deviation of the actual value in rollup.

These are methods that can be done on time-series prior to sending to graphite or by graphite itself.

This of course does nothing for the alignment/averaging problem when displaying or doing math on different length series. But it does help in keeping the rolled-up data more accurate! Which eases the pain of mixing accurate and rolled up data.

Nice example Nicholas.

Michael Leinartas (mleinartas) wrote :
Changed in graphite:
status: New → Won't Fix