Y-values in graph are cut as graphed period gets longer

Bug #850475 reported by Chris D
This bug affects 3 people
Affects: Graphite
Status: Incomplete
Importance: Undecided
Assigned to: Unassigned

Bug Description

I have a counter, sent every 2 minutes. In this case, it's 371.

If I graph it at 15 hours or less, it graphs correctly.

At 20 hours, it graphs the 371 at around 190.

URL /render?from=-20hours&until=now&width=1024&height=768&yMax=500&areaMode=first&lineMode=staircase&hideLegend=true&target=blah

Images are attached.

Revision history for this message
Chris D (78luphr0rnk2nuqimstywepozxn9kl19tq-pjhhh3o25-a811i2i3ytqlsztthjth0svbccw8inm65t) wrote :

Forgot to mention, I checked with rawData=true and the underlying raw data are correct (as you'd expect from it graphing correctly at all).

chrismd (chrismd)
Changed in graphite:
status: New → Invalid
Revision history for this message
chrismd (chrismd) wrote :

This behavior is actually by design. In the 20-hour graph you're requesting more datapoints than there are pixels available in the graph; try the same request with a wider image and it'll go back to the values you expect. Whenever this happens (more datapoints than pixels), Graphite has to aggregate the datapoints. Graphite doesn't know how you want to aggregate them (yet; this is in the works), so by default it simply averages them. You can change this to summing by applying the cumulative() function to the expression, but that still leaves you unsure exactly how much aggregation has taken place. It stinks, but it's the best we can do for default behavior without knowing more about the user's intentions. To specify your desired aggregation behavior explicitly, use the summarize() function.

For some reason summarize() is missing from the docs despite having a proper docstring, so I'll look into that. In the meantime, the basic usage is this:

summarize(foo.bar.mymetric, "15m", "sum")

The first arg is obviously your metric expression.
The second arg is the size of the time period that each aggregate datapoint will cover, 15 minutes in this case.
The third arg is the aggregation function and can be: sum, avg, max, min, or last.
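To make the bucketing concrete, here is a minimal Python sketch of the idea behind summarize(). It is not Graphite's implementation: the function below, its (timestamp, value) input format, and the sample data are assumptions chosen for illustration only.

```python
def summarize(points, interval_seconds, func="sum"):
    """Group (timestamp, value) pairs into fixed-size time buckets
    and aggregate each bucket. Illustrative only, not Graphite's code."""
    aggregators = {
        "sum": sum,
        "avg": lambda vs: sum(vs) / len(vs),
        "max": max,
        "min": min,
        "last": lambda vs: vs[-1],
    }
    aggregate = aggregators[func]
    buckets = {}  # dicts keep insertion order, so buckets stay in time order
    for timestamp, value in points:
        if value is None:
            continue  # skip missing datapoints
        bucket_start = timestamp - (timestamp % interval_seconds)
        buckets.setdefault(bucket_start, []).append(value)
    return [(start, aggregate(values)) for start, values in buckets.items()]


# Hypothetical data: one datapoint every 2 minutes for 30 minutes, value 10 each.
points = [(t, 10) for t in range(0, 1800, 120)]
result = summarize(points, 15 * 60, "sum")
print(result)  # [(0, 80), (900, 70)]
```

The first 15-minute bucket holds eight datapoints and the second holds seven, so the sums differ even though every raw value is 10. With "avg" both buckets would come out as 10.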

I hope that helps.

Changed in graphite:
status: Invalid → Incomplete
Revision history for this message
Chris D (78luphr0rnk2nuqimstywepozxn9kl19tq-pjhhh3o25-a811i2i3ytqlsztthjth0svbccw8inm65t) wrote :

Thanks for the prompt response. Unfortunately, I'm more confused than before, for a couple reasons:

1) Why is Graphite doing anything with the values at all? I don't want to aggregate them, I just want to graph them: [t1, 371], [t2, 375], [t3, 410]. Connecting them with a pretty line would be nice, but at this point I'm confused why just doing a straight-up graph is a problem.

2) I tried summarize() as you suggested:

/render?from=-3days&until=now&width=1024&height=800&target=summarize(stats_counts.blah.worker.count,%2215m%22,%22sum%22)

Which produces this (strange since I only perceive 3 arguments in my summarize() call):

    TypeError: summarize() takes exactly 3 arguments (4 given)

If Graphite won't produce a straight up plot without munging the data, I feel like I'm missing something fundamental about what Graphite is doing. :-/ I'd appreciate any help in understanding what's going on.

Changed in graphite:
status: Incomplete → Opinion
status: Opinion → Incomplete
Revision history for this message
Nicholas Leskiw (nleskiw) wrote : Re: [Bug 850475] Re: Y-values in graph are cut as graphed period gets longer

Imagine if your graph was 50 pixels wide. Now imagine your data was
[1000,5,1000,5,1000,5...] and you were graphing the past 100 minutes.

What should the y-value be at pixel 1 of the graph? One horizontal pixel
must represent TWO datapoints (In this example 1000 and 5.)

Graphite averages them by default. You'd get 502.5 at that pixel.
If you want to add the values (1000+5 = 1005) then wrap it in the
cumulative() function. (Chris probably meant cumulative, I think...)

Adding the values doesn't make sense when you're tracking latency or
checking the free memory on a server.

Either way, summing or averaging by default, someone's unhappy.
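The example above can be sketched in a few lines of Python. This is only an illustration of the averaging-versus-summing trade-off, not Graphite's rendering code; the consolidate() helper and its parameters are made up for this sketch.

```python
def consolidate(values, pixels, func="average"):
    """Reduce `values` to at most `pixels` points by aggregating
    equal-sized groups of neighbouring datapoints."""
    if len(values) <= pixels:
        return list(values)  # enough room: plot the raw values
    per_pixel = -(-len(values) // pixels)  # ceiling division
    out = []
    for i in range(0, len(values), per_pixel):
        group = [v for v in values[i:i + per_pixel] if v is not None]
        if not group:
            out.append(None)
        elif func == "sum":
            out.append(sum(group))
        else:
            out.append(sum(group) / len(group))
    return out


data = [1000, 5] * 50  # 100 one-minute datapoints
averaged = consolidate(data, 50)        # 50 pixels wide
summed = consolidate(data, 50, "sum")
print(averaged[0], summed[0])  # 502.5 1005
```

Neither 502.5 nor 1005 appears anywhere in the raw data, which is why a graph that needs consolidation can show values the sender never reported.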

-Nick

On Thu, Sep 15, 2011 at 4:47 AM, Chris D <email address hidden> wrote:

> Thanks for the prompt response. Unfortunately, I'm more confused than
> before, for a couple reasons:
>
> 1) Why is Graphite doing anything with the values at all? I don't want
> to aggregate them, I just want to graph them: [t1, 371], [t2, 375], [t3,
> 410]. Connecting them with a pretty line would be nice, but at this
> point I'm confused why just doing a straight-up graph is a problem.
>
> 2) I tried summarize() as you suggested:
>
>
> /render?from=-3days&until=now&width=1024&height=800&target=summarize(stats_counts.blah.worker.count,%2215m%22,%22sum%22)
>
> Which produces this (strange since I only perceive 3 arguments in my
> summarize() call):
>
> TypeError: summarize() takes exactly 3 arguments (4 given)
>
> If Graphite won't produce a straight up plot without munging the data, I
> feel like I'm missing something fundamental about what Graphite is
> doing. :-/ I'd appreciate any help in understanding what's going on.

Revision history for this message
Chris D (78luphr0rnk2nuqimstywepozxn9kl19tq-pjhhh3o25-a811i2i3ytqlsztthjth0svbccw8inm65t) wrote :

Huh. I guess I can see that. Thanks.

However, both summarize() and cumulative() produce a graph that says "No Data", and they're not mentioned in the URL API doc. Is there a more modern source of documentation than http://graphite.wikidot.com/url-api-reference (last edited in 2008)?

If not, I'll probably spend some time today updating the documentation based on the features I find in the Dashboard composer.

Revision history for this message
Nicholas Leskiw (nleskiw) wrote :

I'm sorry, those two functions may only be in trunk. It's relatively easy to
install from trunk. Get the source tree with Bazaar, and run a few python
install scripts:

bzr co lp:graphite
cd graphite
./check-dependancies.py
cd ./whisper
sudo python ./setup.py install --force
cd ../carbon/
python ./setup.py install --force # no sudo necessary
cd ../
python ./setup.py install
cd /opt/graphite/webapp/graphite/
python ./manage.py syncdb

Also, the docs page has changed. It's now
http://readthedocs.org/docs/graphite/en/1.0/

The URL API and the data functions have been updated (I think that summarize
is still missing from the docs, we're working on it...)

On Thu, Sep 15, 2011 at 12:26 PM, Chris D <email address hidden> wrote:

> Huh. I guess I can see that. Thanks.
>
> However, both summarize() and cumulative() produce a graph that says "No
> Data", and they're not mentioned in the URL API doc. Is there a more
> modern source of documentation than http://graphite.wikidot.com/url-api-
> reference (last edited in 2008)?
>
> If not, I'll probably spend some time today updating the documentation
> based on the features I find in the Dashboard composer.

Revision history for this message
chrismd (chrismd) wrote :

Yeah, Nick is right: I was referencing the improved summarize() function, which is on trunk and will soon be released with 0.9.9. There is something wrong with the default graphite.readthedocs.org page, so for now continue to use the 1.0 link Nick gave; I'll see if I can figure out why 'latest' is wrong.

Revision history for this message
Dieter P (dieter-plaetinck) wrote :

As an alternative aggregating style, consider this:
Say you have 3 datapoints with values 100,50,500, and all these must be represented on one pixel.
Suppose the background is black (RGB 000) and the graph is white (RGB FFF). If you want a graph that is filled (entirely colored from 0 until the value), you could draw a vertical line (1 pixel wide) from 0-50 in FFF, a line from 50-100 in "white divided by two" (BBB) and a line from 100 to 500 in "white divided by three" (555).
If you want a standard looking graph (a plot of 1 pixel), you could just put one dot at 500, in color 555.

This way you have a visual cue that there is a high spike: you can see the highest achieved value, and the amount of fade on the color suggests how strong the spike is. This seems very useful to me.

Only catch is, if you have many different graphs on one plot, it may be confusing to see which points belong to which graph. But that's nothing new.

Revision history for this message
Dieter P (dieter-plaetinck) wrote :

Correction: when I said "divided by two" I meant "two thirds of".
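The fade scheme described above, with the two-thirds correction applied, could be sketched roughly as follows. This is a hypothetical illustration and nothing Graphite implements; the function name and the single-hex-digit color model are assumptions. Note that rounding two thirds of F gives A, so the middle band comes out as AAA rather than the BBB quoted earlier.

```python
def fade_segments(values, foreground=0xF):
    """For one pixel column, return (low, high, color) bands.
    Each band is shaded by the fraction of datapoints that reach it.
    Assumes positive values and a black background (000)."""
    total = len(values)
    segments = []
    low = 0
    for high in sorted(set(values)):
        # How many datapoints are at least this tall?
        covering = sum(1 for v in values if v >= high)
        level = round(foreground * covering / total)
        segments.append((low, high, format(level, "X") * 3))
        low = high
    return segments


segments = fade_segments([100, 50, 500])
print(segments)  # [(0, 50, 'FFF'), (50, 100, 'AAA'), (100, 500, '555')]
```

For the unfilled variant, only the last band matters: a single dot at 500 in color 555.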

Revision history for this message
Nicholas Leskiw (nleskiw) wrote :

I looked into this, and unfortunately it's not that simple. When you call
whisper, the data returned is already aggregated. This is so you can
perform math on many series at once (they must have the same interval for,
say, average() to work right)

-Nick

On Tue, Apr 10, 2012 at 6:00 AM, Dieter P <email address hidden> wrote:

> correction: when i said "divided by two" I meant "two thirds of"
