Aggregation across metrics behavior for autoscaling

Bug #1522434 reported by Gaël Lambert
Affects      Status        Importance  Assigned to      Milestone
Gnocchi      Fix Released  Medium      Mehdi Abaakouk
1.3          Fix Released  Medium      Unassigned

Bug Description

Aggregation across metrics behavior for autoscaling.

I tried this scenario:
As a cloud administrator I have 2 VMs (A and B).

A is running and sends regular data about the number of customers connected to Apache. The autoscaling spawns a second VM, B.

Data timeline:
 A : 1 1 1 1
 B : . . 10 10

Then I have an issue on VM A, so I kill that VM and leave B running:
 A : 1 1 1 1 . .
 B : . . 10 10 10 10

Now I want to report the number of customers connected over this period of time as the sum of both VMs.
I use aggregation with sum and expect something like:
  1 1 11 11 10 10

Let's try (I include all the curl calls so you can tell me if I missed something).

Create archive_policy

    curl -i -H "Accept: application/json" -H "Content-Type: application/json" -X POST -d '
    {
      "back_window": 0,
      "definition": [
        {
          "granularity": "1s",
          "timespan": "45 day"
        }
      ],
      "name": "low"
    }' http://localhost:8041/v1/archive_policy

Add policy rule
    curl -i -H "Accept: application/json" -H "Content-Type: application/json" -X POST -d '
    {
      "archive_policy_name": "low",
      "metric_pattern": "*",
      "name": "default_rule"
    }' http://localhost:8041/v1/archive_policy_rule

Create 2 metrics
    curl -i -H "Accept: application/json" -H "Content-Type: application/json" -X POST -d '{"name": "metric1"}' http://localhost:8041/v1/metric
    curl -i -H "Accept: application/json" -H "Content-Type: application/json" -X POST -d '{"name": "metric2"}' http://localhost:8041/v1/metric

    export metric1=75b615fc-3586-468e-999e-55afed661af2
    export metric2=1c629767-d298-46d3-b6ff-565c6b1c5125
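
For reference, the exported UUIDs come from the "id" field returned by each POST above; they can also be looked up afterwards by listing the metrics (a minimal sketch):

    curl -H "Accept: application/json" http://localhost:8041/v1/metric | json_pp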

Display my server date:

 date
 Thu Dec 3 14:13:48 UTC 2015

Send the data into each metric, matching the timeline:
 A : 1 1  1  1  .  .
 B : . . 10 10 10 10

 curl -i -H "Content-Type: application/json" -X POST -d '
 [
   {
  "timestamp": "2015-12-03T13:19:15+0100",
  "value": 1
   },
   {
  "timestamp": "2015-12-03T13:20:15+0100",
  "value": 1
   },
   {
  "timestamp": "2015-12-03T13:21:15+0100",
  "value": 1
   },
   {
  "timestamp": "2015-12-03T13:22:15+0100",
  "value": 1
   }
 ]' http://localhost:8041/v1/metric/$metric1/measures

 curl -i -H "Content-Type: application/json" -X POST -d '
 [
   {
  "timestamp": "2015-12-03T13:21:15+0100",
  "value": 10
   },
   {
  "timestamp": "2015-12-03T13:22:15+0100",
  "value": 10
   },
   {
  "timestamp": "2015-12-03T13:23:15+0100",
  "value": 10
   },
   {
  "timestamp": "2015-12-03T13:24:15+0100",
  "value": 10
   }
 ]' http://localhost:8041/v1/metric/$metric2/measures

Get aggregate sum

 curl -H "Accept: application/json" -H "Content-Type: application/json" "http://localhost:8041/v1/aggregation/metric?metric=$metric1&metric=$metric2&start=2015-11-27T17:00&aggregation=sum" | json_pp
 [
    [
    "2015-12-03T12:19:15+00:00",
    1,
    1
    ],
    [
    "2015-12-03T12:20:15+00:00",
    1,
    1
    ],
    [
    "2015-12-03T12:21:15+00:00",
    1,
    11
    ],
    [
    "2015-12-03T12:22:15+00:00",
    1,
    11
    ]
]

I was expecting
  1 1 11 11 10 10
and I obtain only
  1 1 11 11

I am not sure this is the expected behavior of the aggregation.
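
To double-check what was stored, the raw points of each metric can also be fetched individually (a sketch using the IDs exported above):

    curl -H "Accept: application/json" "http://localhost:8041/v1/metric/$metric1/measures" | json_pp
    curl -H "Accept: application/json" "http://localhost:8041/v1/metric/$metric2/measures" | json_pp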

Julien Danjou (jdanjou)
Changed in gnocchi:
status: New → Triaged
importance: Undecided → Medium
Mehdi Abaakouk (sileht)
Changed in gnocchi:
assignee: nobody → Mehdi Abaakouk (sileht)
Revision history for this message
Gaël Lambert (gael-lambert) wrote :

Update with another archive_policy:

 curl -i -H "Accept: application/json" -H "Content-Type: application/json" -X POST -d '
  {
    "back_window": 0,
    "definition": [
      {
        "granularity": "1s",
        "timespan": "1 hour"
      },
      {
        "points": 48,
        "timespan": "1 day"
      }
    ],
    "name": "low2"
  }' http://localhost:8041/v1/archive_policy

The result of the aggregation:
 curl -H "Accept: application/json" -H "Content-Type: application/json" "http://localhost:8041/v1/aggregation/metric?metric=$metric1&metric=$metric2&start=2015-11-27T17:00&aggregation=sum" | json_pp
 [
    [
    "2015-12-03T12:00:00+00:00",
    1800,
    44
    ],
    [
    "2015-12-03T12:19:15+00:00",
    1,
    1
    ],
    [
    "2015-12-03T12:20:15+00:00",
    1,
    1
    ],
    [
    "2015-12-03T12:21:15+00:00",
    1,
    11
    ],
    [
    "2015-12-03T12:22:15+00:00",
    1,
    11
    ]
 ]

    [
    "2015-12-03T12:00:00+00:00",
    1800,
    44
    ],

I don't know why, but this first entry was not displayed with the other policy.
It contains the sum of all points (44):
  1 + 1 + 11 + 11 + 10 + 10 = 44
whereas the displayed per-second output still only covers
  1 + 1 + 11 + 11 = 24

Revision history for this message
Mehdi Abaakouk (sileht) wrote :

Yes, we have an issue, but the expected behavior of your request should be HTTP 400, because some points
are missing from one of the timeseries, so they cannot all be aggregated.

For your particular case you should use the "start" and "stop" boundaries to tell Gnocchi that you want all the points in between, and also set needed_overlap=0 to allow Gnocchi to compute the aggregation even if some points are missing.

I will add a bit of documentation about that.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to gnocchi (master)

Fix proposed to branch: master
Review: https://review.openstack.org/253035

Changed in gnocchi:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to gnocchi (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/253051

Revision history for this message
Gaël Lambert (gael-lambert) wrote :

Cool, it seems good with **stop** and **needed_overlap**. Adding this example to the doc would be great.
 curl -H "Accept: application/json" -H "Content-Type: application/json" "http://localhost:8041/v1/aggregation/metric?metric=$metric1&metric=$metric2&start=2015-12-03T12:19:15&stop=2015-12-03T12:25:15&needed_overlap=0&aggregation=sum" | json_pp
 [
    [
    "2015-12-03T12:19:15+00:00",
    1,
    1
    ],
    [
    "2015-12-03T12:20:15+00:00",
    1,
    1
    ],
    [
    "2015-12-03T12:21:15+00:00",
    1,
    11
    ],
    [
    "2015-12-03T12:22:15+00:00",
    1,
    11
    ],
    [
    "2015-12-03T12:23:15+00:00",
    1,
    10
    ],
    [
    "2015-12-03T12:24:15+00:00",
    1,
    10
    ]
 ]

I have just one more thing about the aggregation. I send data with my timezone, "2015-12-03T13:19:15+0100". In Gnocchi this point is converted to UTC, "2015-12-03T12:19:15+00:00".

But for the start and stop flags of the aggregation, to get my value added at 13:19:15+0100, I need to query start=12:19:15.

It's not a big deal, but since I added values with a timezone using Gnocchi, I expected to be able to query the aggregation range with a timezone as well (a possible workaround is sketched at the end of this comment).

I tried to give the timezone with the + encoded as %2B:
 curl -H "Accept: application/json" -H "Content-Type: application/json" "http://localhost:8041/v1/aggregation/metric?metric=$metric1&metric=$metric2&start=2015-12-03T13:19:15%2B0100&stop=2015-12-03T13:30:15%2B0100&needed_overlap=0&aggregation=sum" | json_pp
 []

And keeping the + character as-is:
 curl -H "Accept: application/json" -H "Content-Type: application/json" "http://localhost:8041/v1/aggregation/metric?metric=$metric1&metric=$metric2&start=2015-12-03T13:19:15+0100&stop=2015-12-03T13:30:15+0100&needed_overlap=0&aggregation=sum" | json_pp

   File "/share/gnocchi/gnocchi/carbonara.py", line 388, in aggregated
  timeserie_raw = timeserie.fetch(from_timestamp, to_timestamp)
   File "/share/gnocchi/gnocchi/carbonara.py", line 344, in fetch
  points = self[from_timestamp:to_timestamp]
   File "/share/gnocchi/gnocchi/carbonara.py", line 97, in __getitem__
  return self.ts[key]
   File "/usr/local/lib/python2.7/dist-packages/pandas/core/series.py", line 597, in __getitem__
  return self._get_with(key)
   File "/usr/local/lib/python2.7/dist-packages/pandas/core/series.py", line 602, in _get_with
  indexer = self.index._convert_slice_indexer(key, kind='getitem')
   File "/usr/local/lib/python2.7/dist-packages/pandas/core/index.py", line 950, in _convert_slice_indexer
  key, is_index_slice=is_index_slice)
   File "/usr/local/lib/python2.7/dist-packages/pandas/core/index.py", line 885, in _convert_slice_indexer_getitem
  return self._convert_slice_indexer(key)
   File "/usr/local/lib/python2.7/dist-packages/pandas/core/index.py", line 972, in _convert_slice_indexer
  indexer = self.slice_indexer(start, stop, step)
   File "/usr/local/lib/python2.7/dist-packages/pandas/tseries/index.py", line 1436, in slice_indexer
  return Index.slice_indexer(self, start, end, step)
   File "/usr/local/lib/python2.7/dist-packages/pandas/core/index.py", line 2570, in slice_indexer
  start_slice, end_slice = self.slice_locs(start, end, step=step, kind=kind)
   File "/usr/local/lib/python2.7/dist-packag...

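A possible workaround, assuming GNU date is available, is to convert the local timestamp to UTC before building the query, for example:

 date -u -d "2015-12-03 13:19:15 +0100" "+%Y-%m-%dT%H:%M:%S"
 2015-12-03T12:19:15

and then use the converted value for the start and stop parameters.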

Revision history for this message
Mehdi Abaakouk (sileht) wrote :

Yep, can you open another ticket for the %2B and + issue?

Revision history for this message
Gaël Lambert (gael-lambert) wrote :

#1523549

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to gnocchi (master)

Reviewed: https://review.openstack.org/252983
Committed: https://git.openstack.org/cgit/openstack/gnocchi/commit/?id=ff6e7f5433878c417d4e06ce1cf07b4864e6559e
Submitter: Jenkins
Branch: master

commit ff6e7f5433878c417d4e06ce1cf07b4864e6559e
Author: Mehdi Abaakouk <email address hidden>
Date: Thu Dec 3 15:45:48 2015 +0100

    Adds aggregation across metrics tests

    Related-bug: #1522434
    Change-Id: I78b9712aabc31c26f39a5c9cc13e58c21a08942a

Changed in gnocchi:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to gnocchi (master)

Reviewed: https://review.openstack.org/253035
Committed: https://git.openstack.org/cgit/openstack/gnocchi/commit/?id=444656dcb2dbdb9d3d8fc5c4c83cecf687fe5a13
Submitter: Jenkins
Branch: master

commit 444656dcb2dbdb9d3d8fc5c4c83cecf687fe5a13
Author: Mehdi Abaakouk <email address hidden>
Date: Thu Dec 3 16:38:32 2015 +0100

    Checks percent_of_overlap when one boundary is set

    Aggregation across metrics has a different behavior depending
    on whether boundaries are set and whether needed_percent_of_overlap is set.

    If boundaries are not set, Carbonara makes the aggregation only with points
    whose timestamps are present in all timeseries.

    But when boundaries are set, Carbonara expects a certain percentage of
    timestamps to be common between timeseries; this percentage is controlled
    by needed_percent_of_overlap (defaulting to 100%).

    This change fixes a weird behavior: when only one boundary was set,
    needed_percent_of_overlap wasn't checked.

    Change-Id: Ifc85a9004e864d14d42fd482ec144e4d27dd615b
    Closes-bug: #1522434
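
As a usage note, the two modes described above correspond to queries of the following shape (a sketch reusing the metric IDs exported in the original report):

 # Without boundaries: only timestamps present in every timeseries are aggregated
 curl -H "Accept: application/json" "http://localhost:8041/v1/aggregation/metric?metric=$metric1&metric=$metric2&aggregation=sum"

 # With boundaries: the overlap check applies; needed_overlap=0 relaxes it so the
 # aggregation is computed even if some points are missing
 curl -H "Accept: application/json" "http://localhost:8041/v1/aggregation/metric?metric=$metric1&metric=$metric2&start=2015-12-03T12:19:15&stop=2015-12-03T12:25:15&needed_overlap=0&aggregation=sum"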

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to gnocchi (master)

Reviewed: https://review.openstack.org/253036
Committed: https://git.openstack.org/cgit/openstack/gnocchi/commit/?id=db6a9bd0a3dd99e2d65f2982b421bd4a49fe8190
Submitter: Jenkins
Branch: master

commit db6a9bd0a3dd99e2d65f2982b421bd4a49fe8190
Author: Mehdi Abaakouk <email address hidden>
Date: Thu Dec 3 16:57:10 2015 +0100

    Adds some docs about aggregation across metrics

    Related-bug: #1522434
    Change-Id: I8ae9903aed31641a89f8f247bb777755a8f15d42

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to gnocchi (stable/1.3)

Related fix proposed to branch: stable/1.3
Review: https://review.openstack.org/255763

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to gnocchi (stable/1.3)

Fix proposed to branch: stable/1.3
Review: https://review.openstack.org/255764

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to gnocchi (stable/1.3)

Reviewed: https://review.openstack.org/255763
Committed: https://git.openstack.org/cgit/openstack/gnocchi/commit/?id=69387777d907629c3853123643d9460794d3fd1a
Submitter: Jenkins
Branch: stable/1.3

commit 69387777d907629c3853123643d9460794d3fd1a
Author: Mehdi Abaakouk <email address hidden>
Date: Thu Dec 3 15:45:48 2015 +0100

    Adds aggregation across metrics tests

    Related-bug: #1522434
    Change-Id: I78b9712aabc31c26f39a5c9cc13e58c21a08942a
    (cherry picked from commit ff6e7f5433878c417d4e06ce1cf07b4864e6559e)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to gnocchi (stable/1.3)

Reviewed: https://review.openstack.org/255764
Committed: https://git.openstack.org/cgit/openstack/gnocchi/commit/?id=abf9ce3c201fa16b6bd96b226ac78ec77ff836ec
Submitter: Jenkins
Branch: stable/1.3

commit abf9ce3c201fa16b6bd96b226ac78ec77ff836ec
Author: Mehdi Abaakouk <email address hidden>
Date: Thu Dec 3 16:38:32 2015 +0100

    Checks percent_of_overlap when one boundary is set

    Aggregation across metrics has a different behavior depending
    on whether boundaries are set and whether needed_percent_of_overlap is set.

    If boundaries are not set, Carbonara makes the aggregation only with points
    whose timestamps are present in all timeseries.

    But when boundaries are set, Carbonara expects a certain percentage of
    timestamps to be common between timeseries; this percentage is controlled
    by needed_percent_of_overlap (defaulting to 100%).

    This change fixes a weird behavior: when only one boundary was set,
    needed_percent_of_overlap wasn't checked.

    Change-Id: Ifc85a9004e864d14d42fd482ec144e4d27dd615b
    Closes-bug: #1522434
    (cherry picked from commit 444656dcb2dbdb9d3d8fc5c4c83cecf687fe5a13)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to gnocchi (master)

Reviewed: https://review.openstack.org/253051
Committed: https://git.openstack.org/cgit/openstack/gnocchi/commit/?id=cac25312a9b6aa9ff1aa1bf52fbf65256f800696
Submitter: Jenkins
Branch: master

commit cac25312a9b6aa9ff1aa1bf52fbf65256f800696
Author: Mehdi Abaakouk <email address hidden>
Date: Thu Dec 3 17:11:17 2015 +0100

    Correlate correctly the needed_overlap

    When only one boundary is set, needed_overlap should not
    take into account points beyond the other boundary, because they
    will be dropped.

    Change-Id: Ifc76c58c5c1e22551c73809e01fcabfdfb1af3cf
    Related-bug: #1522434

Julien Danjou (jdanjou)
Changed in gnocchi:
milestone: none → 2.0.0
status: Fix Committed → Fix Released