aggregation: issue when one of the resource is no more updated
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Aodh | Fix Released | Medium | Mehdi Abaakouk | |
Gnocchi | Triaged | Medium | Mehdi Abaakouk | |
Bug Description
Hi,
When across-metric aggregation is used by heat and aodh, the following happens:
* heat creates an autoscaling group with 3 servers
* gnocchi gets 3 new resources for these servers, with their measurements
* then the autoscaling logic decides to remove one server
* the metrics of one of the gnocchi resources are no longer updated with new measurements
* the gnocchi "across metric aggregation" endpoint always returns:
<h1>400 Bad Request</h1>
The server could not comply with the request since it is either malformed or otherwise incorrect.<br /><br />
One of the metric to aggregated doesn't have matching granularity
The problem is that we can't really know why the measurements stopped coming.
Perhaps it is just a temporary issue and there is nothing to do; perhaps the resource no longer exists outside of
gnocchi and will never be updated again.
Cheers,
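To make the failure mode concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not gnocchi's actual code: it assumes the aggregation refuses to run when the series share too few timestamps, and the names `overlap_percent`, `aggregate_mean` and `required_overlap` are hypothetical.

```python
def overlap_percent(series):
    """Percentage of all observed timestamps that appear in every series."""
    timestamp_sets = [set(points) for points in series]
    if not timestamp_sets:
        return 100.0
    union = set.union(*timestamp_sets)
    if not union:
        return 100.0
    common = set.intersection(*timestamp_sets)
    return 100.0 * len(common) / len(union)


def aggregate_mean(series, required_overlap=100.0):
    """Average the values per timestamp across series (toy model).

    Raises ValueError when the series overlap less than required_overlap.
    """
    actual = overlap_percent(series)
    if actual < required_overlap:
        raise ValueError(
            "only %.0f%% of timestamps overlap, %.0f%% required"
            % (actual, required_overlap)
        )
    merged = {}
    for points in series:
        for timestamp, value in points.items():
            merged.setdefault(timestamp, []).append(value)
    return {ts: sum(vals) / len(vals) for ts, vals in sorted(merged.items())}


# Illustrative data: {timestamp in seconds: value}. The third server is
# removed by autoscaling after t=60, so its series stops being updated.
server_1 = {0: 10.0, 60: 20.0, 120: 30.0, 180: 40.0}
server_2 = {0: 20.0, 60: 20.0, 120: 10.0, 180: 20.0}
server_3 = {0: 30.0, 60: 20.0}
all_series = [server_1, server_2, server_3]

try:
    aggregate_mean(all_series)  # strict default: every timestamp must match
    strict_failed = False
except ValueError:
    strict_failed = True

# Relaxing the requirement lets the aggregation use whatever points exist.
relaxed = aggregate_mean(all_series, required_overlap=0.0)
```

In this model the strict call fails as soon as one series goes stale, while the relaxed call still returns a mean computed from the remaining servers.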
Changed in gnocchi: | |
assignee: | nobody → Mehdi Abaakouk (sileht) |
Changed in aodh: | |
status: | New → Triaged |
Changed in gnocchi: | |
status: | New → Triaged |
Changed in aodh: | |
importance: | Undecided → Medium |
Changed in aodh: | |
status: | Triaged → Fix Released |
A smart solution would be to set "ended_at" on the resource when the instance is deleted, and to filter out instances with this flag set, so that the aggregation result will always be correct.
But we don't have such a thing in the ceilometer dispatcher yet, so the proposed solution will be:
* Set 'percent_of_overlap=0' temporarily in aodh for gnocchi alarms against "aggregated metrics across resources".
* Implement the "ended_at" thing for the future
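The "ended_at" idea could look roughly like the following. This is a minimal sketch that assumes each resource is a plain dict with an optional "ended_at" timestamp; the record layout and the `active_resources` helper are hypothetical and are not taken from gnocchi or the ceilometer dispatcher.

```python
from datetime import datetime, timezone


def active_resources(resources):
    """Keep only resources whose 'ended_at' field is unset (None or missing)."""
    return [r for r in resources if r.get("ended_at") is None]


# Hypothetical resource records; the ids and the date are illustrative only.
servers = [
    {"id": "server-1", "ended_at": None},
    {"id": "server-2", "ended_at": datetime(2015, 6, 1, tzinfo=timezone.utc)},
    {"id": "server-3", "ended_at": None},
]

# Only metrics of these resources would be passed to the aggregation.
ids_to_aggregate = [r["id"] for r in active_resources(servers)]
```

With such a filter, a deleted instance drops out of the aggregation explicitly instead of being guessed at from missing measurements.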