Efficient implementation of cumulative counter
Bug #1061817 reported by
Julien Danjou
This bug affects 6 people
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Ceilometer |
Won't Fix
|
High
|
Unassigned |
Bug Description
Cumulative counters can have values going back to 0, like:
0, 10, 20, 30, 0, 20, 30
which translates to:
0, 10, 20, 30, 30, 50, 60
Requesting a sum on this looks inefficient at first sight because we need to through every event to check that none of the value went back to 0.
We need to find an efficient way to manipulate this counters in every backend, when at least getting max() for example.
Changed in ceilometer: | |
status: | New → Confirmed |
importance: | Undecided → High |
tags: | added: effort-xl |
Changed in ceilometer: | |
assignee: | nobody → Anirudh Vedantam (anirudh-vedantam) |
information type: | Public → Public Security |
information type: | Public Security → Private Security |
information type: | Private Security → Public |
information type: | Public → Public Security |
information type: | Public Security → Private Security |
information type: | Private Security → Public |
Changed in ceilometer: | |
assignee: | Anirudh Vedantam (anirudh-vedantam) → nobody |
status: | Confirmed → Triaged |
Changed in ceilometer: | |
status: | Triaged → Incomplete |
assignee: | nobody → 董浩 (photodh) |
Changed in ceilometer: | |
assignee: | 董浩 (photodh) → nobody |
Changed in ceilometer: | |
status: | Incomplete → Triaged |
To post a comment you must log in.
I've submitted the problem to a panel of SQL experts and they provided 2 implementations of a MAX() function for such a counter.
Simple
=====
This is probably the most portable approach, but it requires a double scan of the table, so it's not that efficient. It has been tested on MySQL.
SELECT SUM(IF( COALESCE( (SELECT counter_volume FROM meter M2 WHERE M2.timestamp> M1.timestamp counter_ volume, 0)) FROM meter M1 ORDER BY timestamp
ORDER BY M2.TIMESTAMP ASC LIMIT 1),0)<,
ASC
Windowed
========
This uses window function and is more efficient, however I'm not sure it's available everywhere. It has been tested on PostgreSQL.
with t(tops) as ( volume) over w < counter_volume then counter_volume volume) over w is null then counter_volume
select case when lead(counter_
when lead(counter_
else null
end as max
from meter
window w as (order by timestamp)
)
select sum(tops) from t;
(t returns the list of the higher values above all, and then we sum them to get max)
I think this is a good proof that this is definitely doable. For such counter, we likely want MAX() - FIRST() (to get the actual number of bytes sent via a NIC, for example ) and this is doable in an SQL query.