Polling interval behaving differently between Juno and Liberty

Bug #1570967 reported by rezroo
This bug affects 1 person
Affects: Ceilometer
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned

Bug Description

I have been using ceilometer in devstack for a while. What I used to be able to do is modify pipeline.yaml to set the polling interval for "name: cpu_source" to 60, and then publish via udp. The diff of the changes against the Liberty pipeline.yaml is below:

    stack@vlab:/etc/ceilometer$ diff pipeline.yaml.bak pipeline.yaml
    10c10
    < interval: 600
    ---
    > interval: 60
    53c53
    < - notifier://
    ---
    > - udp://127.0.0.1:4952
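
For context, in the stock Liberty pipeline.yaml the cpu samples are turned into cpu_util by a rate_of_change transformer in cpu_sink; with the udp publisher swapped in, that sink looks roughly like this (reconstructed from the default file, so line 53 may belong to a different sink in your copy):

    - name: cpu_sink
      transformers:
          - name: "rate_of_change"
            parameters:
                target:
                    name: "cpu_util"
                    unit: "%"
                    type: "gauge"
                    scale: "100.0 / (10**9 * (resource_metadata.cpu_number or 1))"
      publishers:
          - udp://127.0.0.1:4952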

The desired and net effect of these changes in Juno was that I would get cpu_util samples every minute, but cpu meters every 10 minutes.

These same changes in Liberty do not have the same effect: I get cpu, cpu_util, and cpu.delta samples every minute. Why are cpu samples arriving every minute?

    stack@lab:~$ ceilometer sample-list -m cpu | head
    +--------------------------------------+------+------------+------------+------+----------------------------+
    | Resource ID                          | Name | Type       | Volume     | Unit | Timestamp                  |
    +--------------------------------------+------+------------+------------+------+----------------------------+
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu  | cumulative | 1.0746e+11 | ns   | 2016-04-08T13:48:17.380281 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu  | cumulative | 1.0739e+11 | ns   | 2016-04-08T13:47:17.357667 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu  | cumulative | 1.073e+11  | ns   | 2016-04-08T13:46:17.414188 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu  | cumulative | 1.0723e+11 | ns   | 2016-04-08T13:45:17.356869 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu  | cumulative | 1.0715e+11 | ns   | 2016-04-08T13:44:17.357771 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu  | cumulative | 1.0707e+11 | ns   | 2016-04-08T13:43:17.349820 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu  | cumulative | 1.0699e+11 | ns   | 2016-04-08T13:42:17.348110 |

    stack@lab:~$ ceilometer sample-list -m cpu.delta | head
    +--------------------------------------+-----------+-------+-------------+------+----------------------------+
    | Resource ID                          | Name      | Type  | Volume      | Unit | Timestamp                  |
    +--------------------------------------+-----------+-------+-------------+------+----------------------------+
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu.delta | delta | 70000000.0  | ns   | 2016-04-08T13:48:17.380281 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu.delta | delta | 90000000.0  | ns   | 2016-04-08T13:47:17.357667 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu.delta | delta | 70000000.0  | ns   | 2016-04-08T13:46:17.414188 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu.delta | delta | 80000000.0  | ns   | 2016-04-08T13:45:17.356869 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu.delta | delta | 80000000.0  | ns   | 2016-04-08T13:44:17.357771 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu.delta | delta | 80000000.0  | ns   | 2016-04-08T13:43:17.349820 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu.delta | delta | 80000000.0  | ns   | 2016-04-08T13:42:17.348110 |

    stack@lab:~$ ceilometer sample-list -m cpu_util | head
    +--------------------------------------+----------+-------+----------------+------+----------------------------+
    | Resource ID                          | Name     | Type  | Volume         | Unit | Timestamp                  |
    +--------------------------------------+----------+-------+----------------+------+----------------------------+
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu_util | gauge | 0.116622711567 | %    | 2016-04-08T13:48:17.380281 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu_util | gauge | 0.150141435735 | %    | 2016-04-08T13:47:17.357667 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu_util | gauge | 0.116555319427 | %    | 2016-04-08T13:46:17.414188 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu_util | gauge | 0.133335337808 | %    | 2016-04-08T13:45:17.356869 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu_util | gauge | 0.133315666786 | %    | 2016-04-08T13:44:17.357771 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu_util | gauge | 0.133329533442 | %    | 2016-04-08T13:43:17.349820 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu_util | gauge | 0.133329311232 | %    | 2016-04-08T13:42:17.348110 |
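
As a sanity check, the cpu_util values above are exactly what a rate-of-change calculation over consecutive cpu samples yields, assuming a single-vCPU instance (which these numbers imply). A quick verification in Python:

    # Derive the 13:48:17 cpu_util value from the two newest cpu samples above.
    delta_ns = 1.0746e11 - 1.0739e11              # 7.0e7 ns of CPU time consumed
    wall_s = 60 + (17.380281 - 17.357667)         # ~60.02 s between the samples
    cpu_util = 100.0 * delta_ns / (wall_s * 1e9)  # percent of one vCPU
    print(cpu_util)                               # ~0.11662, matching 0.116622711567 %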

Revision history for this message
rezroo (r3za) wrote :

The response from the ceilometer team is below:

Thank you very much for this question. Your understanding is correct. In Liberty we did a refactoring in Ceilometer that changed the behaviour in the way you've described. Let me try to explain in detail.

Each cpu_util sample is derived from a cpu sample. In Juno, this transformation took place in the Compute agent. In Liberty, we started doing it in the Notification agent, and that is the root cause of the issue. Consider the following pipeline.yaml file:
    sources:
        - name: meter_source
          interval: 600
          meters:
              - "*"
          sinks:
              - meter_sink
        - name: cpu_source
          interval: 60        # <-- your change
          meters:
              - "cpu"
          sinks:
              - cpu_sink
              - cpu_delta_sink

It's written here that all available pollsters (see the ceilometer.poll.compute section of setup.cfg) should run every 10 minutes, with "meter_sink" applied to the results. In addition, every 60 seconds you want to run _only_ the cpu pollster (ceilometer.compute.pollsters.cpu:CPUPollster) and transform its results as described in cpu_sink and cpu_delta_sink. In Juno, all of this happened in the Compute agent.
    In Liberty the situation changed. In the example above, all pollsters run once every 600 seconds and the results are sent to RabbitMQ (to a special queue, "notification.sample"). In addition, every 60 seconds only the CPUPollster runs, and its results are sent to the queue as well. A notification agent then reads these messages and applies the second "part" of pipeline.yaml, i.e. the sinks. But how does this agent decide which sinks to apply to a Sample? The algorithm is simple: we read the Sample from the "notification.sample" queue and check its "counter_name" (note that "interval" plays no role at this stage). We then send the Sample to every sink whose corresponding source has a matching rule in its "meters" section. In the example above, every 10 minutes there are Samples for all possible meters, including "cpu", so every 10 minutes cpu_sink and cpu_delta_sink are also applied to the Sample from meter_source (which is actually wrong, because those sinks belong to cpu_source; in Juno only meter_sink was applied). And every minute (your change) there is a single Sample with "counter_name=cpu", to which all the sinks are applied again, because "cpu" satisfies both the "*" and the "cpu" rules in the source descriptions. That is why you see "cpu"-related samples every minute. I assume that every 10 minutes you may even see two sets of "cpu", "cpu.delta" and "cpu_util" samples.
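
A minimal sketch of that matching step in Python (an illustration of the logic as described, not the actual Ceilometer code; source and sink names are taken from the pipeline.yaml above):

    import fnmatch

    # Map each source to its meter rules and sinks, as in pipeline.yaml above.
    SOURCES = {
        "meter_source": {"meters": ["*"], "sinks": ["meter_sink"]},
        "cpu_source": {"meters": ["cpu"], "sinks": ["cpu_sink", "cpu_delta_sink"]},
    }

    def sinks_for(counter_name):
        # Only counter_name is consulted -- the source's "interval" is ignored.
        matched = []
        for source in SOURCES.values():
            if any(fnmatch.fnmatch(counter_name, rule) for rule in source["meters"]):
                matched.extend(source["sinks"])
        return matched

    # Every "cpu" Sample, whether produced by the 600s task or the 60s task,
    # matches BOTH sources, so all three sinks fire each time:
    print(sinks_for("cpu"))  # ['meter_sink', 'cpu_sink', 'cpu_delta_sink']
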
    The main problem in Ceilometer now (starting from Liberty) is that pipeline.yaml is not fully applied by either the Polling agents (Central and Compute) or the Notification agent. On the polling side we look only at the sources and ignore the transformer information; on the notification side we ignore the intervals.
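
Put differently, each agent consumes only part of the same file. Annotating the example above (the comments are mine, not part of the file):

    sources:
        - name: cpu_source
          interval: 60      # polling agents only: when to run the pollster
          meters:
              - "cpu"       # polling: which pollsters; notification: routing
          sinks:
              - cpu_sink    # notification agent only: transformers/publishers
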
    I don't see any workaround in your case, because you need "cpu" samples to construct "cpu_util", and you need them every 60 seconds, i.e. you have to receive "cpu" every minute, and it's impossible to configure "please write down cpu S...


Revision history for this message
gordon chung (chungg) wrote :
Changed in ceilometer:
status: New → Confirmed
Revision history for this message
gordon chung (chungg) wrote :

i'm marking this won't fix. i think the benefits of the change outweigh the difference in experience. that said, if you use gnocchi or most other tsdbs, you can define policies to match the level of roll-up you expect.
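
For example, with gnocchi the roll-up granularity is defined by an archive policy rather than by the polling interval; something along these lines (illustrative only: the policy name is made up and the exact option syntax should be checked against the gnocchi client docs):

    gnocchi archive-policy create -d granularity:1m,timespan:1h \
                                  -d granularity:10m,timespan:1d \
                                  cpu-rollup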

Changed in ceilometer:
status: Confirmed → Won't Fix