Polling interval behaving differently between Juno and Liberty

Bug #1570967 reported by rezroo
This bug affects 1 person
Affects: Ceilometer
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned

Bug Description

I have been using ceilometer in devstack for a while. What I used to be able to do is modify pipeline.yaml to set the polling interval for "name: cpu_source" to 60, and then publish via udp. The diff of the changes against the Liberty pipeline.yaml is below:

    stack@vlab:/etc/ceilometer$ diff pipeline.yaml.bak pipeline.yaml
    10c10
    < interval: 600
    ---
    > interval: 60
    53c53
    < - notifier://
    ---
    > - udp://127.0.0.1:4952
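
For context, in the stock Liberty pipeline.yaml the cpu samples are turned into cpu_util by a rate_of_change transformer in cpu_sink; with the udp publisher swapped in, that sink looks roughly like this (reconstructed from the default file, so line 53 may belong to a different sink in your copy):

    - name: cpu_sink
      transformers:
          - name: "rate_of_change"
            parameters:
                target:
                    name: "cpu_util"
                    unit: "%"
                    type: "gauge"
                    scale: "100.0 / (10**9 * (resource_metadata.cpu_number or 1))"
      publishers:
          - udp://127.0.0.1:4952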

The desired and net effect of these changes in Juno was that I would get cpu_util samples every minute, but cpu meters every 10 minutes.

These same changes in Liberty do not have the same effect: I get cpu, cpu_util, and cpu.delta samples every minute. Why are cpu samples arriving every minute?

    stack@lab:~$ ceilometer sample-list -m cpu | head
    +--------------------------------------+------+------------+------------+------+----------------------------+
    | Resource ID                          | Name | Type       | Volume     | Unit | Timestamp                  |
    +--------------------------------------+------+------------+------------+------+----------------------------+
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu  | cumulative | 1.0746e+11 | ns   | 2016-04-08T13:48:17.380281 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu  | cumulative | 1.0739e+11 | ns   | 2016-04-08T13:47:17.357667 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu  | cumulative | 1.073e+11  | ns   | 2016-04-08T13:46:17.414188 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu  | cumulative | 1.0723e+11 | ns   | 2016-04-08T13:45:17.356869 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu  | cumulative | 1.0715e+11 | ns   | 2016-04-08T13:44:17.357771 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu  | cumulative | 1.0707e+11 | ns   | 2016-04-08T13:43:17.349820 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu  | cumulative | 1.0699e+11 | ns   | 2016-04-08T13:42:17.348110 |

    stack@lab:~$ ceilometer sample-list -m cpu.delta | head
    +--------------------------------------+-----------+-------+-------------+------+----------------------------+
    | Resource ID                          | Name      | Type  | Volume      | Unit | Timestamp                  |
    +--------------------------------------+-----------+-------+-------------+------+----------------------------+
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu.delta | delta | 70000000.0  | ns   | 2016-04-08T13:48:17.380281 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu.delta | delta | 90000000.0  | ns   | 2016-04-08T13:47:17.357667 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu.delta | delta | 70000000.0  | ns   | 2016-04-08T13:46:17.414188 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu.delta | delta | 80000000.0  | ns   | 2016-04-08T13:45:17.356869 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu.delta | delta | 80000000.0  | ns   | 2016-04-08T13:44:17.357771 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu.delta | delta | 80000000.0  | ns   | 2016-04-08T13:43:17.349820 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu.delta | delta | 80000000.0  | ns   | 2016-04-08T13:42:17.348110 |

    stack@lab:~$ ceilometer sample-list -m cpu_util | head
    +--------------------------------------+----------+-------+----------------+------+----------------------------+
    | Resource ID                          | Name     | Type  | Volume         | Unit | Timestamp                  |
    +--------------------------------------+----------+-------+----------------+------+----------------------------+
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu_util | gauge | 0.116622711567 | %    | 2016-04-08T13:48:17.380281 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu_util | gauge | 0.150141435735 | %    | 2016-04-08T13:47:17.357667 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu_util | gauge | 0.116555319427 | %    | 2016-04-08T13:46:17.414188 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu_util | gauge | 0.133335337808 | %    | 2016-04-08T13:45:17.356869 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu_util | gauge | 0.133315666786 | %    | 2016-04-08T13:44:17.357771 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu_util | gauge | 0.133329533442 | %    | 2016-04-08T13:43:17.349820 |
    | 6c4d5e4c-066e-4e98-b728-caaa66d8bf3b | cpu_util | gauge | 0.133329311232 | %    | 2016-04-08T13:42:17.348110 |
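
As a sanity check, the cpu_util values above are exactly what a rate-of-change calculation over consecutive cpu samples yields, assuming a single-vCPU instance (which these numbers imply). A quick verification in Python:

    # Derive the 13:48:17 cpu_util value from the two newest cpu samples above.
    delta_ns = 1.0746e11 - 1.0739e11              # 7.0e7 ns of CPU time consumed
    wall_s = 60 + (17.380281 - 17.357667)         # ~60.02 s between the samples
    cpu_util = 100.0 * delta_ns / (wall_s * 1e9)  # percent of one vCPU
    print(cpu_util)                               # ~0.11662, matching 0.116622711567 %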

Revision history for this message
rezroo (r3za) wrote :

The response from the ceilometer team is below:

Thank you very much for this question. Your understanding is correct. In Liberty we did a refactoring in Ceilometer that changed the behaviour in the way you've described. Let me try to explain in detail.

Each cpu_util sample is derived from a cpu sample. In Juno, this transformation took place in the Compute agent. In Liberty, we started doing it in the Notification agent, and that is the root cause of the issue. Consider the following pipeline.yaml file:
    sources:
        - name: meter_source
          interval: 600
          meters:
              - "*"
          sinks:
              - meter_sink
        - name: cpu_source
          interval: 60        # <-- your change
          meters:
              - "cpu"
          sinks:
              - cpu_sink
              - cpu_delta_sink

It's written here that all available pollsters (see the ceilometer.poll.compute section of setup.cfg) should run every 10 minutes, with "meter_sink" applied to the results. In addition, every 60 seconds you want to run _only_ the cpu pollster (ceilometer.compute.pollsters.cpu:CPUPollster) and transform its results as described in cpu_sink and cpu_delta_sink. In Juno, all of this happened in the Compute agent.
    In Liberty the situation changed. In the example above, all pollsters run once every 600 seconds and the results are sent to RabbitMQ (to a special queue, "notification.sample"). In addition, every 60 seconds only the CPUPollster runs, and its results are sent to the queue as well. A notification agent then reads these messages and applies the second "part" of pipeline.yaml, i.e. the sinks. But how does this agent decide which sinks to apply to a Sample? The algorithm is simple: we read the Sample from the "notification.sample" queue and check its "counter_name" (note that "interval" plays no role at this stage). We then send the Sample to every sink whose corresponding source has a matching rule in its "meters" section. In the example above, every 10 minutes there are Samples for all possible meters, including "cpu", so every 10 minutes cpu_sink and cpu_delta_sink are also applied to the Sample from meter_source (which is actually wrong, because those sinks belong to cpu_source; in Juno only meter_sink was applied). And every minute (your change) there is a single Sample with "counter_name=cpu", to which all the sinks are applied again, because "cpu" satisfies both the "*" and the "cpu" rules in the source descriptions. That is why you see "cpu"-related samples every minute. I assume that every 10 minutes you may even see two sets of "cpu", "cpu.delta" and "cpu_util" samples.
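
A minimal sketch of that matching step in Python (an illustration of the logic as described, not the actual Ceilometer code; source and sink names are taken from the pipeline.yaml above):

    import fnmatch

    # Map each source to its meter rules and sinks, as in pipeline.yaml above.
    SOURCES = {
        "meter_source": {"meters": ["*"], "sinks": ["meter_sink"]},
        "cpu_source": {"meters": ["cpu"], "sinks": ["cpu_sink", "cpu_delta_sink"]},
    }

    def sinks_for(counter_name):
        # Only counter_name is consulted -- the source's "interval" is ignored.
        matched = []
        for source in SOURCES.values():
            if any(fnmatch.fnmatch(counter_name, rule) for rule in source["meters"]):
                matched.extend(source["sinks"])
        return matched

    # Every "cpu" Sample, whether produced by the 600s task or the 60s task,
    # matches BOTH sources, so all three sinks fire each time:
    print(sinks_for("cpu"))  # ['meter_sink', 'cpu_sink', 'cpu_delta_sink']
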
    The main problem in Ceilometer now (starting from Liberty) is that pipeline.yaml is not fully applied by either the Polling agents (Central and Compute) or the Notification agent. On the polling side we look only at the sources and ignore the transformer information; on the notification side we ignore the intervals.
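
Put differently, each agent consumes only part of the same file. Annotating the example above (the comments are mine, not part of the file):

    sources:
        - name: cpu_source
          interval: 60      # polling agents only: when to run the pollster
          meters:
              - "cpu"       # polling: which pollsters; notification: routing
          sinks:
              - cpu_sink    # notification agent only: transformers/publishers
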
    I don't see any workaround in your case, because you need "cpu" samples to construct "cpu_util", and you need them every 60 seconds, i.e. you have to receive "cpu" every minute, and it's impossible to configure "please write down cpu S...


Revision history for this message
gordon chung (chungg) wrote :
Changed in ceilometer:
status: New → Confirmed
Revision history for this message
gordon chung (chungg) wrote :

i'm marking this won't fix. i think the benefits of the change outweigh the difference in experience. that said, if you use gnocchi or most other tsdbs, you can define policies to match the level of roll-up you expect.
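
For example, with gnocchi the roll-up granularity is defined by an archive policy rather than by the polling interval; something along these lines (illustrative only: the policy name is made up and the exact option syntax should be checked against the gnocchi client docs):

    gnocchi archive-policy create -d granularity:1m,timespan:1h \
                                  -d granularity:10m,timespan:1d \
                                  cpu-rollup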

Changed in ceilometer:
status: Confirmed → Won't Fix