Duplicate meter samples

Bug #1496777 reported by Avi Weit
This bug affects 1 person
Affects     Status        Importance  Assigned to  Milestone
Ceilometer  Fix Released  Low         Xia Linjuan

Bug Description

I am using ceilometer (devstack master on Ubuntu 14.04.3) with the attached ceilometer.conf (controller IP replaced with 1.2.3.4).

According to devstack/settings (under ceilometer tree), one can append ,profiler to "notification_topics" config var:

# To enable OSprofiler change value of this variable to "notifications,profiler"
CEILOMETER_NOTIFICATION_TOPICS=${CEILOMETER_NOTIFICATION_TOPICS:-notifications}

Doing that causes ceilometer to emit duplicate samples to the collector service:

    ceilometer sample-list -m cpu -q 'resource_id=d7e45859-f4e0-40d6-8379-9850fd04a9ea' --limit 30
    +--------------------------------------+------+------------+---------------+------+----------------------------+
    | Resource ID                          | Name | Type       | Volume        | Unit | Timestamp                  |
    +--------------------------------------+------+------------+---------------+------+----------------------------+
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu  | cumulative | 42770000000.0 | ns   | 2015-09-17T09:05:12.677403 |
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu  | cumulative | 42770000000.0 | ns   | 2015-09-17T09:05:12.677403 |
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu  | cumulative | 42610000000.0 | ns   | 2015-09-17T09:04:12.670136 |
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu  | cumulative | 42610000000.0 | ns   | 2015-09-17T09:04:12.670136 |
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu  | cumulative | 42460000000.0 | ns   | 2015-09-17T09:03:12.804537 |
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu  | cumulative | 42460000000.0 | ns   | 2015-09-17T09:03:12.804537 |
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu  | cumulative | 42340000000.0 | ns   | 2015-09-17T09:02:14.791002 |
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu  | cumulative | 42340000000.0 | ns   | 2015-09-17T09:02:14.791002 |

cpu_util samples are not calculated properly (probably because the cpu meter their transformer depends on is duplicated):

    ceilometer sample-list -m cpu_util -q 'resource_id=d7e45859-f4e0-40d6-8379-9850fd04a9ea' --limit 30
    +--------------------------------------+----------+-------+----------------+------+----------------------------+
    | Resource ID                          | Name     | Type  | Volume         | Unit | Timestamp                  |
    +--------------------------------------+----------+-------+----------------+------+----------------------------+
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu_util | gauge | 0.233274003978 | %    | 2015-09-17T09:26:12.683027 |
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu_util | gauge | 0.0            | %    | 2015-09-17T09:26:12.683027 |
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu_util | gauge | 0.266814584225 | %    | 2015-09-17T09:25:12.667767 |
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu_util | gauge | 0.0            | %    | 2015-09-17T09:25:12.667767 |
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu_util | gauge | 0.266985496297 | %    | 2015-09-17T09:24:12.701030 |
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu_util | gauge | 0.0            | %    | 2015-09-17T09:24:12.701030 |
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu_util | gauge | 0.268260694066 | %    | 2015-09-17T09:23:12.772681 |
    | d7e45859-f4e0-40d6-8379-9850fd04a9ea | cpu_util | gauge | 0.0            | %    | 2015-09-17T09:23:12.772681 |
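
One plausible reading of the alternating 0.0 values, sketched in Python: if cpu_util is derived as a rate of change between consecutive cpu samples, then the second copy of a duplicated sample is compared against its own twin, so both the volume delta and the time delta are zero. This is a simplified illustration, not Ceilometer's actual transformer code; the function name, the single-vCPU scale factor, and the choice to report 0.0 for a zero time delta are assumptions.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%f"


def rate_of_change(prev, cur, scale=100.0 / 1e9):
    """Percent CPU utilisation between two (timestamp, cpu_ns) samples.

    Hypothetical helper: `scale` converts ns of CPU time per second of
    wall-clock time into a percentage, assuming a single vCPU.
    """
    (prev_ts, prev_ns), (cur_ts, cur_ns) = prev, cur
    seconds = (cur_ts - prev_ts).total_seconds()
    if seconds == 0:
        # Assumption: a zero time delta is reported as 0.0 instead of raising.
        return 0.0
    return (cur_ns - prev_ns) / seconds * scale


# Two consecutive cpu samples taken from the listing in this report.
older = (datetime.strptime("2015-09-17T09:04:12.670136", FMT), 42610000000.0)
newer = (datetime.strptime("2015-09-17T09:05:12.677403", FMT), 42770000000.0)

first_copy = rate_of_change(older, newer)   # previous sample is a minute old
second_copy = rate_of_change(newer, newer)  # previous sample is the duplicate

print(first_copy, second_copy)
```

Under these assumptions the first copy yields a small non-zero percentage and the duplicate yields exactly 0.0, which matches the pattern in the cpu_util listing.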

Removing ,profiler from "notification_topics" in /etc/ceilometer/ceilometer.conf and restarting the ceilometer services makes ceilometer behave properly again.

Avi Weit (weit) wrote :
Zi Lian Ji (jizilian)
Changed in ceilometer:
assignee: nobody → Zi Lian Ji (jizilian)
Rohit Jaiswal (rohit-jaiswal-3) wrote :

This occurs because the notification agent binds queues to both the notifications and profiler topics. If you list the queues using sudo rabbitmqctl list_queues | grep info, you will see this:

notifications.info
profiler.info

I am not sure why the new osprofiler topic needs to be configured in CEILOMETER_NOTIFICATION_TOPICS to receive osprofiler.* events; it should work without that change. It could have something to do with this: https://github.com/openstack/ceilometer/blob/master/ceilometer/notification.py#L156

Avi Weit (weit) wrote :

I would just like to mention that the duplication is also observed when appending any other value to "notification_topics", e.g. foo:

[DEFAULT]
collector_workers = 2
debug = True
verbose = True
notification_topics = notifications,foo
rpc_backend = rabbit
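
A quick way to spot this configuration is to parse the file and count the topics. The sketch below uses Python's standard configparser on the fragment above; the helper name is hypothetical, and falling back to "notifications" when the option is absent is an assumption based on the devstack default quoted earlier.

```python
import configparser


def configured_topics(conf_text):
    """Return the notification_topics values from ceilometer.conf-style text."""
    # interpolation=None so that literal '%' characters in a real config
    # file are not treated as interpolation markers.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    # Assumption: "notifications" is used when the option is not set.
    raw = parser.defaults().get("notification_topics", "notifications")
    return [topic.strip() for topic in raw.split(",") if topic.strip()]


CONF = """
[DEFAULT]
collector_workers = 2
debug = True
verbose = True
notification_topics = notifications,foo
rpc_backend = rabbit
"""

topics = configured_topics(CONF)
if len(topics) > 1:
    print("more than one notification topic configured:", topics)
```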

gordon chung (chungg) wrote :

this is real.

the problem is right now we send data from polling agent to notification agent... it uses the notification_topics option. therefore, we end up pushing the polled data to both queues: notifications.sample and <xyz>.sample. we also listen to both queues... thus we get double values.

it's arguable where the bug is. the code functions as it should -- just that the ceilometer.conf param should probably be different between polling agent and notification agent.

Changed in ceilometer:
status: New → Triaged
importance: Undecided → Medium
importance: Medium → Low
Zi Lian Ji (jizilian)
Changed in ceilometer:
assignee: Zi Lian Ji (jizilian) → nobody
Xia Linjuan (ljxiash)
Changed in ceilometer:
assignee: nobody → Xia Linjuan (ljxiash)
Xia Linjuan (ljxiash) wrote :

gordon is right. With "notification_topics = notifications,foo" configured, the same information is published to two queues. By default we have only one listener, and it listens to both queues. That's why we get duplicate data.

What confuses me is that in the code (lines 229-234) every message_url listens to all the notification_topics. I think configuring "foo" means you want a copy of the messages sent to another message_url, not to the default one.

So my proposed solution is to make the default message_url listen only to "notifications", so that we don't get duplicate data.
If "foo" is also configured, the default listener ignores that topic and leaves it to the other message_url.

209 targets = []
210 for ext in notification_manager:
211     handler = ext.obj
212     if (cfg.CONF.notification.disable_non_metric_meters and
213             isinstance(handler, base.NonMetricNotificationBase)):
214         continue
.
.
.
223     for new_tar in handler.get_targets(cfg.CONF):
224         if new_tar not in targets:
225             targets.append(new_tar)
226     endpoints.append(handler)
227
228 urls = cfg.CONF.notification.messaging_urls or [None]
229 for url in urls:
230     transport = messaging.get_transport(url)
231     listener = messaging.get_notification_listener(
232         transport, targets, endpoints)
233     listener.start()
234     self.listeners.append(listener)
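
The proposal in this comment could look roughly like the following. It is only an illustration of the idea, not code from ceilometer: Target is a stand-in for a messaging target, and targets_for_url, the "nova" exchange, and the example URL are hypothetical.

```python
from collections import namedtuple

# Stand-in for a messaging target; only the fields needed here.
Target = namedtuple("Target", ["topic", "exchange"])


def targets_for_url(url, targets, default_topic="notifications"):
    """Pick which targets the listener for `url` should subscribe to."""
    if url is None:
        # Default transport: subscribe to the default topic only.
        return [t for t in targets if t.topic == default_topic]
    # Explicitly configured transport: keep every configured topic.
    return list(targets)


targets = [Target("notifications", "nova"), Target("foo", "nova")]

default_targets = targets_for_url(None, targets)
extra_targets = targets_for_url("rabbit://other-host/", targets)

print(default_targets)
print(extra_targets)
```

With this filter, the listener on the default transport would no longer consume the "foo" copy, while a listener on an explicitly configured messaging URL would still see both topics.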

gordon chung (chungg) wrote :

i think the confusion is that ceilometer.conf is being reused for both polling agent and notification agent. the problem is that they use the same option to do different things.

in polling agent, notification_topics option defines which queues to *send* data to. if you specify two queues, you *send* each datapoint to two queues.

in notification agent, notification_topics option defines which queues to *listen* to. if you specify two topics, you *listen* to two queues.

one option is to create a topic specifically for ipc between polling and notification agent.
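
The interaction described here can be reproduced with a toy model in Python. Plain lists stand in for the message queues, so nothing below uses oslo.messaging or ceilometer code; the function names and the "polling_ipc" topic are hypothetical.

```python
from collections import defaultdict


def publish(queues, topics, sample):
    """Polling-agent side: send one sample to '<topic>.sample' for every topic."""
    for topic in topics:
        queues[f"{topic}.sample"].append(sample)


def consume(queues, topics):
    """Notification-agent side: drain '<topic>.sample' for every topic."""
    received = []
    for topic in topics:
        received.extend(queues[f"{topic}.sample"])
        queues[f"{topic}.sample"].clear()
    return received


def round_trip(send_topics, listen_topics, sample):
    """Publish one sample, then return everything the listener receives."""
    queues = defaultdict(list)
    publish(queues, send_topics, sample)
    return consume(queues, listen_topics)


sample = ("cpu", 42770000000.0)
shared = ["notifications", "foo"]

# Same option value used for sending and listening: two copies arrive.
duplicated = round_trip(shared, shared, sample)

# A dedicated topic for polling -> notification traffic: one copy arrives,
# even though the listener still subscribes to the shared topics.
dedicated = round_trip(["polling_ipc"], shared + ["polling_ipc"], sample)

print(len(duplicated), len(dedicated))
```

In this model the duplicate disappears as soon as the sending side stops fanning out over every topic the listening side subscribes to, which is the effect a dedicated ipc topic would have.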

Changed in ceilometer:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to ceilometer (master)

Reviewed: https://review.openstack.org/241073
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=b55ffaa1f8aa863dc3806d99e3de7f46f8a971ad
Submitter: Jenkins
Branch: master

commit b55ffaa1f8aa863dc3806d99e3de7f46f8a971ad
Author: xialinjuan <email address hidden>
Date: Tue Nov 17 17:53:19 2015 +0800

    Clarify the doc about multiple notification_topics usage

    Make clear that the multiple notification topic usage in ceilometer.conf
    in case of mistakenly configured that cause duplicate samples.

    DocImpact
    Closes-Bug: #1496777
    Change-Id: I7af36f8e7615ccc31ffad94169b4e5ce1f9df84f

Changed in ceilometer:
status: In Progress → Fix Committed
Thierry Carrez (ttx) wrote : Fix included in openstack/ceilometer 6.0.0.0b1

This issue was fixed in the openstack/ceilometer 6.0.0.0b1 development milestone.

Thierry Carrez (ttx)
Changed in ceilometer:
status: Fix Committed → Fix Released
Liusheng (liusheng)
Changed in ceilometer:
milestone: none → mitaka-1