duplicate samples in ceilometer mongo db

Bug #1394600 reported by Vladimir Grujic
This bug affects 1 person
Affects             Status     Importance  Assigned to     Milestone
Mirantis OpenStack  Won't Fix  Medium      MOS Ceilometer
5.1.x               Won't Fix  Medium      MOS Ceilometer
6.0.x               Won't Fix  Medium      MOS Ceilometer
6.1.x               Won't Fix  Medium      MOS Ceilometer
7.0.x               Won't Fix  Medium      MOS Ceilometer

Bug Description

I have a 5.1 HA deployed system with 3 controllers.
When looking at the ceilometer sample-list output, I noticed duplicated samples for the same resource_id within the same emission interval. Those samples have microseconds in their timestamp field, while normal samples do not. Looking at the rabbitmq log, I see those messages received multiple times. It looks like a messaging issue between ceilometer-compute-agent and rabbitmq.
BTW, duplicated samples break the stats on the Resource Usage graphs. The graphs are inconsistent with the actual state of things in the cloud.

Vladimir

Revision history for this message
Vladimir Grujic (hyperbaba) wrote :

Update,

Ivan Berezovskiy found out that the source of the duplicate messages is Nova, which sends them via the notification.info channel.
For example, upon instance creation 18 messages are sent from nova to notification.info. 12 of them contain metering data (the same data). Those messages are picked up by ceilometer-agent-notification and written to mongo via ceilometer-collector.

Revision history for this message
Dmitry Mescheryakov (dmitrymex) wrote :

Ceilometer team, please check if the issue is reproducible on 6.0 as well

Changed in mos:
assignee: nobody → MOS Ceilometer (mos-ceilometer)
milestone: none → 5.1.2
importance: Undecided → High
status: New → Confirmed
tags: added: ceilometer
Revision history for this message
Ivan Berezovskiy (iberezovskiy) wrote :

Each message with metering data for the notification agent contains several metric types: memory, instance, disk.root.size, vcpus, etc. (http://docs.openstack.org/developer/ceilometer/measurements.html#compute-nova). That means more than one sample is written from a single message (one sample per metric type). Each sample contains information about only one metric type (memory, or instance, or disk.root.size, and so on). So if we have 12 messages, we will have 12 samples for memory, 12 samples for instance, etc. There is no duplication on the Ceilometer side.
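To make the arithmetic concrete, here is a minimal Python sketch of the fan-out described above. It is not Ceilometer code: the `samples_from_notification` helper, the message layout, the timestamps and the shortened meter list are all assumptions made for illustration.

```python
from collections import Counter

# Assumption: a shortened meter list taken from the comment above;
# the real set of meters derived from a nova notification is longer.
METERS = ("memory", "instance", "disk.root.size", "vcpus")


def samples_from_notification(notification):
    """Hypothetical helper: build one sample per meter from one message."""
    payload = notification["payload"]
    return [
        {
            "counter_name": meter,
            "resource_id": payload["instance_id"],
            "timestamp": notification["timestamp"],
        }
        for meter in METERS
    ]


# Twelve notifications about the same instance, as reported in this bug.
# The timestamps are made up.
notifications = [
    {
        "payload": {"instance_id": "vm-1"},
        "timestamp": "2014-11-20T10:00:%02d" % i,
    }
    for i in range(12)
]

samples = [s for n in notifications for s in samples_from_notification(n)]

# Count how many samples each meter received for this one instance.
per_meter = Counter(s["counter_name"] for s in samples)
```

With these assumptions, every meter ends up with 12 samples for the same resource_id, which is what shows up as "duplicates" in sample-list.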

We need to research why nova sends so many messages about one instance.

Ilya Tyaptin will take a look at this, but we also need research from the nova side.

Revision history for this message
Ivan Berezovskiy (iberezovskiy) wrote :

Messages in rabbit queue notification.info: http://paste.openstack.org/show/135906/

tags: added: nova
Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

We checked the messages Nova sends (http://paste.openstack.org/show/135906/) and IMHO, it's not a Nova issue: it just sends a message for every state of instance boot (start, scheduling, building, spawning, port creation, etc.). It's left up to whatever collects those messages to decide how to treat them.

I have no idea how Ceilometer works, but I suppose it has some kind of 'drivers' which receive these messages, analyze them and put them into a DB. From what I see here, what you really want is to take the metering values from only some of the messages, not from all of them (note that for other consumers it *may* make sense to provide the data in each message, despite the fact that the existing 'driver' in Ceilometer doesn't want that).
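One possible shape of that consumer-side filtering, as a rough Python sketch. This is not how Ceilometer is implemented: the `should_meter` and `filter_notifications` helpers and the choice of which event types to keep are assumptions for illustration only, and the event sequence below is a made-up example rather than the content of the paste.

```python
# Assumption: only the final "create.end" event should produce metering
# samples. Which event types to keep is a policy decision for the consumer.
METERED_EVENT_TYPES = frozenset({"compute.instance.create.end"})


def should_meter(notification, allowed=METERED_EVENT_TYPES):
    """Return True if this notification should be turned into samples."""
    return notification.get("event_type") in allowed


def filter_notifications(notifications, allowed=METERED_EVENT_TYPES):
    """Keep only the notifications the consumer wants to meter."""
    return [n for n in notifications if should_meter(n, allowed)]


# Illustrative boot sequence for a single instance (made-up example).
boot_sequence = [
    {"event_type": "compute.instance.create.start", "instance_id": "vm-1"},
    {"event_type": "compute.instance.update", "instance_id": "vm-1"},
    {"event_type": "compute.instance.update", "instance_id": "vm-1"},
    {"event_type": "compute.instance.update", "instance_id": "vm-1"},
    {"event_type": "compute.instance.create.end", "instance_id": "vm-1"},
]

metered = filter_notifications(boot_sequence)
```

Under these assumptions only one of the five messages would be metered, while the other consumers of notification.info still receive every message unchanged.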

Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

Untriaging this for now. Needs more eyes from the Ceilometer team and more discussion. Feel free to reach us in Skype/IRC/email.

Revision history for this message
Dmitry Mescheryakov (dmitrymex) wrote :

I've moved the issue to confirmed since it exists. We are not going to fix it, but rather we will document a proper workaround for it.

Changed in mos:
status: Confirmed → Won't Fix
tags: added: release-note