Collector dispatches sample/event multiple times to dispatcher when multiple dispatchers are configured

Bug #1437689 reported by Rohit Jaiswal
This bug affects 1 person
Affects     Status     Importance  Assigned to    Milestone
Ceilometer  Won't Fix  Low         Rohit Jaiswal

Bug Description

requeue_event_on_dispatcher_error is a good feature, but it can cause
data to be dispatched multiple times in the following scenario.

When multiple dispatchers are enabled and one of them, say the database dispatcher, raises an exception, the message is requeued. The http or file dispatcher may already have succeeded, so it handles the same message a second time when the message is redelivered. How can we avoid this? Does it depend on the http target to filter out duplicate data?
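
The scenario can be reproduced with a small in-memory simulation. This is only a sketch, not Ceilometer code: `FlakyDispatcher`, `run_collector`, and the `deque` standing in for the metering queue are hypothetical names, and the collector loop is a simplification that assumes every dispatcher is tried before the message is requeued.

```python
from collections import deque


class FlakyDispatcher:
    """Hypothetical dispatcher that fails its first `failures` calls."""

    def __init__(self, name, failures=0):
        self.name = name
        self.failures = failures
        self.received = []

    def record(self, sample):
        if self.failures > 0:
            self.failures -= 1
            raise RuntimeError(f"{self.name} dispatcher failed")
        self.received.append(sample)


def run_collector(queue, dispatchers):
    """Drain the queue, requeueing a message if any dispatcher fails."""
    while queue:
        sample = queue.popleft()
        failed = False
        for dispatcher in dispatchers:
            try:
                dispatcher.record(sample)
            except RuntimeError:
                failed = True
        if failed:
            # The whole message goes back, so every dispatcher sees it again.
            queue.append(sample)


database = FlakyDispatcher("database", failures=1)
http = FlakyDispatcher("http")
run_collector(deque(["sample-1"]), [database, http])

print(database.received)  # ['sample-1']
print(http.received)      # ['sample-1', 'sample-1']
```

The database dispatcher ends up with one copy, while the http dispatcher records the sample twice because it had already succeeded before the requeue.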

Changed in ceilometer:
assignee: nobody → Rohit Jaiswal (rohit-jaiswal-3)
description: updated
Rohit Jaiswal (rohit-jaiswal-3) wrote :

It might be complex for a dispatcher to handle duplicate data. Is it possible for each dispatcher to requeue data on a separate error queue specific to that dispatcher? On a failure, each dispatcher would queue the data on its own error queue, instead of the collector requeueing it on the metering queue. There is a fixed number of dispatchers, so the number and size of these error queues should stay within bounds, assuming a limited number of failures. This frees the metering queue from the burden of failed data and also allows the collector to continue accepting new data.

ZhiQiang Fan (aji-zqfan) wrote :

I think the right direction may be that each dispatcher should handle duplicate data on its own.

Rohit Jaiswal (rohit-jaiswal-3) wrote :

I agree; I just think that having each dispatcher handle duplicate data in its own way is probably not the best solution. I was thinking about an approach that would avoid data duplication while requiring minimal work from each new dispatcher.

Rohit Jaiswal (rohit-jaiswal-3) wrote :

Here's what I was thinking:

1. When a dispatcher fails to process a sample/event, it raises the exception to the collector worker and puts the failed sample/event on a queue named metering-<dispatcher>.error.

2. The collector worker receives the exception and does not requeue the sample on the metering queue. Instead, the collector calls a new method in each dispatcher to check for failed samples/events in its own queue. Each dispatcher defines and knows its error queue (metering-<dispatcher>.error).

3. All known and configured dispatchers check their respective error queues for failed samples/events, but only the dispatchers that find something in their error queue attempt to reprocess it, by calling their storage layer repeatedly.

This adds a new dependency: the dispatchers would depend on oslo.messaging.
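
The three steps above could look roughly like the following. This is an in-memory sketch under stated assumptions, not an implementation: a `deque` stands in for the oslo.messaging-backed metering-<dispatcher>.error queue, and `Dispatcher`, `_store`, `retry_failed`, and `collect` are hypothetical names.

```python
from collections import deque


class Dispatcher:
    """Hypothetical dispatcher that owns its own error queue."""

    def __init__(self, name, failures=0):
        self.name = name
        self.failures = failures  # number of storage calls that will fail
        self.received = []
        # Stands in for the queue named "metering-<name>.error".
        self.error_queue = deque()

    def _store(self, sample):
        if self.failures > 0:
            self.failures -= 1
            raise RuntimeError(f"{self.name} storage failed")
        self.received.append(sample)

    def record(self, sample):
        # Step 1: on failure, park the sample and raise to the collector.
        try:
            self._store(sample)
        except RuntimeError:
            self.error_queue.append(sample)
            raise

    def retry_failed(self):
        # Step 3: only a dispatcher with queued failures does any work.
        for _ in range(len(self.error_queue)):
            sample = self.error_queue.popleft()
            try:
                self._store(sample)
            except RuntimeError:
                self.error_queue.append(sample)


def collect(sample, dispatchers):
    for dispatcher in dispatchers:
        try:
            dispatcher.record(sample)
        except RuntimeError:
            pass  # Step 2: do not requeue on the metering queue.
    for dispatcher in dispatchers:
        dispatcher.retry_failed()


database = Dispatcher("database", failures=1)
http = Dispatcher("http")
collect("sample-1", [database, http])

print(database.received)  # ['sample-1']
print(http.received)      # ['sample-1']
```

In this sketch the failed sample is retried only by the dispatcher that failed, so the http dispatcher sees it once.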

gordon chung (chungg) wrote :

so Rohit and i spoke on irc (http://eavesdrop.openstack.org/irclogs/%23openstack-ceilometer/%23openstack-ceilometer.2015-03-30.log)

this can be solved by not enabling multiple dispatchers but instead publishing to two different queues, and having two collectors (each with a single dispatch target) listen to one queue each... i think because of that this bug is less important. also, i think it might actually be a bit more efficient to do it this way than having a dispatcher manage its own set of queues.
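
A minimal in-memory sketch of that layout follows, to show why a requeue no longer affects the other target. The names `publish`, `make_store`, and `run_collector` are hypothetical, and plain `deque` objects stand in for the two message queues; no Ceilometer configuration is implied.

```python
from collections import deque


def publish(sample, queues):
    """Publish one copy of the sample to every queue."""
    for queue in queues:
        queue.append(sample)


def make_store(received, failures=0):
    """Return a hypothetical storage call that fails `failures` times."""
    remaining = [failures]

    def store(sample):
        if remaining[0] > 0:
            remaining[0] -= 1
            raise RuntimeError("storage failed")
        received.append(sample)

    return store


def run_collector(queue, store):
    """Collector with a single dispatch target; requeues only its own queue."""
    while queue:
        sample = queue.popleft()
        try:
            store(sample)
        except RuntimeError:
            queue.append(sample)


database_queue, http_queue = deque(), deque()
database_received, http_received = [], []

publish("sample-1", [database_queue, http_queue])
run_collector(database_queue, make_store(database_received, failures=1))
run_collector(http_queue, make_store(http_received))

print(database_received)  # ['sample-1']
print(http_received)      # ['sample-1']
```

Because each collector requeues only onto its own queue, a database failure is retried without the http target seeing the sample again.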

Changed in ceilometer:
importance: Undecided → Low
gordon chung (chungg)
Changed in ceilometer:
status: New → Won't Fix