potential race condition in publisher cache

Bug #1523331 reported by ZhiQiang Fan
This bug affects 1 person
Affects: Ceilometer
Status: Invalid
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

Observed this warning in the gate logs: http://logs.openstack.org/19/250019/2/gate/gate-tempest-dsvm-ceilometer-mysql-neutron-full/d2ebdc8/logs/screen-ceilometer-anotification.txt.gz?level=WARNING#_2015-11-27_19_53_31_206

I think it is caused by two threads operating on the same queue: after one thread pops an element, the other one fails.

The two threads might end up with the same queue reference because a context switch happens right at https://github.com/openstack/ceilometer/blob/07e9066fc0eb4d76104486f28349bd1e60713650/ceilometer/publisher/messaging.py#L146-L147 : before thread A clears self.local_queue, it may yield for some reason, and thread B then gets the uncleared queue, so both threads hold the same queue.
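To make the suspected interleaving concrete, here is a minimal sketch that replays it step by step in a single thread, so the outcome is deterministic. The Publisher class and its grab_queue/reset_queue methods are hypothetical stand-ins and are not the Ceilometer code; only the attribute name local_queue comes from the report.

```python
class Publisher:
    """Hypothetical stand-in for the publisher; not the Ceilometer class."""

    def __init__(self):
        self.local_queue = []

    def grab_queue(self):
        # Step 1 of an unsynchronized flush: take a reference to the queue.
        return self.local_queue

    def reset_queue(self):
        # Step 2: rebind the attribute to a fresh, empty list.
        self.local_queue = []


pub = Publisher()
pub.local_queue.extend(["sample-1", "sample-2"])

# Simulated interleaving: "thread A" runs step 1, then "thread B" runs
# step 1 before A has reached step 2.
queue_a = pub.grab_queue()
queue_b = pub.grab_queue()
pub.reset_queue()

# Both "threads" now hold the very same list object.
same_object = queue_a is queue_b

# A drains the queue it believes it owns.
sent_by_a = []
while queue_a:
    sent_by_a.append(queue_a.pop(0))

# B tries to pop from what is now an empty list and fails.
try:
    queue_b.pop(0)
    b_failed = False
except IndexError:
    b_failed = True
```

If the two steps were always executed together without interruption, queue_b would instead be the fresh list created by A's reset.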

I think a lock might be needed when processing a shared queue object.
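A rough sketch of what such a lock could look like, assuming a simplified publisher: the append and the swap of local_queue happen under one threading.Lock, so no two threads can ever obtain the same list. LockedPublisher and its send callback are hypothetical names, and retry or requeue handling is left out.

```python
import threading


class LockedPublisher:
    """Hypothetical publisher that swaps its queue under a lock."""

    def __init__(self, send):
        self._send = send  # callable that delivers one sample
        self._lock = threading.Lock()
        self.local_queue = []

    def publish(self, samples):
        with self._lock:
            self.local_queue.extend(samples)
            # Take ownership of the current list and install a fresh one
            # in a single critical section.
            queue, self.local_queue = self.local_queue, []
        # Deliver outside the lock so slow sends do not block other threads.
        for item in queue:
            self._send(item)


delivered = []
publisher = LockedPublisher(delivered.append)


def worker(worker_id):
    for n in range(100):
        publisher.publish([(worker_id, n)])


threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each sample is delivered exactly once, because every list handed to the delivery loop is owned by a single thread.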

ZhiQiang Fan (aji-zqfan) wrote :

Or can we rely on the duplicate-message handling on the consumer side and just ignore this case?
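For illustration, consumer-side duplicate handling could be as simple as the sketch below. It assumes every message carries a unique "message_id" key; DedupingConsumer is a hypothetical class, and the set of seen ids grows without bound here, which a real consumer would need to cap.

```python
class DedupingConsumer:
    """Hypothetical consumer that drops messages it has already seen."""

    def __init__(self):
        self._seen = set()
        self.processed = []

    def handle(self, message):
        # Assumption: each message has a unique "message_id" field.
        msg_id = message["message_id"]
        if msg_id in self._seen:
            return False  # duplicate, ignored
        self._seen.add(msg_id)
        self.processed.append(message)
        return True


consumer = DedupingConsumer()
first = consumer.handle({"message_id": "abc", "volume": 1})
second = consumer.handle({"message_id": "abc", "volume": 1})
third = consumer.handle({"message_id": "def", "volume": 2})
```

This only covers samples that get sent twice; it does not help the thread that fails when it finds its queue already emptied.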

Changed in ceilometer:
assignee: nobody → ZhiQiang Fan (aji-zqfan)
Rohit Jaiswal (rohit-jaiswal-3) wrote :

I think a better way is to flush in a different thread than the threads that add to the queue, see https://github.com/openstack/monasca-ceilometer/blob/master/ceilosca/ceilometer/publisher/monclient.py#L94
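A minimal sketch of that idea, not taken from the linked file: producers only put samples on a thread-safe queue.Queue, and one dedicated thread is the sole consumer, so no two threads ever flush the same data. BackgroundFlusher, its send callback, and the _STOP sentinel are hypothetical names.

```python
import queue
import threading

_STOP = object()  # sentinel telling the flusher thread to exit


class BackgroundFlusher:
    """Hypothetical publisher with a single dedicated flushing thread."""

    def __init__(self, send):
        self._send = send  # callable that delivers one sample
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def publish(self, sample):
        # Called from any thread; Queue.put is thread-safe.
        self._queue.put(sample)

    def close(self):
        # Everything queued before the sentinel is delivered first.
        self._queue.put(_STOP)
        self._thread.join()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is _STOP:
                break
            self._send(item)


sent = []
flusher = BackgroundFlusher(sent.append)


def producer(producer_id):
    for n in range(50):
        flusher.publish((producer_id, n))


producers = [threading.Thread(target=producer, args=(i,)) for i in range(4)]
for t in producers:
    t.start()
for t in producers:
    t.join()
flusher.close()
```

Since only the flusher thread ever removes items, the producers need no locking of their own.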

gordon chung (chungg) wrote :

i'm seeing this a lot now as well

Changed in ceilometer:
status: New → Triaged
importance: Undecided → High
gordon chung (chungg) wrote :

this could be addressed by https://review.openstack.org/#/c/275741/

gordon chung (chungg)
Changed in ceilometer:
importance: High → Medium
assignee: ZhiQiang Fan (aji-zqfan) → nobody
gordon chung (chungg) wrote :

closing, we have a single thread handling publishing

Changed in ceilometer:
status: Triaged → Invalid