failed ack when using batched notification listener

Bug #1536174 reported by gordon chung
18
This bug affects 3 people
Affects Status Importance Assigned to Milestone
oslo.messaging
Fix Released
High
Mehdi Abaakouk

Bug Description

occassionally, oslo.messaging will throw an error saying: Fail to ack/requeue message when using batch listener[1]:

2016-01-20 09:36:13.992 29715 ERROR oslo_messaging.notify.dispatcher [-] Fail to ack/requeue message
2016-01-20 09:36:13.992 29715 ERROR oslo_messaging.notify.dispatcher Traceback (most recent call last):
2016-01-20 09:36:13.992 29715 ERROR oslo_messaging.notify.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/notify/dispatcher.py", line 75, in _post_dispatch
2016-01-20 09:36:13.992 29715 ERROR oslo_messaging.notify.dispatcher m.acknowledge()
2016-01-20 09:36:13.992 29715 ERROR oslo_messaging.notify.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 117, in acknowledge
2016-01-20 09:36:13.992 29715 ERROR oslo_messaging.notify.dispatcher self.acknowledge_callback()
2016-01-20 09:36:13.992 29715 ERROR oslo_messaging.notify.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/impl_rabbit.py", line 186, in acknowledge
2016-01-20 09:36:13.992 29715 ERROR oslo_messaging.notify.dispatcher self._raw_message.ack()
2016-01-20 09:36:13.992 29715 ERROR oslo_messaging.notify.dispatcher File "/usr/local/lib/python2.7/dist-packages/kombu/message.py", line 88, in ack
2016-01-20 09:36:13.992 29715 ERROR oslo_messaging.notify.dispatcher self.channel.basic_ack(self.delivery_tag)
2016-01-20 09:36:13.992 29715 ERROR oslo_messaging.notify.dispatcher File "/usr/local/lib/python2.7/dist-packages/amqp/channel.py", line 1584, in basic_ack
2016-01-20 09:36:13.992 29715 ERROR oslo_messaging.notify.dispatcher self._send_method((60, 80), args)
2016-01-20 09:36:13.992 29715 ERROR oslo_messaging.notify.dispatcher File "/usr/local/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 50, in _send_method
2016-01-20 09:36:13.992 29715 ERROR oslo_messaging.notify.dispatcher raise RecoverableConnectionError('connection already closed')
2016-01-20 09:36:13.992 29715 ERROR oslo_messaging.notify.dispatcher RecoverableConnectionError: connection already closed

this happens with batch_size = 10 and batch_timeout = 5 but i've seend it happen basically when batch_size != 1

[1] http://logs.openstack.org/94/269594/8/check/gate-ceilometer-dsvm-integration/f8322df/logs/screen-ceilometer-collector.txt.gz

Revision history for this message
gordon chung (chungg) wrote :

this seems to be happening pretty regularly in the gate. i'm going to mark as high, someone please feel free to drop or raise this priority

Changed in oslo.messaging:
importance: Undecided → High
status: New → Confirmed
Revision history for this message
gordon chung (chungg) wrote :

still an issue with 4.6.1

Revision history for this message
Mehdi Abaakouk (sileht) wrote :
Changed in oslo.messaging:
status: Confirmed → Fix Committed
assignee: nobody → Mehdi Abaakouk (sileht)
Changed in oslo.messaging:
status: Fix Committed → Fix Released
Revision history for this message
jazeltq (jazeltq-k) wrote :

Can someone tell more details on this? I met this on ceilometer notification's log very often...
And the queue consumer is also disappeared....

Revision history for this message
jazeltq (jazeltq-k) wrote :

Is this really fixed? Ceilometer use stable/newton oslo.messaging 5.10.2 with threading executor met this in notification's log...
[root@controller-1-jxq-pod1-rack4 _drivers]# rpm -qa | grep oslo.messaging
python2-oslo-messaging-5.10.2-1.el7.noarch

2017-10-19 11:05:18.113 195838 ERROR root [-] Unexpected exception occurred 60 time(s)... retrying.
2017-10-19 11:05:18.113 195838 ERROR root Traceback (most recent call last):
2017-10-19 11:05:18.113 195838 ERROR root File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 250, in wrapper
2017-10-19 11:05:18.113 195838 ERROR root return infunc(*args, **kwargs)
2017-10-19 11:05:18.113 195838 ERROR root File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/base.py", line 203, in _runner
2017-10-19 11:05:18.113 195838 ERROR root batch_size=self.batch_size, batch_timeout=self.batch_timeout)
2017-10-19 11:05:18.113 195838 ERROR root File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/base.py", line 52, in wrapper
2017-10-19 11:05:18.113 195838 ERROR root msg = func(in_self, timeout=timeout_watch.leftover(True))
2017-10-19 11:05:18.113 195838 ERROR root File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 286, in poll
2017-10-19 11:05:18.113 195838 ERROR root self._message_operations_handler.process()
2017-10-19 11:05:18.113 195838 ERROR root File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 89, in process
2017-10-19 11:05:18.113 195838 ERROR root task()
2017-10-19 11:05:18.113 195838 ERROR root File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 251, in acknowledge
2017-10-19 11:05:18.113 195838 ERROR root self._raw_message.ack()
2017-10-19 11:05:18.113 195838 ERROR root File "/usr/lib/python2.7/site-packages/kombu/message.py", line 88, in ack
2017-10-19 11:05:18.113 195838 ERROR root self.channel.basic_ack(self.delivery_tag)
2017-10-19 11:05:18.113 195838 ERROR root File "/usr/lib/python2.7/site-packages/amqp/channel.py", line 1583, in basic_ack
2017-10-19 11:05:18.113 195838 ERROR root self._send_method((60, 80), args)
2017-10-19 11:05:18.113 195838 ERROR root File "/usr/lib/python2.7/site-packages/amqp/abstract_channel.py", line 50, in _send_method
2017-10-19 11:05:18.113 195838 ERROR root raise RecoverableConnectionError('connection already closed')
2017-10-19 11:05:18.113 195838 ERROR root RecoverableConnectionError: connection already closed

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.