Reply queues are accidentally not found when services try to consume from them

Bug #1415932 reported by Nickita Zaporozhets
This bug affects 14 people
Affects              Status        Importance  Assigned to          Milestone
Mirantis OpenStack   Fix Released  Critical    Dmitry Mescheryakov
5.1.x                Fix Released  Critical    Alexey Khivin
6.0.x                Fix Released  Critical    Alexey Khivin
6.1.x                Fix Released  Critical    Alexey Khivin
7.0.x                Fix Released  Critical    Dmitry Mescheryakov

Bug Description

VERSION:
  feature_groups:
    - mirantis
    - experimental
  production: "docker"
  release: "6.0"
  api: "1.0"
  build_number: "58"
  build_id: "2014-12-26_14-25-46"
  astute_sha: "16b252d93be6aaa73030b8100cf8c5ca6a970a91"
  fuellib_sha: "fde8ba5e11a1acaf819d402c645c731af450aff0"
  ostf_sha: "a9afb68710d809570460c29d6c3293219d3624d4"
  nailgun_sha: "5f91157daa6798ff522ca9f6d34e7e135f150a90"
  fuelmain_sha: "81d38d6f2903b5a8b4bee79ca45a54b76c1361b8"

Nova scheduler loses its RabbitMQ queue while booting a new instance. A queue with that name exists (according to list queues). Restarting nova-scheduler solves the problem.

The same is true for other services (cinder, neutron), so this is not nova-specific.

Diagnostic snapshot: https://drive.google.com/a/mirantis.com/file/d/0B1aGKcp6WbtcRFRjUjZKLUo5Nmc/view?usp=sharing

 node-2 nova-api Connected to AMQP server on 127.0.0.1:5673
<182>Jan 29 13:04:45 node-2 nova-api 152.90.66.10 "POST /v2/9168b29b5b1d43dfa41a9914de3c6c0b/servers HTTP/1.1" status: 202 len: 755 time: 0.4488471
<179>Jan 29 13:04:45 node-2 nova-conductor Failed to consume message from queue: Basic.consume: (404) NOT_FOUND - no queue 'reply_cfeb6683351949ecae09082f682b15c9' in vhost '/'
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit Traceback (most recent call last):
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/impl_rabbit.py", line 681, in ensure
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit return method()
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/impl_rabbit.py", line 765, in _consume
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit queues_tail.consume(nowait=False)
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/impl_rabbit.py", line 214, in consume
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit self.queue.consume(*args, callback=_callback, **options)
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/kombu/entity.py", line 611, in consume
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit nowait=nowait)
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/amqp/channel.py", line 1787, in basic_consume
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit (60, 21), # Channel.basic_consume_ok
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 69, in wait
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit return self.dispatch_method(method_sig, args, content)
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 87, in dispatch_method
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit return amqp_method(self, args)
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/amqp/channel.py", line 241, in _close
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit reply_code, reply_text, (class_id, method_id), ChannelError,
2015-01-29 13:04:45.355 51229 TRACE oslo.messaging._drivers.impl_rabbit NotFound: Basic.consume: (404) NOT_FOUND - no queue 'reply_cfeb6683351949ecae09082f682b15c9' in vhost '/'
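
For triage, the name of the missing queue can be pulled out of such log lines mechanically. The following is a minimal sketch and not part of oslo.messaging: the function name find_missing_queue and the regular expression are illustrative assumptions derived only from the message text shown above.

```python
import re

# Pattern derived from the error text in the log above.
# This is an assumption about the message layout, not an official format.
NOT_FOUND_RE = re.compile(
    r"\(404\) NOT_FOUND - no queue '(?P<queue>[^']+)' in vhost '(?P<vhost>[^']+)'"
)


def find_missing_queue(log_line):
    """Return (queue, vhost) if the line reports a missing queue, else None."""
    match = NOT_FOUND_RE.search(log_line)
    if match is None:
        return None
    return match.group("queue"), match.group("vhost")


line = (
    "<179>Jan 29 13:04:45 node-2 nova-conductor Failed to consume message "
    "from queue: Basic.consume: (404) NOT_FOUND - no queue "
    "'reply_cfeb6683351949ecae09082f682b15c9' in vhost '/'"
)
print(find_missing_queue(line))
```

The extracted name can then be compared against the broker's queue listing to see whether the queue really is gone on the node the service is connected to.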

 node-2 nova-conductor Exception during scheduler.run_instance
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver Traceback (most recent call last):
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 614, in build_instances
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver request_spec, filter_properties)
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver File "/usr/lib/python2.7/dist-packages/nova/scheduler/client/__init__.py", line 49, in select_destinations
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver context, request_spec, filter_properties)
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver File "/usr/lib/python2.7/dist-packages/nova/scheduler/client/__init__.py", line 35, in __run_method
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver return getattr(self.instance, __name)(*args, **kwargs)
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver File "/usr/lib/python2.7/dist-packages/nova/scheduler/client/query.py", line 34, in select_destinations
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver context, request_spec, filter_properties)
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver File "/usr/lib/python2.7/dist-packages/nova/scheduler/rpcapi.py", line 108, in select_destinations
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver request_spec=request_spec, filter_properties=filter_properties)
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver File "/usr/lib/python2.7/dist-packages/oslo/messaging/rpc/client.py", line 152, in call
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver retry=self.retry)
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver File "/usr/lib/python2.7/dist-packages/oslo/messaging/transport.py", line 90, in _send
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver timeout=timeout, retry=retry)
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 434, in send
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver retry=retry)
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 423, in _send
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver result = self._waiter.wait(msg_id, timeout)
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 289, in wait
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver reply, ending = self._poll_connection(msg_id, timeout)
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 239, in _poll_connection
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver % msg_id)
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver MessagingTimeout: Timed out waiting for a reply to message ID 52ee60fcecf74ebebd713afd91bf6238
2015-01-29 13:05:45.404 51229 TRACE nova.scheduler.driver
<180>Jan 29 13:05:45 node-2 nova-conductor Setting instance to ERROR state.

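The two traces above are about one minute apart. That gap would be consistent with the caller waiting out an RPC reply timeout of roughly 60 seconds on a reply queue it could no longer consume from; the timeout value is an inference from the timestamps, not something the logs state. A small sketch of the arithmetic, using the timestamps copied from the traces:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the two tracebacks in the description.
consume_failed = datetime.strptime("2015-01-29 13:04:45.355", FMT)
timed_out = datetime.strptime("2015-01-29 13:05:45.404", FMT)

# Seconds between the failed Basic.consume and the MessagingTimeout.
elapsed = (timed_out - consume_failed).total_seconds()
print(round(elapsed, 3))
```
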
tags: added: nova
tags: added: messaging
removed: nova
summary: - Nova scheduler loses rabbit queue
+ Reply queues are accidentally not found when services try to consume
+ from them
Changed in mos:
status: New → Incomplete
assignee: nobody → MOS Oslo (mos-oslo)
description: updated
description: updated
Roman Podoliaka (rpodolyaka) wrote :

I took a look at the diagnostic snapshot. It's still not clear what exactly happens, but it must have nothing to do with nova services specifically, as I can see the same error in neutron and cinder logs.

Updated the description accordingly. I'd like the messaging guys to take a look.

I wonder if this may be caused by x-expires being set to 3600000 on reply queues (plus, those are not durable).
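
To make the hypothesis concrete: in RabbitMQ, the x-expires queue argument is a time in milliseconds after which an unused queue (no consumers, not redeclared, no basic.get) is deleted. The sketch below only models that rule in plain Python and does not talk to a broker; queue_expired and its parameters are hypothetical names chosen for illustration.

```python
from datetime import datetime, timedelta

X_EXPIRES_MS = 3600000  # value mentioned in the comment above


def queue_expired(last_used, now, x_expires_ms=X_EXPIRES_MS, has_consumers=False):
    """Model of the expiry rule: an unused queue is dropped after x_expires_ms."""
    if has_consumers:
        return False  # a queue with consumers counts as in use
    return now - last_used >= timedelta(milliseconds=x_expires_ms)


last_used = datetime(2015, 1, 29, 12, 0, 0)
print(timedelta(milliseconds=X_EXPIRES_MS))  # the TTL expressed as a duration
print(queue_expired(last_used, datetime(2015, 1, 29, 12, 59, 59)))
print(queue_expired(last_used, datetime(2015, 1, 29, 13, 0, 0)))
```

Under this model, a reply queue whose consumer stays disconnected for longer than the window could be gone by the time the service tries to consume from it again.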

Roman Alekseenkov (ralekseenkov) wrote :

Why did we move a customer-found bug to the Incomplete state?

Let's get on it and get more info from the bug reporter. Just tell Nickita what he needs to provide.

Roman Podoliaka (rpodolyaka) wrote :

Roman, I only did that, because we hadn't managed to reproduce the bug on any of our 6.0 environments.

I took a look at the diagnostic snapshot Nickita provided, but it's hard to understand what exactly caused the problem: rabbitmq seems to work fine, yet reply queues disappear sporadically.

What we need right now is an env where the error has been reproduced. And I still want the mos-oslo guys to take a look. Maybe they will be able to tell more.

Sergey (svtvin) wrote :

Every 2nd or 3rd time I launch a new instance I get the same error too.
I have an env where this error reproduces, and I can provide you access.
My env was installed from scratch.

VERSION:
  feature_groups:
    - mirantis
    - experimental
  production: "docker"
  release: "6.0"
  api: "1.0"
  build_number: "58"
  build_id: "2014-12-26_14-25-46"
  astute_sha: "16b252d93be6aaa73030b8100cf8c5ca6a970a91"
  fuellib_sha: "fde8ba5e11a1acaf819d402c645c731af450aff0"
  ostf_sha: "a9afb68710d809570460c29d6c3293219d3624d4"
  nailgun_sha: "5f91157daa6798ff522ca9f6d34e7e135f150a90"
  fuelmain_sha: "81d38d6f2903b5a8b4bee79ca45a54b76c1361b8"

Denis Ipatov (dipatov)
Changed in mos:
importance: Undecided → High
importance: High → Critical
Davanum Srinivas (DIMS) (dims-v) wrote :

Roman,

The traceback looks similar to the one in this upstream bug:
https://bugs.launchpad.net/oslo.messaging/+bug/1318721

So we should try to see if the review below works:
https://review.openstack.org/#/c/103157/

-- dims

Timur Nurlygayanov (tnurlygayanov) wrote :

Looks like we have found the same issues in the past:
https://bugs.launchpad.net/mos/+bug/1364480

^^ The workaround for this issue is to run only one RabbitMQ service; in this case queues should be found successfully.

Could anyone who has an environment where this issue can be reproduced try this workaround?

Timur Nurlygayanov (tnurlygayanov) wrote :

Looks like this issue can be reproduced only in an HA environment with 2-3 OpenStack controller nodes. I will try to reproduce the issue in my test environment.

Changed in mos:
milestone: none → 6.1
Alexey Khivin (akhivin) wrote :

"Every 2nd, 3rd time when I launch an new instance I got the same error too."

At first glance, it looks like one of the rabbitmq nodes dropped out of the cluster.

Davanum Srinivas (DIMS) (dims-v) wrote :

FYI, the stack trace mentioned in the description is really, really old. It is not related to the problem Paul reported yesterday.

Timur Nurlygayanov (tnurlygayanov) wrote :

I suggest keeping this bug in the Incomplete state until I reproduce the issue. I want to reproduce it in my environment and add some comments, because it is a critical issue.

Changed in mos:
status: Incomplete → In Progress
Timur Nurlygayanov (tnurlygayanov) wrote :

Looks like we have a critical bug both in the RabbitMQ configuration and in oslo.messaging; we need to fix these issues.

Timur Nurlygayanov (tnurlygayanov) wrote :

Steps To Reproduce:
1. Deploy an OpenStack cloud with 3 controllers.
2. Shut down one controller with the 'poweroff' command.
3. Check the status of the cluster.

Observed Result:
Cluster is broken.

tags: added: sahara
tags: removed: sahara
Nikita Gubenko (nikita-gubenko) wrote :

Hitting the same issue on MOS 6.0 (centos) after the hard reboot of one of the controllers.
Any fix found yet?

Sergey (svtvin) wrote :

I hit this bug in 70-90% of cases when opening the console link for instances:

Links like this:
http://10.0.4.2/horizon/project/instances/c946f126-c1d2-4188-9327-c73afb4c714e/?tab=instance_details__console

<179>Mar 20 15:13:58 node-64 nova-api Failed to consume message from queue: Basic.consume: (404) NOT_FOUND - no queue 'reply_f067833fc2dc4a2995411140a64eedff' in vhost '/'
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit Traceback (most recent call last):
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/impl_rabbit.py", line 681, in ensure
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit return method()
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/impl_rabbit.py", line 765, in _consume
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit queues_tail.consume(nowait=False)
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/impl_rabbit.py", line 214, in consume
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit self.queue.consume(*args, callback=_callback, **options)
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/kombu/entity.py", line 611, in consume
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit nowait=nowait)
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/amqp/channel.py", line 1787, in basic_consume
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit (60, 21), # Channel.basic_consume_ok
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 69, in wait
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit return self.dispatch_method(method_sig, args, content)
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 87, in dispatch_method
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit return amqp_method(self, args)
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/lib/python2.7/dist-packages/amqp/channel.py", line 241, in _close
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit reply_code, reply_text, (class_id, method_id), ChannelError,
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit NotFound: Basic.consume: (404) NOT_FOUND - no queue 'reply_f067833fc2dc4a2995411140a64eedff' in vhost '/'
2015-03-20 15:13:58.166 30676 TRACE oslo.messaging._drivers.impl_rabbit

Sergey (svtvin) wrote :

As a temporary workaround, I found a solution:
Stop rabbitmq-server service on all control nodes, except one active control node,
the one that holds the API IP. (service rabbitmq-server stop)

Sergey (svtvin) wrote :

> Stop rabbitmq-server service on all control nodes, except one active
It helps only partly.

As a full solution for me, for now:
Downgrade to rabbitmq-server v2.8.7 on all control nodes.
Looks like it works fine.

Changed in mos:
assignee: MOS Oslo (mos-oslo) → Alex Khivin (akhivin)
Alexey Khivin (akhivin) wrote :

The story of queue life

http://paste.openstack.org/show/197823/

The easiest fix I see is to remove {"x-expires",3600000} for this kind of queue, but it needs a little more time for investigation.
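
As a rough illustration of that idea only (this is not the merged change, and reply_queue_arguments is a hypothetical helper, not an oslo.messaging function), the difference comes down to whether the x-expires key is present in the arguments passed when the reply queue is declared:

```python
def reply_queue_arguments(expire_ms=None):
    """Build optional queue arguments; x-expires is omitted when expire_ms is None."""
    arguments = {}
    if expire_ms is not None:
        arguments["x-expires"] = expire_ms  # milliseconds of allowed idleness
    return arguments


# Behaviour described in this thread: reply queues carry a one-hour TTL.
print(reply_queue_arguments(3600000))
# Proposed direction: declare the queue without the TTL argument.
print(reply_queue_arguments())
```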

OSCI Robot (oscirobot) wrote :

RPM package oslo.messaging has been built for project openstack/oslo.messaging
Package version == 1.3.1, package release == fuel5.1.1.mira1.git.376d4f3.a06572b

Changeset: https://review.fuel-infra.org/5431
project: openstack/oslo.messaging
branch: openstack-ci/fuel-5.1.1-updates/2014.1.1
author: Alex Khivin
committer: Alex Khivin
subject: Reply queues are accidentally not found
status: patchset-created

Files placed on repository:
python-oslo-messaging-1.3.1-fuel5.1.1.mira1.git.376d4f3.a06572b.noarch.rpm
python-oslo-messaging-doc-1.3.1-fuel5.1.1.mira1.git.376d4f3.a06572b.noarch.rpm

NOTE: Changeset is not merged, created temporary package repository.
RPM repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-5.1.1-updates-stable-5431/centos

OSCI Robot (oscirobot) wrote :

DEB package oslo.messaging has been built for project openstack/oslo.messaging
Package version == 1.3.1, package release == fuel5.1.1~mira1+git.376d4f3.a06572b

Changeset: https://review.fuel-infra.org/5431
project: openstack/oslo.messaging
branch: openstack-ci/fuel-5.1.1-updates/2014.1.1
author: Alex Khivin
committer: Alex Khivin
subject: Reply queues are accidentally not found
status: patchset-created

Files placed on repository:
python-oslo.messaging_1.3.1-fuel5.1.1~mira1+git.376d4f3.a06572b_all.deb

NOTE: Changeset is not merged, created temporary package repository.
DEB repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-5.1.1-updates-stable-5431/ubuntu

Alexey Khivin (akhivin)
Changed in mos:
status: In Progress → Won't Fix
status: Won't Fix → In Progress
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/oslo.messaging (openstack-ci/fuel-5.1.1-updates/2014.1.1)

Fix proposed to branch: openstack-ci/fuel-5.1.1-updates/2014.1.1
Change author: Alex Khivin <email address hidden>
Review: https://review.fuel-infra.org/5431

OSCI Robot (oscirobot) wrote :

RPM package oslo.messaging has been built for project openstack/oslo.messaging
Package version == 1.3.1, package release == fuel5.1.1.mira1.git.a860294.a06572b

Changeset: https://review.fuel-infra.org/5431
project: openstack/oslo.messaging
branch: openstack-ci/fuel-5.1.1-updates/2014.1.1
author: Alex Khivin
committer: Alex Khivin
subject: Reply queues are accidentally not found
status: patchset-created

Files placed on repository:
python-oslo-messaging-1.3.1-fuel5.1.1.mira1.git.a860294.a06572b.noarch.rpm
python-oslo-messaging-doc-1.3.1-fuel5.1.1.mira1.git.a860294.a06572b.noarch.rpm

NOTE: Changeset is not merged, created temporary package repository.
RPM repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-5.1.1-updates-stable-5431/centos

OSCI Robot (oscirobot) wrote :

DEB package oslo.messaging has been built for project openstack/oslo.messaging
Package version == 1.3.1, package release == fuel5.1.1~mira1+git.a860294.a06572b

Changeset: https://review.fuel-infra.org/5431
project: openstack/oslo.messaging
branch: openstack-ci/fuel-5.1.1-updates/2014.1.1
author: Alex Khivin
committer: Alex Khivin
subject: Reply queues are accidentally not found
status: patchset-created

Files placed on repository:
python-oslo.messaging_1.3.1-fuel5.1.1~mira1+git.a860294.a06572b_all.deb

NOTE: Changeset is not merged, created temporary package repository.
DEB repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-5.1.1-updates-stable-5431/ubuntu

OSCI Robot (oscirobot) wrote :

RPM package oslo.messaging has been built for project openstack/oslo.messaging
Package version == 1.3.1, package release == fuel5.1.2.mira2.git.ea37859.a06572b

Changeset: https://review.fuel-infra.org/5868
project: openstack/oslo.messaging
branch: openstack-ci/fuel-5.1.2/2014.1.1
author: Alex Khivin
committer: Alex Khivin
subject: Reply queues are accidentally not found
status: patchset-created

Files placed on repository:
python-oslo-messaging-1.3.1-fuel5.1.2.mira2.git.ea37859.a06572b.noarch.rpm
python-oslo-messaging-doc-1.3.1-fuel5.1.2.mira2.git.ea37859.a06572b.noarch.rpm

NOTE: Changeset is not merged, created temporary package repository.
RPM repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-5.1.2-stable-5868/centos

OSCI Robot (oscirobot) wrote :

DEB package oslo.messaging has been built for project openstack/oslo.messaging
Package version == 1.3.1, package release == fuel5.1.2~mira2+git.ea37859.a06572b

Changeset: https://review.fuel-infra.org/5868
project: openstack/oslo.messaging
branch: openstack-ci/fuel-5.1.2/2014.1.1
author: Alex Khivin
committer: Alex Khivin
subject: Reply queues are accidentally not found
status: patchset-created

Files placed on repository:
python-oslo.messaging_1.3.1-fuel5.1.2~mira2+git.ea37859.a06572b_all.deb

NOTE: Changeset is not merged, created temporary package repository.
DEB repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-5.1.2-stable-5868/ubuntu

Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/oslo.messaging (openstack-ci/fuel-6.0.1/2014.2)

Fix proposed to branch: openstack-ci/fuel-6.0.1/2014.2
Change author: Alex Khivin <email address hidden>
Review: https://review.fuel-infra.org/5882

OSCI Robot (oscirobot) wrote :

RPM package oslo.messaging has been built for project openstack/oslo.messaging
Package version == 1.4.1, package release == fuel6.0.1.mira25.git.a5e26a6.f9e74fb

Changeset: https://review.fuel-infra.org/5882
project: openstack/oslo.messaging
branch: openstack-ci/fuel-6.0.1/2014.2
author: Alex Khivin
committer: Alex Khivin
subject: Reply queues are accidentally not found
status: patchset-created

Files placed on repository:
python-oslo-messaging-1.4.1-fuel6.0.1.mira25.git.a5e26a6.f9e74fb.noarch.rpm
python-oslo-messaging-doc-1.4.1-fuel6.0.1.mira25.git.a5e26a6.f9e74fb.noarch.rpm

NOTE: Changeset is not merged, created temporary package repository.
RPM repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-6.0.1-stable-5882/centos

OSCI Robot (oscirobot) wrote :

DEB package oslo.messaging has been built for project openstack/oslo.messaging
Package version == 1.4.1, package release == fuel6.0.1~mira23+git.a5e26a6.f9e74fb

Changeset: https://review.fuel-infra.org/5882
project: openstack/oslo.messaging
branch: openstack-ci/fuel-6.0.1/2014.2
author: Alex Khivin
committer: Alex Khivin
subject: Reply queues are accidentally not found
status: patchset-created

Files placed on repository:
python-oslo.messaging_1.4.1-fuel6.0.1~mira23+git.a5e26a6.f9e74fb_all.deb

NOTE: Changeset is not merged, created temporary package repository.
DEB repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-6.0.1-stable-5882/ubuntu

Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/oslo.messaging (openstack-ci/fuel-6.0-updates/2014.2)

Fix proposed to branch: openstack-ci/fuel-6.0-updates/2014.2
Change author: Alex Khivin <email address hidden>
Review: https://review.fuel-infra.org/5888

OSCI Robot (oscirobot) wrote :

RPM package oslo.messaging has been built for project openstack/oslo.messaging
Package version == 1.4.1, package release == fuel6.0.mira25.git.f859eb0.ae0dd8f

Changeset: https://review.fuel-infra.org/5888
project: openstack/oslo.messaging
branch: openstack-ci/fuel-6.0-updates/2014.2
author: Alex Khivin
committer: Alex Khivin
subject: Reply queues are accidentally not found
status: patchset-created

Files placed on repository:
python-oslo-messaging-1.4.1-fuel6.0.mira25.git.f859eb0.ae0dd8f.noarch.rpm
python-oslo-messaging-doc-1.4.1-fuel6.0.mira25.git.f859eb0.ae0dd8f.noarch.rpm

NOTE: Changeset is not merged, created temporary package repository.
RPM repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-6.0-updates-stable-5888/centos

OSCI Robot (oscirobot) wrote :

DEB package oslo.messaging has been built for project openstack/oslo.messaging
Package version == 1.4.1, package release == fuel6.0~mira23+git.f859eb0.ae0dd8f

Changeset: https://review.fuel-infra.org/5888
project: openstack/oslo.messaging
branch: openstack-ci/fuel-6.0-updates/2014.2
author: Alex Khivin
committer: Alex Khivin
subject: Reply queues are accidentally not found
status: patchset-created

Files placed on repository:
python-oslo.messaging_1.4.1-fuel6.0~mira23+git.f859eb0.ae0dd8f_all.deb

NOTE: Changeset is not merged, created temporary package repository.
DEB repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-6.0-updates-stable-5888/ubuntu

OSCI Robot (oscirobot) wrote :

RPM package oslo.messaging has been built for project openstack/oslo.messaging
Package version == 1.4.1, package release == fuel6.0.mira25.git.ce9e999.ae0dd8f

Changeset: https://review.fuel-infra.org/5888
project: openstack/oslo.messaging
branch: openstack-ci/fuel-6.0-updates/2014.2
author: Alex Khivin
committer: Alex Khivin
subject: Reply queues are accidentally not found
status: patchset-created

Files placed on repository:
python-oslo-messaging-1.4.1-fuel6.0.mira25.git.ce9e999.ae0dd8f.noarch.rpm
python-oslo-messaging-doc-1.4.1-fuel6.0.mira25.git.ce9e999.ae0dd8f.noarch.rpm

NOTE: Changeset is not merged, created temporary package repository.
RPM repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-6.0-updates-stable-5888/centos

OSCI Robot (oscirobot) wrote :

DEB package oslo.messaging has been built for project openstack/oslo.messaging
Package version == 1.4.1, package release == fuel6.0~mira23+git.ce9e999.ae0dd8f

Changeset: https://review.fuel-infra.org/5888
project: openstack/oslo.messaging
branch: openstack-ci/fuel-6.0-updates/2014.2
author: Alex Khivin
committer: Alex Khivin
subject: Reply queues are accidentally not found
status: patchset-created

Files placed on repository:
python-oslo.messaging_1.4.1-fuel6.0~mira23+git.ce9e999.ae0dd8f_all.deb

NOTE: Changeset is not merged, created temporary package repository.
DEB repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-6.0-updates-stable-5888/ubuntu

Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/oslo.messaging (openstack-ci/fuel-6.0-updates/2014.2)

Reviewed: https://review.fuel-infra.org/5888
Submitter: Alex Khivin <email address hidden>
Branch: openstack-ci/fuel-6.0-updates/2014.2

Commit: ce9e999eea8fbb04df6c69e1bb138c02f7aa8562
Author: Alex Khivin <email address hidden>
Date: Fri Apr 17 10:47:58 2015

Reply queues are accidentally not found

Closes-Bug: 1415932
Change-Id: Ic4fb12f8f9858c98a173bea46c2454443c96a681
(cherry picked from commit a8602948191ff24d32a50cbf0b03ecfee8407423)

Changed in mos:
status: In Progress → Fix Committed
OSCI Robot (oscirobot) wrote :

RPM package oslo.messaging has been built for project openstack/oslo.messaging
Package version == 1.4.1, package release == fuel6.0.mira25

Changeset: https://review.fuel-infra.org/5888
project: openstack/oslo.messaging
branch: openstack-ci/fuel-6.0-updates/2014.2
author: Alex Khivin
committer: Alex Khivin
subject: Reply queues are accidentally not found
status: change-merged

Files placed on repository:
python-oslo-messaging-1.4.1-fuel6.0.mira25.noarch.rpm
python-oslo-messaging-doc-1.4.1-fuel6.0.mira25.noarch.rpm

Changeset merged. Package placed on primary repository
RPM repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-6.0-updates-stable/centos

OSCI Robot (oscirobot) wrote :

DEB package oslo.messaging has been built for project openstack/oslo.messaging
Package version == 1.4.1, package release == fuel6.0~mira23

Changeset: https://review.fuel-infra.org/5888
project: openstack/oslo.messaging
branch: openstack-ci/fuel-6.0-updates/2014.2
author: Alex Khivin
committer: Alex Khivin
subject: Reply queues are accidentally not found
status: change-merged

Files placed on repository:
python-oslo.messaging_1.4.1-fuel6.0~mira23_all.deb

Changeset merged. Package placed on primary repository
DEB repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-6.0-updates-stable/ubuntu

Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/oslo.messaging (openstack-ci/fuel-6.1/2014.2)

Fix proposed to branch: openstack-ci/fuel-6.1/2014.2
Change author: Alex Khivin <email address hidden>
Review: https://review.fuel-infra.org/5954

OSCI Robot (oscirobot) wrote :

Fix proposed to branch: openstack-ci/fuel-6.1/2014.2
Review: https://review.fuel-infra.org/5954

Mateusz Matuszkowiak (mmatuszkowiak) wrote :

Changing to In Progress, since robot should not do that, sorry.

Changed in mos:
status: Fix Committed → In Progress
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/oslo.messaging (openstack-ci/fuel-6.1/2014.2)

Reviewed: https://review.fuel-infra.org/5954
Submitter: Oleksii Zamiatin <email address hidden>
Branch: openstack-ci/fuel-6.1/2014.2

Commit: 54b4abf911aed71f8cb1d3ecbc8e5a8ac7e21f81
Author: Alex Khivin <email address hidden>
Date: Fri Apr 17 14:27:05 2015

Reply queues are accidentally not found

Closes-Bug: 1415932
Change-Id: Ic4fb12f8f9858c98a173bea46c2454443c96a681
(cherry picked from commit a8602948191ff24d32a50cbf0b03ecfee8407423)
(cherry picked from commit ce9e999eea8fbb04df6c69e1bb138c02f7aa8562)

OSCI Robot (oscirobot) wrote :

Reviewed: https://review.fuel-infra.org/5954
Committed: https://review.fuel-infra.org/gitweb?p=openstack/oslo.messaging.git;a=commitdiff;h=54b4abf911aed71f8cb1d3ecbc8e5a8ac7e21f81
Submitter: Oleksii Zamiatin
Branch: openstack-ci/fuel-6.1/2014.2

commit 54b4abf911aed71f8cb1d3ecbc8e5a8ac7e21f81
Author: Alex Khivin <email address hidden>

Reply queues are accidentally not found

Closes-Bug: 1415932
Change-Id: Ic4fb12f8f9858c98a173bea46c2454443c96a681
(cherry picked from commit a8602948191ff24d32a50cbf0b03ecfee8407423)
(cherry picked from commit ce9e999eea8fbb04df6c69e1bb138c02f7aa8562)

Alexey Khivin (akhivin)
Changed in mos:
status: In Progress → Fix Released
status: Fix Released → Fix Committed
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/oslo.messaging (openstack-ci/fuel-6.0.1/2014.2)

Reviewed: https://review.fuel-infra.org/5882
Submitter: Pekelny Ilya <email address hidden>
Branch: openstack-ci/fuel-6.0.1/2014.2

Commit: a5e26a6ae1550dc23fbcc0ba7830f147f0973a2f
Author: Alex Khivin <email address hidden>
Date: Thu Apr 16 17:16:38 2015

Reply queues are accidentally not found

Closes-Bug: 1415932
Change-Id: Ic4fb12f8f9858c98a173bea46c2454443c96a681
(cherry picked from commit a8602948191ff24d32a50cbf0b03ecfee8407423)

OSCI Robot (oscirobot) wrote :

Reviewed: https://review.fuel-infra.org/5882
Committed: https://review.fuel-infra.org/gitweb?p=openstack/oslo.messaging.git;a=commitdiff;h=a5e26a6ae1550dc23fbcc0ba7830f147f0973a2f
Submitter: Pekelny Ilya
Branch: openstack-ci/fuel-6.0.1/2014.2

commit a5e26a6ae1550dc23fbcc0ba7830f147f0973a2f
Author: Alex Khivin <email address hidden>

Reply queues are accidentally not found

Closes-Bug: 1415932
Change-Id: Ic4fb12f8f9858c98a173bea46c2454443c96a681
(cherry picked from commit a8602948191ff24d32a50cbf0b03ecfee8407423)

tags: added: oslo.messaging
Sergey Novikov (snovikov) wrote :

I have reproduced similar behavior on a virtual environment deployed by the Fuel-devops tool (group deploy_ha_one_controller_flat, ISO fuel-6.1-469-2015-05-26_16-19-56.iso).

Nova-conductor and nova-scheduler logs contain the following error message:

Failed to publish message to topic 'reply_27b639bda5c74d3cbae7e130d4ae3ff2'

Guys, please confirm whether this is the same bug. For more details, see the attached logs.

Revision history for this message
Alexey Khivin (akhivin) wrote :

There are many reasons why a queue may disappear.
For example, this message is always seen when RabbitMQ is rebooted.

By the way,
"queue: Basic.consume: (404) NOT_FOUND"
is not the same as
"Failed to publish message to topic 'reply_27b639bda5c74d3cbae7e130d4ae3ff2'"

I will look at your logs.

Revision history for this message
Viktoria Efimova (vefimova) wrote :

Topology:
UBUNTU+VLAN+522
Controllers -3, Computes - 17

Before this issue there was a RabbitMQ cluster crash, and the cluster was restored. At the time of the test there was no RabbitMQ crash. (https://bugs.launchpad.net/mos/+bug/1467113)

Seeing the same behaviour under rally test NovaServers.boot_server_from_volume_and_live_migrate:
/var/log/nova/nova-conductor.log:2015-06-20 12:47:51.010 5678 ERROR nova.conductor.manager [req-c9d1225f-986e-4df4-94e4-8e619d8f8028 None] Migration of instance 71099728-d86b-4894-a4e4-0d5b938550fa to host node-8.domain.tld unexpectedly failed.
/var/log/nova/nova-conductor.log:2015-06-20 12:47:51.015 5678 ERROR oslo.messaging.rpc.dispatcher [req-c9d1225f-986e-4df4-94e4-8e619d8f8028 ] Exception during message handling: Migration error: Timed out waiting for a reply to message ID 6241206348dd49a0af9c968233ceed43.
/var/log/nova/nova-conductor.log:2015-06-20 12:47:51.017 5678 ERROR oslo.messaging._drivers.common [req-c9d1225f-986e-4df4-94e4-8e619d8f8028 ] Returning exception Migration error: Timed out waiting for a reply to message ID 6241206348dd49a0af9c968233ceed43. to caller
/var/log/nova/nova-conductor.log:2015-06-20 12:47:51.017 5678 ERROR oslo.messaging._drivers.common [req-c9d1225f-986e-4df4-94e4-8e619d8f8028 ] ['Traceback (most recent call last):\n', ' File "/usr/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 137, in _dispatch_and_reply\n incoming.message))\n', ' File "/usr/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 180, in _dispatch\n return self._do_dispatch(endpoint, method, ctxt, args)\n', ' File "/usr/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 126, in _do_dispatch\n result = getattr(endpoint, method)(ctxt, **new_args)\n', ' File "/usr/lib/python2.7/dist-packages/oslo/messaging/rpc/server.py", line 139, in inner\n return func(*args, **kwargs)\n', ' File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 483, in migrate_server\n block_migration, disk_over_commit)\n', ' File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 592, in _live_migrate\n raise exception.MigrationError(reason=ex)\n', 'MigrationError: Migration error: Timed out waiting for a reply to message ID 6241206348dd49a0af9c968233ceed43.\n']

Test failed because of http://paste.openstack.org/show/307662/

Another rally test, NovaServers.boot_server_from_volume_and_delete, failed soon after, also due to a missing queue:
http://paste.openstack.org/show/307824/

Snapshot at http://mos-scale-share.mirantis.com/fuel-snapshot-2015-06-20_14-13-41.tar.xz

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/oslo.messaging (openstack-ci/fuel-7.0/2015.1.0)

Fix proposed to branch: openstack-ci/fuel-7.0/2015.1.0
Change author: Alex Khivin <email address hidden>
Review: https://review.fuel-infra.org/8283

Dina Belova (dbelova)
tags: added: scale
Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/oslo.messaging (openstack-ci/fuel-5.1-updates/2014.1.1)

Fix proposed to branch: openstack-ci/fuel-5.1-updates/2014.1.1
Change author: Alex Khivin <email address hidden>
Review: https://review.fuel-infra.org/8886

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Change abandoned on openstack/oslo.messaging (openstack-ci/fuel-7.0/2015.1.0)

Change abandoned by Alex Khivin <email address hidden> on branch: openstack-ci/fuel-7.0/2015.1.0
Review: https://review.fuel-infra.org/8283

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/oslo.messaging (openstack-ci/fuel-5.1-updates/2014.1.1)

Reviewed: https://review.fuel-infra.org/8886
Submitter: Alex Khivin <email address hidden>
Branch: openstack-ci/fuel-5.1-updates/2014.1.1

Commit: 31b28dab3ef6f958c58855576660be459e7ffb3d
Author: Alex Khivin <email address hidden>
Date: Tue Jun 30 14:41:13 2015

Reply queues are accidentally not found

Closes-Bug: 1415932
Change-Id: Ic4fb12f8f9858c98a173bea46c2454443c96a681
(cherry picked from commit a8602948191ff24d32a50cbf0b03ecfee8407423)

Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

We need to verify the fix for MOS 6.1, try to reproduce the issue on MOS 7.0, and then update the status from Incomplete.

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/oslo.messaging (openstack-ci/fuel-7.0/2015.1.0)

Fix proposed to branch: openstack-ci/fuel-7.0/2015.1.0
Change author: Alexey Khivin <email address hidden>
Review: https://review.fuel-infra.org/10420

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/oslo.messaging (openstack-ci/fuel-7.0/2015.1.0)

Reviewed: https://review.fuel-infra.org/10420
Submitter: Oleksii Zamiatin <email address hidden>
Branch: openstack-ci/fuel-7.0/2015.1.0

Commit: 750fcc8f96761b512226d696271c9d67db56240b
Author: Alexey Khivin <email address hidden>
Date: Thu Aug 13 17:55:01 2015

Improve "Queue not found" exception handling

If a queue disappears during the reconnection process, a "Queue not
found" exception may break consumption from other queues, and the RPC
subsystem may get stuck.

Added a new method _try_consume() to consume from a queue. If the queue is
not found, this method reconnects the queue (which is supposed to re-create
the lost queue) and tries to consume from it one more time.

Also fixed a strange test in test_utils.py

Change-Id: I41ffe7aacbae1ac176e0063a20cdd256cef69127
Closes-bug: #1465757
Closes-bug: #1463802
Closes-bug: #1415932
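
The retry described in the commit message above can be sketched roughly as follows. This is only an illustration, not the merged oslo.messaging code: FakeBroker, QueueNotFound, try_consume() and consume_all() are hypothetical names, and the broker is an in-memory stand-in rather than a real RabbitMQ client.

```python
class QueueNotFound(Exception):
    """Stands in for the broker's 404 NOT_FOUND reply to Basic.consume."""


class FakeBroker:
    """In-memory stand-in for a RabbitMQ vhost (hypothetical, not a real client)."""

    def __init__(self):
        self.queues = set()

    def declare(self, name):
        # Declaring is idempotent: it creates the queue if it is missing.
        self.queues.add(name)

    def consume(self, name):
        if name not in self.queues:
            raise QueueNotFound("no queue '%s' in vhost '/'" % name)
        return "consuming %s" % name


def try_consume(broker, name):
    """Consume from a queue, re-declaring it once if it has vanished."""
    try:
        return broker.consume(name)
    except QueueNotFound:
        # The queue was lost (for example during a reconnect): re-create
        # it and retry once. A second failure propagates to the caller.
        broker.declare(name)
        return broker.consume(name)


def consume_all(broker, names):
    """Start consuming on every queue so one lost queue does not block the rest."""
    return [try_consume(broker, name) for name in names]
```

With this shape, a missing reply queue is re-created on the first failed consume, and the remaining queues are still consumed instead of the whole loop aborting on the first 404.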

Revision history for this message
Dmitry Mescheryakov (dmitrymex) wrote :

We did see occurrences of the issue on 7.0, hence I ported the fix from 6.1. The same issue may also be present in 8.0, where it will be tracked in bug https://bugs.launchpad.net/mos/+bug/1463802

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Change abandoned on openstack/oslo.messaging (openstack-ci/fuel-5.1.2/2014.1.1)

Change abandoned by Alexey Khivin <email address hidden> on branch: openstack-ci/fuel-5.1.2/2014.1.1
Review: https://review.fuel-infra.org/5868
Reason: The 5.1.2 release is cancelled, so there is no point in committing to a dead branch.

Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

The change actually made it into the 6.1 release. Resetting the milestone to 6.1 and status to Fix Released for the 6.1.x series.

tags: added: on-verification
Revision history for this message
Nikita Marchenko (nmarchenko) wrote :

verified on 224 ISO

tags: removed: on-verification
Roman Rufanov (rrufanov)
tags: added: support
Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/oslo.messaging (openstack-ci/fuel-8.0/liberty)

Fix proposed to branch: openstack-ci/fuel-8.0/liberty
Change author: Alexey Khivin <email address hidden>
Review: https://review.fuel-infra.org/13463

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Change abandoned on openstack/oslo.messaging (openstack-ci/fuel-8.0/liberty)

Change abandoned by Dmitry Mescheryakov <email address hidden> on branch: openstack-ci/fuel-8.0/liberty
Review: https://review.fuel-infra.org/13463
Reason: It seems like the bug, fixed by this CR, is also fixed in upstream with a different approach: https://review.openstack.org/#/c/195688/2

The upstream fix is present in 8.0 and so we are not going to merge current change. We will reconsider it only if the issue reoccurs.
