nova-compute loses rabbitmq queues if rabbitmq goes down

Bug #1463440 reported by Leontii Istomin
This bug affects 1 person
Affects: Mirantis OpenStack
Status: Confirmed
Importance: Critical
Assigned to: MOS Oslo

Bug Description

RabbitMQ was restarted in connection with bug https://bugs.launchpad.net/fuel/+bug/1463433.
After that, many nova-compute services could not report their status:
http://paste.openstack.org/show/277918/
http://paste.openstack.org/show/277723/

Configuration:
Baremetal, CentOS, IBP, HA, Neutron-vlan, Ceph-all, Nova-debug, Nova-quotas, 6.1-521
Controllers:3 Computes:47

api: '1.0'
astute_sha: 7766818f079881e2dbeedb34e1f67e517ed7d479
auth_required: true
build_id: 2015-06-08_06-13-27
build_number: '521'
feature_groups:
- mirantis
fuel-library_sha: f43c2ae1af3b493ee0e7810eab7bb7b50c986c7d
fuel-ostf_sha: 7c938648a246e0311d05e2372ff43ef1eb2e2761
fuelmain_sha: bcc909ffc5dd5156ba54cae348b6a07c1b607b24
nailgun_sha: 4340d55c19029394cd5610b0e0f56d6cb8cb661b
openstack_version: 2014.2.2-6.1
production: docker
python-fuelclient_sha: 4fc55db0265bbf39c369df398b9dc7d6469ba13b
release: '6.1'

Diagnostic Snapshot: http://mos-scale-share.mirantis.com/fuel-snapshot-2015-06-09_09-55-58.tar.xz

Tags: scale
Changed in mos:
assignee: nobody → MOS Oslo (mos-oslo)
status: New → Confirmed
importance: Undecided → High
Leontii Istomin (listomin) wrote :

[root@node-1 ~]# rabbitmqctl list_queues
Listing queues ...
Error: unable to connect to node 'rabbit@node-1': nodedown

DIAGNOSTICS
===========

attempted to contact: ['rabbit@node-1']

rabbit@node-1:
  * connected to epmd (port 4369) on node-1
  * node rabbit@node-1 up, 'rabbit' application running

current node details:
- node name: 'rabbitmqctl22420@node-1'
- home dir: /var/lib/rabbitmq
- cookie hash: soeIWU2jk2YNseTyDSlsEA==

You have new mail in /var/spool/mail/root
[root@node-1 ~]# rabbitmqctl cluster_status
Cluster status of node 'rabbit@node-1' ...
[{nodes,[{disc,['rabbit@node-1','rabbit@node-44','rabbit@node-49']}]},
 {running_nodes,['rabbit@node-49','rabbit@node-44','rabbit@node-1']},
 {cluster_name,<<"<email address hidden>">>},
 {partitions,[]}]
...done.

Bogdan Dobrelya (bogdando) wrote :

Raised to Critical, as this blocks the scale lab certification for 6.1.

Changed in mos:
milestone: none → 6.1
importance: High → Critical
Bogdan Dobrelya (bogdando) wrote :

This seems to be a duplicate of https://bugs.launchpad.net/fuel/+bug/1463433
Is there any reason to track this as a separate bug?

Davanum Srinivas (DIMS) (dims-v) wrote :

This is a duplicate of the other one. The blocked connections are causing oslo.messaging to time out.
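
A rough sketch of how one might confirm the blocked connections, assuming the listing comes from `rabbitmqctl list_connections name state` (tab-separated, with the state in the last column). The function name and the sample listing are made up for illustration; they are not taken from this environment.

```python
from collections import Counter


def count_connection_states(listing: str) -> Counter:
    """Count connections per state in `rabbitmqctl list_connections name state` output."""
    states = Counter()
    for line in listing.splitlines():
        line = line.strip()
        # Skip blank lines and the banner/footer lines that older rabbitmqctl versions print.
        if not line or line.startswith("Listing connections") or line.startswith("...done"):
            continue
        # Columns are tab-separated; the state is the last column.
        _name, sep, state = line.rpartition("\t")
        if sep:
            states[state] += 1
    return states


# Hypothetical sample listing (addresses and states are invented for the example).
sample = (
    "Listing connections ...\n"
    "192.0.2.10:41234 -> 192.0.2.1:5673\tblocked\n"
    "192.0.2.11:41250 -> 192.0.2.1:5673\trunning\n"
    "192.0.2.12:41301 -> 192.0.2.1:5673\tblocked\n"
    "...done.\n"
)
counts = count_connection_states(sample)
print(counts["blocked"], counts["running"])
```

On a live controller the text would come from running `rabbitmqctl list_connections name state` on a RabbitMQ node; a large share of connections in the `blocked` or `blocking` state would be consistent with the timeouts described above.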
