Comment 5 for bug 1439145

Nagabhushana R (bhushana) wrote :

We need to know a few things:

1-> Did RabbitMQ restart at any time?
2-> Was network connectivity from nova-compute to RMQ stable?
3-> Are they running HA?

I see the following in the log:

2015-03-24 18:35:14.320 2972 ERROR nova.openstack.common.periodic_task [-] Error during ComputeManager.update_available_resource: Timed out waiting for a reply to message ID 446d5968d5ff469ea71c84a85d9f2b6d
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task Traceback (most recent call last):
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task File "/usr/lib/python2.7/dist-packages/nova/openstack/common/periodic_task.py", line 182, in run_periodic_tasks
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task task(self, context)
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 5460, in update_available_resource
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task rt.update_available_resource(context)
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task File "/usr/lib/python2.7/dist-packages/nova/openstack/common/lockutils.py", line 249, in inner
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task return f(*args, **kwargs)
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task File "/usr/lib/python2.7/dist-packages/nova/compute/resource_tracker.py", line 315, in update_available_resource
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task context, self.host, self.nodename)
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task File "/usr/lib/python2.7/dist-packages/nova/objects/base.py", line 110, in wrapper
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task args, kwargs)
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 425, in object_class_action
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task objver=objver, args=args, kwargs=kwargs)
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task File "/usr/lib/python2.7/dist-packages/oslo/messaging/rpc/client.py", line 150, in call
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task wait_for_reply=True, timeout=timeout)
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task File "/usr/lib/python2.7/dist-packages/oslo/messaging/transport.py", line 90, in _send
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task timeout=timeout)
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 412, in send
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task return self._send(target, ctxt, message, wait_for_reply, timeout)
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 403, in _send
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task result = self._waiter.wait(msg_id, timeout)
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 280, in wait
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task reply, ending, trylock = self._poll_queue(msg_id, timeout)
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 220, in _poll_queue
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task message = self.waiters.get(msg_id, timeout)
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task File "/usr/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 126, in get
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task 'to message ID %s' % msg_id)
2015-03-24 18:35:14.320 2972 TRACE nova.openstack.common.periodic_task MessagingTimeout: Timed out waiting for a reply to message ID

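This could be one reason nova-compute was marked as not happy: the periodic update_available_resource task timed out waiting for an RPC reply over the message bus.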
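
To line these timeouts up against any RabbitMQ restarts, it may help to pull the timestamps of the timeout errors out of the nova-compute log. The following is only a rough sketch, assuming the log line format shown above; the function name find_rpc_timeouts and the regular expression are made up for illustration and are not part of nova or oslo.messaging.

```python
import re
from datetime import datetime

# Assumed line format, taken from the ERROR line quoted above:
# <date> <time.msec> <pid> ERROR <logger> ... Timed out waiting for a reply to message ID <hex id>
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \d+ ERROR \S+ .*"
    r"Timed out waiting for a reply to message ID (?P<msg_id>[0-9a-f]+)"
)


def find_rpc_timeouts(lines):
    """Return a list of (timestamp, message_id) for each RPC timeout ERROR line.

    TRACE lines are skipped on purpose so each timeout is counted once.
    """
    hits = []
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            hits.append((ts, match.group("msg_id")))
    return hits
```

Passing an open log file (or any iterable of lines) to find_rpc_timeouts gives the times at which replies were lost, which can then be compared with the restart times once the RMQ monitor log is available.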

Could you please ask them to attach /var/log/contrail/ha/rmq-monitor.log? This will help us check whether RMQ was stable.

Thanks,
Sanju

From: Nagabhushana R <email address hidden>
Date: Wednesday, April 1, 2015 11:54 PM
To: Sanju Abraham <email address hidden>
Subject: Fwd: [Bug 1439145] [NEW] nova-compute status is "XXX" in nova-manage service list

Would you know more about this?