Compute service state is being reported as ":-)" while libvirtd is not running.

Bug #1011087 reported by David Naori
This bug affects 2 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Undecided
Assigned to: Davanum Srinivas (DIMS)
Milestone: 2013.1

Bug Description

When libvirtd is not running on a compute node, the compute manager still reports the host status as ":-)".

[30 Minutes after stopping libvirtd]

nova-manage service list:
Binary Host Zone Status State Updated_At
nova-cert camel-nova.xyz.com nova enabled :-) 2012-06-10 08:29:40
nova-scheduler camel-nova.xyz.com nova enabled :-) 2012-06-10 08:29:40
nova-volume camel-nova.xyz.com nova enabled :-) 2012-06-10 08:29:40
nova-network camel-nova.xyz.com nova enabled :-) 2012-06-10 08:29:40
nova-compute camel-vdsa.xyz.com nova enabled :-) 2012-06-10 08:29:40
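The ":-)" above comes from the service heartbeat in the database, not from driver health: nova-compute keeps writing its `updated_at` timestamp even though its libvirt connection is broken. A minimal sketch of that liveness check (assuming nova's default `service_down_time` of 60 seconds; `service_state` is an illustrative stand-in, not nova's actual function):

```python
import datetime

SERVICE_DOWN_TIME = 60  # seconds; nova's service_down_time default (assumption)

def service_state(updated_at, now):
    """Mimic the nova-manage liveness check: a service shows ':-)' if its
    heartbeat (updated_at) is recent, 'XXX' otherwise."""
    elapsed = (now - updated_at).total_seconds()
    return ":-)" if elapsed <= SERVICE_DOWN_TIME else "XXX"

now = datetime.datetime(2012, 6, 10, 8, 30, 0)
fresh = now - datetime.timedelta(seconds=20)   # heartbeat still ticking
stale = now - datetime.timedelta(minutes=30)   # heartbeat actually stopped

print(service_state(fresh, now))  # ":-)" even though libvirtd may be down
print(service_state(stale, now))  # "XXX"
```

Because the heartbeat and the hypervisor connection are independent, libvirtd can be down for 30 minutes while the service list stays green.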

Compute.log:
2012-06-10 11:27:25 DEBUG nova.virt.libvirt.connection [-] Connection to libvirt broke from (pid=11618) _test_connection /usr/lib/python2.6/site-packages/nova/virt/libvirt/connection.py:318
2012-06-10 11:27:25 DEBUG nova.virt.libvirt.connection [-] Connecting to libvirt: qemu:///system from (pid=11618) _get_connection /usr/lib/python2.6/site-packages/nova/virt/libvirt/connection.py:297
2012-06-10 11:27:25 ERROR nova.manager [-] Error during ComputeManager.update_available_resource: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
2012-06-10 11:27:25 TRACE nova.manager Traceback (most recent call last):
2012-06-10 11:27:25 TRACE nova.manager File "/usr/lib/python2.6/site-packages/nova/manager.py", line 155, in periodic_tasks
2012-06-10 11:27:25 TRACE nova.manager task(self, context)
2012-06-10 11:27:25 TRACE nova.manager File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 2403, in update_available_resource
2012-06-10 11:27:25 TRACE nova.manager self.driver.update_available_resource(context, self.host)
2012-06-10 11:27:25 TRACE nova.manager File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/connection.py", line 1907, in update_available_resource
2012-06-10 11:27:25 TRACE nova.manager 'vcpus_used': self.get_vcpu_used(),
2012-06-10 11:27:25 TRACE nova.manager File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/connection.py", line 1731, in get_vcpu_used
2012-06-10 11:27:25 TRACE nova.manager for dom_id in self._conn.listDomainsID():
2012-06-10 11:27:25 TRACE nova.manager File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/connection.py", line 304, in _get_connection
2012-06-10 11:27:25 TRACE nova.manager self._connect, self.uri, self.read_only)
2012-06-10 11:27:25 TRACE nova.manager File "/usr/lib/python2.6/site-packages/eventlet/tpool.py", line 147, in proxy_call
2012-06-10 11:27:25 TRACE nova.manager rv = execute(f,*args,**kwargs)
2012-06-10 11:27:25 TRACE nova.manager File "/usr/lib/python2.6/site-packages/eventlet/tpool.py", line 76, in tworker
2012-06-10 11:27:25 TRACE nova.manager rv = meth(*args,**kwargs)
2012-06-10 11:27:25 TRACE nova.manager File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/connection.py", line 343, in _connect
2012-06-10 11:27:25 TRACE nova.manager return libvirt.openAuth(uri, auth, 0)
2012-06-10 11:27:25 TRACE nova.manager File "/usr/lib64/python2.6/site-packages/libvirt.py", line 102, in openAuth
2012-06-10 11:27:25 TRACE nova.manager if ret is None:raise libvirtError('virConnectOpenAuth() failed')
2012-06-10 11:27:25 TRACE nova.manager libvirtError: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
2012-06-10 11:27:25 TRACE nova.manager
2012-06-10 11:27:25 DEBUG nova.manager [-] Running periodic task ComputeManager._poll_rebooting_instances from (pid=11618) periodic_tasks /usr/lib/python2.6/site-packages/nova/manager.py:152
2012-06-10 11:27:25 DEBUG nova.manager [-] Skipping ComputeManager._cleanup_running_deleted_instances, 13 ticks left until next run from (pid=11618) periodic_tasks /usr/lib/python2.6/site-packages/nova/manager.py:147
2012-06-10 11:27:25 DEBUG nova.manager [-] Running periodic task ComputeManager._heal_instance_info_cache from (pid=11618) periodic_tasks /usr/lib/python2.6/site-packages/nova/manager.py:152
2012-06-10 11:27:25 DEBUG nova.rpc.amqp [-] Making asynchronous call on network ... from (pid=11618) multicall /usr/lib/python2.6/site-packages/nova/rpc/amqp.py:321
2012-06-10 11:27:25 DEBUG nova.rpc.amqp [-] MSG_ID is aebead13017b4ec7b38c769861de9c7c from (pid=11618) multicall /usr/lib/python2.6/site-packages/nova/rpc/amqp.py:324
2012-06-10 11:27:25 DEBUG nova.manager [-] Skipping ComputeManager._run_image_cache_manager_pass, 34 ticks left until next run from (pid=11618) periodic_tasks /usr/lib/python2.6/site-packages/nova/manager.py:147
2012-06-10 11:27:25 DEBUG nova.manager [-] Running periodic task ComputeManager._reclaim_queued_deletes from (pid=11618) periodic_tasks /usr/lib/python2.6/site-packages/nova/manager.py:152
2012-06-10 11:27:25 DEBUG nova.compute.manager [-] FLAGS.reclaim_instance_interval <= 0, skipping... from (pid=11618) _reclaim_queued_deletes /usr/lib/python2.6/site-packages/nova/compute/manager.py:2380
2012-06-10 11:27:25 DEBUG nova.manager [-] Running periodic task ComputeManager._report_driver_status from (pid=11618) periodic_tasks /usr/lib/python2.6/site-packages/nova/manager.py:152
2012-06-10 11:27:25 INFO nova.compute.manager [-] Updating host status
2012-06-10 11:27:25 DEBUG nova.virt.libvirt.connection [-] Updating host stats from (pid=11618) update_status /usr/lib/python2.6/site-packages/nova/virt/libvirt/connection.py:2475

Scheduler.log:
2012-06-10 11:27:25 DEBUG nova.rpc.amqp [-] received {u'_context_roles': [u'admin'], u'_context_request_id': u'req-2ad3e69b-8745-42ca-a52b-a7da8782a2c4', u'_context_read_deleted': u'no', u'args': {u'service_name': u'compute', u'host': u'camel-vdsa.xyz.com', u'capabilities': {u'disk_available': 31, u'vcpus_used': 0, u'hypervisor_type': u'QEMU', u'disk_total': 120, u'host_memory_free': 15411, u'vcpus': 8, u'disk_used': 89, u'host_memory_total': 15951, u'hypervisor_version': 12001, u'cpu_info': {u'arch': u'x86_64', u'model': u'Penryn', u'vendor': u'Intel', u'features': [u'dca', u'pdcm', u'xtpr', u'tm2', u'est', u'vmx', u'ds_cpl', u'monitor', u'dtes64', u'pbe', u'tm', u'ht', u'ss', u'acpi', u'ds', u'vme'], u'topology': {u'cores': u'4', u'threads': u'1', u'sockets': u'2'}}}}, u'_context_auth_token': '<SANITIZED>', u'_context_is_admin': True, u'_context_project_id': None, u'_context_timestamp': u'2012-06-10T08:27:25.490477', u'_context_user_id': None, u'method': u'update_service_capabilities', u'_context_remote_address': None} from (pid=8515) _safe_log /usr/lib/python2.6/site-packages/nova/rpc/common.py:160
2012-06-10 11:27:25 DEBUG nova.rpc.amqp [req-2ad3e69b-8745-42ca-a52b-a7da8782a2c4 None None] unpacked context: {'user_id': None, 'roles': [u'admin'], 'timestamp': '2012-06-10T08:27:25.490477', 'auth_token': '<SANITIZED>', 'remote_address': None, 'is_admin': True, 'request_id': u'req-2ad3e69b-8745-42ca-a52b-a7da8782a2c4', 'project_id': None, 'read_deleted': u'no'} from (pid=8515) _safe_log /usr/lib/python2.6/site-packages/nova/rpc/common.py:160
2012-06-10 11:27:25 DEBUG nova.scheduler.host_manager [req-2ad3e69b-8745-42ca-a52b-a7da8782a2c4 None None] Received compute service update from camel-vdsa.qa.lab.tlv.redhat.com. from (pid=8515) update_service_capabilities /usr/lib/python2.6/site-packages/nova/scheduler/host_manager.py:273

Yaguang Tang (heut2008)
Changed in nova:
assignee: nobody → Yaguang Tang (heut2008)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/8439

Changed in nova:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/17299

Changed in nova:
assignee: Yaguang Tang (heut2008) → Davanum Srinivas (dims-v)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/17299
Committed: http://github.com/openstack/nova/commit/0e85ba136e3c1c2f81f5ed7d594020539100939d
Submitter: Jenkins
Branch: master

commit 0e85ba136e3c1c2f81f5ed7d594020539100939d
Author: Davanum Srinivas <email address hidden>
Date: Sat Dec 1 22:53:34 2012 -0500

    Add notifications when libvirtd goes down

    During driver status update from periodic tasks, if we find that
    libvirtd is down, we should send notifications and log an error.

    Fixes LP #1011087

    Change-Id: I447f24e4ac719d2d550810509f72be6f270ce326

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → grizzly-2
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: grizzly-2 → 2013.1