libvirt-bin sometimes hangs
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | New | Undecided | Unassigned |
Bug Description
I have a diablo-stable cluster using KVM that has been running for a long time with multiple users. It works well, but I just saw, for the second time, libvirt-bin hang on a compute node while rebooting a VM. The compute log contains a traceback, which I assume is a failed attempt to connect to libvirt. The VM gets stuck in the REBOOT state. When I restart libvirt-bin, the VM proceeds to ACTIVE and everything seems fine. Here are nova.conf and a log excerpt:
--flagfile=
--use_deprecate
--dhcpbridge_
--dhcpbridge=
--sql_connectio
--s3_host=
--rabbit_
--glance_
--logdir=
--state_
--lock_
--verbose
--ec2_url=http://
--fixed_
--network_size=256
--image_
--bridge_
--flat_
--flat_
--network_
--force_
--public_
--multi_host=1
--osapi_
--quota_
--quota_ram=1000000
--quota_
--iscsi_
2012-02-13 11:59:34,081 INFO nova.compute.
_lock: admin: |True|
2012-02-13 11:59:34,081 INFO nova.compute.
_lock: executing: |<function reboot_instance at 0x2c6ba28>|
2012-02-13 11:59:34,081 AUDIT nova.compute.
tance 174
2012-02-13 11:59:34,127 DEBUG nova.compute.
e of instance-000000ae from (pid=1124) _get_power_state /usr/lib/
2012-02-13 11:59:34,855 DEBUG nova.rpc [97342dd6-
on network ... from (pid=1124) multicall /usr/lib/
2012-02-13 11:59:34,855 DEBUG nova.rpc [97342dd6-
e9eeb2313dfb076ce from (pid=1124) multicall /usr/lib/
2012-02-13 11:59:37,602 DEBUG nova.utils [97342dd6-
phore "iptables" for method "apply"... from (pid=1124) inner /usr/lib/
2012-02-13 11:59:37,602 DEBUG nova.utils [97342dd6-
lock "iptables" for method "apply"... from (pid=1124) inner /usr/lib/
2012-02-13 11:59:37,603 DEBUG nova.utils [97342dd6-
): sudo iptables-save -t filter from (pid=1124) execute /usr/lib/
2012-02-13 11:59:38,459 INFO nova.virt.
2012-02-13 11:59:38,462 DEBUG nova.utils [97342dd6-
): sudo iptables-restore from (pid=1124) execute /usr/lib/
2012-02-13 11:59:39,331 ERROR nova.rpc [5524d009-
handling
(nova.rpc): TRACE: Traceback (most recent call last):
(nova.rpc): TRACE: File "/usr/lib/
(nova.rpc): TRACE: rval = node_func(
(nova.rpc): TRACE: File "/usr/lib/
(nova.rpc): TRACE: return f(*args, **kw)
(nova.rpc): TRACE: File "/usr/lib/
(nova.rpc): TRACE: function(self, context, instance_id, *args, **kwargs)
(nova.rpc): TRACE: File "/usr/lib/
(nova.rpc): TRACE: self.driver.
(nova.rpc): TRACE: File "/usr/lib/
(nova.rpc): TRACE: return f(*args, **kw)
(nova.rpc): TRACE: File "/usr/lib/
(nova.rpc): TRACE: virt_dom = self._conn.
(nova.rpc): TRACE: File "/usr/lib/
(nova.rpc): TRACE: if ret is None:raise libvirtError(
(nova.rpc): TRACE: libvirtError: Domain not found: no domain with matching name 'instance-000000ae'
(nova.rpc): TRACE:
Have you checked whether libvirt is working properly for the nova user?
What is the output of "virsh uri" when you run it as the nova user? And the output of "virsh list"?
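A minimal sketch of that check, assuming it is run on the compute node by a user who may run `sudo -n -u nova` without a password prompt. The helper name `run_with_timeout` and the 10-second timeout are illustrative choices, not part of nova or libvirt. The timeout is there because a hung libvirt daemon could otherwise block the `virsh` call indefinitely, which is the symptom described in this report.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout=10):
    """Run cmd and return (status, output).

    status is one of: 'ok', 'failed', 'hung', 'missing'.
    This is a hypothetical helper for illustration only.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,   # fold error messages into the output
            universal_newlines=True,    # return text rather than bytes
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # The command did not return in time, e.g. because libvirtd is hung.
        return "hung", ""
    except FileNotFoundError:
        # The executable (sudo or virsh) is not installed or not on PATH.
        return "missing", ""
    status = "ok" if proc.returncode == 0 else "failed"
    return status, proc.stdout.strip()


if __name__ == "__main__":
    # The two commands asked about above, executed as the nova user.
    for args in (["virsh", "uri"], ["virsh", "list"]):
        cmd = ["sudo", "-n", "-u", "nova"] + args
        status, output = run_with_timeout(cmd)
        print(" ".join(cmd), "->", status)
        if output:
            print(output)
```

A `hung` status from either command would suggest the daemon itself is unresponsive, whereas `failed` with an error message in the output points more toward a permission or connection-URI problem for the nova user.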