OpenStack Compute (Nova)

Race condition in LibVirtConnection.get_disk_available_least

Reported by David Kranz on 2012-03-29
This bug affects 1 person
Affects: OpenStack Compute (nova)
Importance: High
Assigned to: Vish Ishaya

Bug Description

While running a stress test that quickly creates and reboots many instances, I see the following error in nova-compute. I don't know this code, but it seems to ask libvirt for a list of instances and then probe each one in turn. What happens if an instance is no longer available by the time the code reaches it in the list?
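To make the suspected pattern concrete, here is a minimal sketch of a list-then-probe race. It is not nova code: FakeHypervisor, probe_all and the between_list_and_probe hook are hypothetical stand-ins, and the disk info values are made up. It only shows how an instance that disappears between the listing and the lookup produces an InstanceNotFound like the one in the trace.

```python
class InstanceNotFound(Exception):
    """Stand-in for nova's exception.InstanceNotFound."""


class FakeHypervisor:
    """Hypothetical stand-in for the libvirt connection; not nova code."""

    def __init__(self, names):
        self._domains = set(names)

    def list_instances(self):
        # Returns a snapshot of the names known at the time of the call.
        return sorted(self._domains)

    def destroy(self, name):
        # Simulates an instance being torn down (e.g. during a reboot).
        self._domains.discard(name)

    def get_instance_disk_info(self, name):
        if name not in self._domains:
            raise InstanceNotFound("Instance %s could not be found." % name)
        # Made-up values, for illustration only.
        return [{"virt_disk_size": 10, "disk_size": 4}]


def probe_all(hv, between_list_and_probe=None):
    """List instances, then probe each one; the snapshot may be stale."""
    names = hv.list_instances()
    if between_list_and_probe is not None:
        # Stands in for whatever else the host does between the two steps.
        between_list_and_probe()
    return {name: hv.get_instance_disk_info(name) for name in names}


if __name__ == "__main__":
    hv = FakeHypervisor(["instance-00000016", "instance-00000017"])
    try:
        probe_all(hv, lambda: hv.destroy("instance-00000016"))
    except InstanceNotFound as exc:
        print("race hit:", exc)
```

In the sketch the window between listing and probing is forced open by the hook; on a real host it would be opened by concurrent create and reboot requests, which is presumably why the stress test hits it.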

The Tempest stress test run that produced this has not yet been checked in but should be very soon.

8653c4396d476289162d78fdf251f1] check_instance_lock: decorating: |<function reboot_instance at 0x158a1b8>|
2012-03-29 11:02:17 INFO nova.compute.manager [req-e0ac41dc-0790-43e8-97eb-153d8b7cb9a7 8d8b2da6dbef4362a779699cafde336c 5d\
8653c4396d476289162d78fdf251f1] check_instance_lock: arguments: |<nova.compute.manager.ComputeManager object at 0x7fb154a8b\
b50>| |<nova.rpc.amqp.RpcContext object at 0x3500f90>| |481133bb-3487-4ed2-acf3-8b1521f3c7e0|
2012-03-29 11:02:17 DEBUG nova.compute.manager [req-e0ac41dc-0790-43e8-97eb-153d8b7cb9a7 8d8b2da6dbef4362a779699cafde336c 5\
d8653c4396d476289162d78fdf251f1] instance 481133bb-3487-4ed2-acf3-8b1521f3c7e0: getting locked state from (pid=1367) get_lo\
ck /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1596
2012-03-29 11:02:17 INFO nova.compute.manager [req-e0ac41dc-0790-43e8-97eb-153d8b7cb9a7 8d8b2da6dbef4362a779699cafde336c 5d\
8653c4396d476289162d78fdf251f1] check_instance_lock: locked: |False|
2012-03-29 11:02:17 INFO nova.compute.manager [req-e0ac41dc-0790-43e8-97eb-153d8b7cb9a7 8d8b2da6dbef4362a779699cafde336c 5d\
8653c4396d476289162d78fdf251f1] check_instance_lock: admin: |True|
2012-03-29 11:02:17 INFO nova.compute.manager [req-e0ac41dc-0790-43e8-97eb-153d8b7cb9a7 8d8b2da6dbef4362a779699cafde336c 5d\
8653c4396d476289162d78fdf251f1] check_instance_lock: executing: |<function reboot_instance at 0x158a1b8>|
2012-03-29 11:02:17 AUDIT nova.compute.manager [req-e0ac41dc-0790-43e8-97eb-153d8b7cb9a7 8d8b2da6dbef4362a779699cafde336c 5\
d8653c4396d476289162d78fdf251f1] Rebooting instance 481133bb-3487-4ed2-acf3-8b1521f3c7e0
2012-03-29 11:02:17 DEBUG nova.compute.manager [req-e0ac41dc-0790-43e8-97eb-153d8b7cb9a7 8d8b2da6dbef4362a779699cafde336c 5\
d8653c4396d476289162d78fdf251f1] [instance: 481133bb-3487-4ed2-acf3-8b1521f3c7e0] Checking state from (pid=1367) _get_power\
_state /usr/lib/python2.7/dist-packages/nova/compute/manager.py:260
2012-03-29 11:02:18 DEBUG nova.rpc.amqp [req-e0ac41dc-0790-43e8-97eb-153d8b7cb9a7 8d8b2da6dbef4362a779699cafde336c 5d8653c4\
396d476289162d78fdf251f1] Making asynchronous call on network ... from (pid=1367) multicall /usr/lib/python2.7/dist-package\
s/nova/rpc/amqp.py:321
2012-03-29 11:02:18 DEBUG nova.rpc.amqp [req-e0ac41dc-0790-43e8-97eb-153d8b7cb9a7 8d8b2da6dbef4362a779699cafde336c 5d8653c4\
396d476289162d78fdf251f1] MSG_ID is 6456ecf038464194bc5614763f033a2b from (pid=1367) multicall /usr/lib/python2.7/dist-pack\
ages/nova/rpc/amqp.py:324
2012-03-29 11:02:18 DEBUG nova.utils [-] Running cmd (subprocess): qemu-img info /var/lib/nova/instances/instance-00000017/\
disk from (pid=1367) execute /usr/lib/python2.7/dist-packages/nova/utils.py:221
2012-03-29 11:02:18 DEBUG nova.utils [req-bda58051-5080-4674-bd0b-b85fb5aefdb8 8d8b2da6dbef4362a779699cafde336c 5d8653c4396\
d476289162d78fdf251f1] Running cmd (subprocess): sudo nova-rootwrap iptables-restore from (pid=1367) execute /usr/lib/pytho\
n2.7/dist-packages/nova/utils.py:221
2012-03-29 11:02:19 ERROR nova.manager [-] Error during ComputeManager.update_available_resource: Instance instance-0000001\
6 could not be found.
(nova.manager): TRACE: Traceback (most recent call last):
(nova.manager): TRACE: File "/usr/lib/python2.7/dist-packages/nova/manager.py", line 155, in periodic_tasks
(nova.manager): TRACE: task(self, context)
(nova.manager): TRACE: File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2402, in update_available_re\
source
(nova.manager): TRACE: self.driver.update_available_resource(context, self.host)
(nova.manager): TRACE: File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 1931, in update_avai\
lable_resource
(nova.manager): TRACE: 'disk_available_least': self.get_disk_available_least()}
(nova.manager): TRACE: File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 2283, in get_disk_av\
ailable_least
(nova.manager): TRACE: disk_infos = utils.loads(self.get_instance_disk_info(i_name))
(nova.manager): TRACE: File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 2223, in get_instanc\
e_disk_info
(nova.manager): TRACE: virt_dom = self._lookup_by_name(instance_name)
(nova.manager): TRACE: File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 1567, in _lookup_by_\
name
(nova.manager): TRACE: raise exception.InstanceNotFound(instance_id=instance_name)
(nova.manager): TRACE: InstanceNotFound: Instance instance-00000016 could not be found.
(nova.m

Changed in nova:
status: New → Triaged
importance: Undecided → High
milestone: none → essex-rc2
Changed in nova:
assignee: nobody → Vish Ishaya (vishvananda)

Fix proposed to branch: master
Review: https://review.openstack.org/5999

Changed in nova:
status: Triaged → In Progress

Reviewed: https://review.openstack.org/5999
Committed: http://github.com/openstack/nova/commit/e1580f2f99e8900aabdb5a049198adcc5af86229
Submitter: Jenkins
Branch: master

commit e1580f2f99e8900aabdb5a049198adcc5af86229
Author: Vishvananda Ishaya <email address hidden>
Date: Fri Mar 30 10:05:45 2012 -0700

    Handle not found in check for disk availability

     * includes failing test
     * fixes bug 968339

    Change-Id: I92951a9d2f2027464e915608e8aaf205543f3c93
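
For readers who do not want to open the review, the following is a rough sketch of what "handle not found" could look like. It is not the committed patch: the function signature, the callables passed in, the disk info keys and the simplified arithmetic are all assumptions made for illustration. The idea is only that an instance that vanishes after being listed is skipped rather than aborting the whole periodic task.

```python
class InstanceNotFound(Exception):
    """Stand-in for nova's exception.InstanceNotFound."""


def get_disk_available_least(list_instances, get_instance_disk_info,
                             disk_free_gb):
    """Return free disk minus space instances could still grow into.

    Hypothetical, simplified signature; the real method lives on the
    libvirt connection class and takes no such arguments.
    """
    over_committed_gb = 0
    for name in list_instances():
        try:
            disk_infos = get_instance_disk_info(name)
        except InstanceNotFound:
            # The instance went away after it was listed; ignore it.
            continue
        for info in disk_infos:
            # Space a disk image may still claim beyond what it uses now.
            over_committed_gb += info["virt_disk_size"] - info["disk_size"]
    return disk_free_gb - over_committed_gb


if __name__ == "__main__":
    def fake_list():
        return ["instance-00000016", "instance-00000017"]

    def fake_info(name):
        if name == "instance-00000016":
            raise InstanceNotFound(name)
        return [{"virt_disk_size": 10, "disk_size": 4}]

    print(get_disk_available_least(fake_list, fake_info, 100))
```

Skipping the missing instance means the reported figure briefly ignores a disk that is being removed anyway, which seems an acceptable trade for keeping the resource update running.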

Changed in nova:
status: In Progress → Fix Committed

Reviewed: https://review.openstack.org/6016
Committed: http://github.com/openstack/nova/commit/e54ad5a1ce1e02b6842a3197700eeeca33b1ca88
Submitter: Jenkins
Branch: milestone-proposed

commit e54ad5a1ce1e02b6842a3197700eeeca33b1ca88
Author: Vishvananda Ishaya <email address hidden>
Date: Fri Mar 30 10:05:45 2012 -0700

    Handle not found in check for disk availability

     * includes failing test
     * fixes bug 968339

    Change-Id: I92951a9d2f2027464e915608e8aaf205543f3c93

Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2012-04-05
Changed in nova:
milestone: essex-rc2 → 2012.1