Activity log for bug #956719

Date Who What changed Old value New value Message
2012-03-16 06:25:24 Anthony Young bug added bug
2012-03-16 06:25:38 Anthony Young summary
Old value: Instance sometimes do not fully terminated, causes crash
New value: Instance sometimes do not fully terminate, causes crash
2012-03-16 06:34:37 Anthony Young description
Old value:

I can reproduce a situation where instances do not fully terminate. In this situation, instance backing files are deleted, but the domain still exists in libvirt. Then, due to this bug: https://bugs.launchpad.net/bugs/955788 nova-compute will crash on startup.

Steps to reproduce:

> (run devstack)
> cd exercises
> # the following script launches 2 instances in quick succession, and then terminates after 10 seconds
> curl https://raw.github.com/gist/2048763/898a7dc2348bf994eb4f4a93c299f1096522a824/gistfile1.txt > test.sh
> chmod 755 test.sh

Repeat the above 4-8 times.

Then:

> sudo virsh list

Expected: No domains

Actual:

$ sudo virsh list
 Id Name State
----------------------------------
 19 instance-00000002 running
 29 instance-0000000c running

Then, nova-compute starts spitting this error:

2012-03-15 23:10:09 ERROR nova.manager [-] Error during ComputeManager.update_available_resource: Unexpected error while running command.
Command: qemu-img info /opt/stack/nova/instances/instance-00000002/disk
Exit code: 1
Stdout: ''
Stderr: "qemu-img: Could not open '/opt/stack/nova/instances/instance-00000002/disk': No such file or directory\n"
(nova.manager): TRACE: Traceback (most recent call last):
(nova.manager): TRACE: File "/opt/stack/nova/nova/manager.py", line 155, in periodic_tasks
(nova.manager): TRACE: task(self, context)
(nova.manager): TRACE: File "/opt/stack/nova/nova/compute/manager.py", line 2386, in update_available_resource
(nova.manager): TRACE: self.driver.update_available_resource(context, self.host)
(nova.manager): TRACE: File "/opt/stack/nova/nova/virt/libvirt/connection.py", line 1805, in update_available_resource
(nova.manager): TRACE: 'disk_available_least': self.get_disk_available_least()}
(nova.manager): TRACE: File "/opt/stack/nova/nova/virt/libvirt/connection.py", line 2156, in get_disk_available_least
(nova.manager): TRACE: disk_infos = utils.loads(self.get_instance_disk_info(i_name))
(nova.manager): TRACE: File "/opt/stack/nova/nova/virt/libvirt/connection.py", line 2115, in get_instance_disk_info
(nova.manager): TRACE: out, err = utils.execute('qemu-img', 'info', path)
(nova.manager): TRACE: File "/opt/stack/nova/nova/utils.py", line 240, in execute
(nova.manager): TRACE: cmd=' '.join(cmd))
(nova.manager): TRACE: ProcessExecutionError: Unexpected error while running command.
(nova.manager): TRACE: Command: qemu-img info /opt/stack/nova/instances/instance-00000002/disk
(nova.manager): TRACE: Exit code: 1
(nova.manager): TRACE: Stdout: ''
(nova.manager): TRACE: Stderr: "qemu-img: Could not open '/opt/stack/nova/instances/instance-00000002/disk': No such file or directory\n"
(nova.manager): TRACE:

And upon next restart, the process crashes

New value:

I can reproduce a situation where instances do not fully terminate. In this situation, instance backing files are deleted, but the domain still exists in libvirt. Then, due to this bug: https://bugs.launchpad.net/bugs/955788 nova-compute will crash on startup.

Steps to reproduce:

> (run devstack)
> cd exercises
> # the following script launches 2 instances in quick succession, and then terminates after 10 seconds
> curl https://raw.github.com/gist/2048763/898a7dc2348bf994eb4f4a93c299f1096522a824/gistfile1.txt > test.sh
> chmod 755 test.sh

And then:

> ./test.sh

Repeat this command 4-8 times.

Then:

> sudo virsh list

Expected: No domains

Actual:

$ sudo virsh list
 Id Name State
----------------------------------
 19 instance-00000002 running
 29 instance-0000000c running

Then, nova-compute starts spitting this error:

2012-03-15 23:10:09 ERROR nova.manager [-] Error during ComputeManager.update_available_resource: Unexpected error while running command.
Command: qemu-img info /opt/stack/nova/instances/instance-00000002/disk
Exit code: 1
Stdout: ''
Stderr: "qemu-img: Could not open '/opt/stack/nova/instances/instance-00000002/disk': No such file or directory\n"
(nova.manager): TRACE: Traceback (most recent call last):
(nova.manager): TRACE: File "/opt/stack/nova/nova/manager.py", line 155, in periodic_tasks
(nova.manager): TRACE: task(self, context)
(nova.manager): TRACE: File "/opt/stack/nova/nova/compute/manager.py", line 2386, in update_available_resource
(nova.manager): TRACE: self.driver.update_available_resource(context, self.host)
(nova.manager): TRACE: File "/opt/stack/nova/nova/virt/libvirt/connection.py", line 1805, in update_available_resource
(nova.manager): TRACE: 'disk_available_least': self.get_disk_available_least()}
(nova.manager): TRACE: File "/opt/stack/nova/nova/virt/libvirt/connection.py", line 2156, in get_disk_available_least
(nova.manager): TRACE: disk_infos = utils.loads(self.get_instance_disk_info(i_name))
(nova.manager): TRACE: File "/opt/stack/nova/nova/virt/libvirt/connection.py", line 2115, in get_instance_disk_info
(nova.manager): TRACE: out, err = utils.execute('qemu-img', 'info', path)
(nova.manager): TRACE: File "/opt/stack/nova/nova/utils.py", line 240, in execute
(nova.manager): TRACE: cmd=' '.join(cmd))
(nova.manager): TRACE: ProcessExecutionError: Unexpected error while running command.
(nova.manager): TRACE: Command: qemu-img info /opt/stack/nova/instances/instance-00000002/disk
(nova.manager): TRACE: Exit code: 1
(nova.manager): TRACE: Stdout: ''
(nova.manager): TRACE: Stderr: "qemu-img: Could not open '/opt/stack/nova/instances/instance-00000002/disk': No such file or directory\n"
(nova.manager): TRACE:

And upon next restart, the process crashes
2012-03-16 08:41:52 Vish Ishaya nova: milestone essex-rc1
2012-03-16 08:41:54 Vish Ishaya nova: importance Undecided High
2012-03-16 08:41:56 Vish Ishaya nova: status New Triaged
2012-03-16 16:35:55 Anthony Young nova: assignee Anthony Young (sleepsonthefloor)
2012-03-16 21:56:01 Vish Ishaya nova: status Triaged In Progress
2012-03-17 18:55:54 OpenStack Infra nova: status In Progress Fix Committed
2012-03-20 08:42:39 Thierry Carrez nova: status Fix Committed Fix Released
2012-04-05 10:54:40 Thierry Carrez nova: milestone essex-rc1 2012.1
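
The failure state in the description (a domain still listed by `virsh list` while its backing `disk` file is gone) can be checked for with a small script. The following is only a diagnostic sketch, not the fix that was committed for this bug. The function names `parse_virsh_list` and `find_orphaned_domains` are hypothetical, the instances directory is taken from the paths in the trace, and the sample text mirrors the `virsh list` output quoted in the report.

```python
import os


def parse_virsh_list(output):
    """Return the domain names found in the text output of `virsh list`."""
    names = []
    for line in output.splitlines():
        parts = line.split()
        # Data rows start with a numeric domain id; the header and the
        # dashed separator row do not, so they are skipped.
        if len(parts) >= 3 and parts[0].isdigit():
            names.append(parts[1])
    return names


def find_orphaned_domains(virsh_output, instances_dir, exists=os.path.exists):
    """Return domains whose <instances_dir>/<name>/disk file is missing.

    `exists` is injectable so the check can be exercised without touching
    the real filesystem.
    """
    orphans = []
    for name in parse_virsh_list(virsh_output):
        disk = os.path.join(instances_dir, name, "disk")
        if not exists(disk):
            orphans.append(name)
    return orphans


# Sample text modelled on the output quoted in the bug description.
SAMPLE = """\
 Id Name                 State
----------------------------------
 19 instance-00000002    running
 29 instance-0000000c    running
"""

if __name__ == "__main__":
    # Directory assumed from the paths in the trace (a devstack layout).
    for name in find_orphaned_domains(SAMPLE, "/opt/stack/nova/instances"):
        print("domain without backing disk:", name)
```

On a real host, the sample text would be replaced with the captured output of `sudo virsh list`. Any domain this reports is one for which the `qemu-img info` call in the trace would fail.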