Removing instance backing file causes nova to crash

Bug #955788 reported by Anthony Young
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
OpenStack Compute (nova) | Fix Released | Low | Michael Still |

Bug Description

If you delete the backing file of a running instance, nova-compute crashes on startup. The crash occurs whether or not the running instance is still in the nova database.

Steps to reproduce:

> (launch instance)
> rm /opt/stack/nova/instances/instance-00000001/disk
> Optional: mysql nova: delete from instance_info_caches; delete from instances;
> restart nova-compute

Expected:

nova-compute starts

Actual:

Stderr: "qemu-img: Could not open '/opt/stack/nova/instances/instance-00000004/disk': No such file or directory\n"
(nova): TRACE: Traceback (most recent call last):
(nova): TRACE: File "/opt/stack/nova/bin/nova-compute", line 49, in <module>
(nova): TRACE: service.wait()
(nova): TRACE: File "/opt/stack/nova/nova/service.py", line 413, in wait
(nova): TRACE: _launcher.wait()
(nova): TRACE: File "/opt/stack/nova/nova/service.py", line 131, in wait
(nova): TRACE: service.wait()
(nova): TRACE: File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 166, in wait
(nova): TRACE: return self._exit_event.wait()
(nova): TRACE: File "/usr/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
(nova): TRACE: return hubs.get_hub().switch()
(nova): TRACE: File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 177, in switch
(nova): TRACE: return self.greenlet.switch()
(nova): TRACE: File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 192, in main
(nova): TRACE: result = function(*args, **kwargs)
(nova): TRACE: File "/opt/stack/nova/nova/service.py", line 101, in run_server
(nova): TRACE: server.start()
(nova): TRACE: File "/opt/stack/nova/nova/service.py", line 174, in start
(nova): TRACE: self.manager.update_available_resource(ctxt)
(nova): TRACE: File "/opt/stack/nova/nova/compute/manager.py", line 2393, in update_available_resource
(nova): TRACE: self.driver.update_available_resource(context, self.host)
(nova): TRACE: File "/opt/stack/nova/nova/virt/libvirt/connection.py", line 1807, in update_available_resource
(nova): TRACE: 'disk_available_least': self.get_disk_available_least()}
(nova): TRACE: File "/opt/stack/nova/nova/virt/libvirt/connection.py", line 2158, in get_disk_available_least
(nova): TRACE: disk_infos = utils.loads(self.get_instance_disk_info(i_name))
(nova): TRACE: File "/opt/stack/nova/nova/virt/libvirt/connection.py", line 2117, in get_instance_disk_info
(nova): TRACE: out, err = utils.execute('qemu-img', 'info', path)
(nova): TRACE: File "/opt/stack/nova/nova/utils.py", line 240, in execute
(nova): TRACE: cmd=' '.join(cmd))
(nova): TRACE: ProcessExecutionError: Unexpected error while running command.
(nova): TRACE: Command: qemu-img info /opt/stack/nova/instances/instance-00000004/disk
(nova): TRACE: Exit code: 1
(nova): TRACE: Stdout: ''

This popped up because instance cleanup is broken for me on devstack. Every time I run ./stack.sh, virsh destroy fails but the backing files are still removed, so the system is left in a state where I can't launch anything.

Changed in nova:
assignee: nobody → Nagaraju-Bingi (nagaraju-bingi)
status: New → In Progress
Lloyd Dewolf (lloydde) wrote :

Hi Nagaraju-Bingi, Is this still "in progress"?

Thierry Carrez (ttx) wrote :

Doesn't look like you're still working on that. If you are, please set it back to In Progress and reassign it to yourself.

Changed in nova:
assignee: Nagaraju-Bingi (nagaraju-bingi) → nobody
status: In Progress → Confirmed
Thierry Carrez (ttx)
Changed in nova:
importance: Undecided → Low
Michael Still (mikal) wrote :

So, qemu-img doesn't seem to be called in this code path any more, but I'm going to add a safety check anyway.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/18608

Changed in nova:
assignee: nobody → Michael Still (mikalstill)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/18608
Committed: http://github.com/openstack/nova/commit/1b7cea76abde83e9f937e33b56d54fa885f2a0b9
Submitter: Jenkins
Branch: master

commit 1b7cea76abde83e9f937e33b56d54fa885f2a0b9
Author: Michael Still <email address hidden>
Date: Mon Dec 24 09:51:19 2012 +1100

    Verify the disk file exists before running qemu-img on it.

    Should resolve bug 955788, although it is a little hard to tell
    because the bug is so old.

    Change-Id: Ic0c47f4b6181f56a98cf58d4ebe2cc926d06d524
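
The diff itself is not reproduced in this report. As a rough sketch only, a guard of the kind the commit message describes (check that the disk file exists before handing it to qemu-img) might look like the following. `DiskNotFound`, `qemu_img_info` and `collect_disk_info` are hypothetical names for illustration, not nova's actual API; `os.path.exists` and `subprocess.run` are the only real calls relied on.

```python
import os
import subprocess


class DiskNotFound(Exception):
    """Hypothetical error raised when an instance disk file is missing."""


def qemu_img_info(path, runner=subprocess.run):
    """Return the output of `qemu-img info <path>`.

    The existence check runs first, so a deleted disk file produces a
    specific, catchable error instead of a failed qemu-img invocation.
    """
    if not os.path.exists(path):
        raise DiskNotFound(path)
    result = runner(
        ["qemu-img", "info", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def collect_disk_info(paths, runner=subprocess.run):
    """Gather qemu-img output per disk, skipping disks that no longer exist.

    Assumption: skipping a missing disk is acceptable for a resource
    report. The real fix may handle the missing file differently.
    """
    infos = {}
    for path in paths:
        try:
            infos[path] = qemu_img_info(path, runner=runner)
        except DiskNotFound:
            # One missing file should not abort the whole startup scan.
            continue
    return infos
```

With a guard like this, a single missing disk (as in the traceback above) would be skipped during the resource scan rather than propagating out of service startup. The `runner` parameter exists only so the sketch can be exercised without qemu-img installed.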

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → grizzly-2
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: grizzly-2 → 2013.1