Comment 0 for bug 1350766

Michael Steffens (michael-steffens-b) wrote : Race condition: compute intermittently corrupts base images on download from glance

Under certain conditions, which I hit often on my single-node Icehouse setup, uploaded images or snapshots fail to boot. See also https://ask.openstack.org/en/question/42804/icehouse-how-to-boot-a-snapshot-from-a-running-instance/

Reason: When a QCOW2 image is first instantiated, it is

(1) downloaded as QCOW2 to /var/lib/nova/instances/_base/IMAGEID.part
(2) converted with qemu-img to a RAW-format base image, /var/lib/nova/instances/_base/IMAGEID.converted

Step (1) is performed in GlanceImageService.download (nova/image/glance.py) using buffered I/O, which does not guarantee that the data has been written to disk when the file is closed. Consequently, the source image file may not be completely written when qemu-img starts reading it. Whether the result is good or bad depends on the download speed, the file size, and how fast qemu-img consumes its input.

Proposed fix: enforce fsync on the output file object before returning from download. Patch attached.
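
For readers without the attachment, the idea can be sketched roughly as follows. This is not the attached patch and not the actual nova code: the function name write_chunks_durably and its arguments are made up for illustration, and it only assumes that the download arrives as an iterable of byte chunks. It shows the general pattern of flushing the userspace buffer and then calling fsync before the file is handed to another process.

```python
import os


def write_chunks_durably(chunks, path):
    """Write an iterable of byte chunks to path and force them to disk.

    Hypothetical helper for illustration only; not part of nova.
    """
    with open(path, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
        # Push Python's userspace buffer down to the operating system.
        out.flush()
        # Ask the kernel to commit the file's data to the storage device.
        os.fsync(out.fileno())
    return path
```

Only after such a call returns would the conversion in step (2) be started on the .part file.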