Activity log for bug #1350766

Date Who What changed Old value New value Message
2014-07-31 09:38:40 Michael Steffens bug added bug
2014-07-31 09:38:40 Michael Steffens attachment added Enforce fsync on output File object before returning from download https://bugs.launchpad.net/bugs/1350766/+attachment/4166489/+files/nova-glance.patch
2014-08-01 09:20:34 Michael Steffens summary Race condition: compute intermittanty corrupts base images on download from glance Race condition: compute intermittently corrupts base images on download from glance
2014-08-01 09:24:00 Michael Steffens description
  Old value:
    Under certain conditions, which I happen to meet often on my Icehouse single node setup, uploaded images or snapshots fail to boot. See also https://ask.openstack.org/en/question/42804/icehouse-how-to-boot-a-snapshot-from-a-running-instance/
    Reason: When first instantiating a QCOW2 image, it's
    (1) downloaded as QCOW2 to /var/lib/nova/instances/_base/IMAGEID.part
    (2) converted to RAW format base /var/lib/nova/instances/_base/IMAGEID.converted using qemu-img
    The step (1) is performed in nova/image/glance.py, GlanceImageService.download using buffered IO, which does not guarantee the resulting data to be written to disk on file close. Consequently, the source image file may not be written completely when qemu-img starts reading. Whether the result is good or bad depends on speed of download, file size, and how fast qemu-image digests its input.
    Proposed fix: enforce fsync on output File object before returning from download. Patch attached.
  New value:
    Under certain conditions, which I happen to meet often on my Icehouse single node setup, uploaded images or snapshots fail to boot. See also https://ask.openstack.org/en/question/42804/icehouse-how-to-boot-a-snapshot-from-a-running-instance/
    Reason: When first instantiating a QCOW2 image, it's
    (1) downloaded as QCOW2 to /var/lib/nova/instances/_base/IMAGEID.part
    (2) converted to RAW format base /var/lib/nova/instances/_base/IMAGEID.converted using qemu-img
    The step (1) is performed in nova/image/glance.py, GlanceImageService.download using buffered IO, which does not guarantee the resulting data to be written to disk on file close. Consequently, the source image file may not be written completely when qemu-img sub-process starts reading in step (2). Whether the result is good or bad depends on speed of download, file size, and how quickly qemu-image can digest its input.
    Proposed fix: enforce fsync on output File object before returning from download. Patch attached.
2014-08-04 06:38:22 Michael Steffens tags compute
2014-08-04 06:40:06 Michael Steffens tags compute compute libvirt security
2014-08-04 11:41:00 Michael Steffens description
  Old value:
    Under certain conditions, which I happen to meet often on my Icehouse single node setup, uploaded images or snapshots fail to boot. See also https://ask.openstack.org/en/question/42804/icehouse-how-to-boot-a-snapshot-from-a-running-instance/
    Reason: When first instantiating a QCOW2 image, it's
    (1) downloaded as QCOW2 to /var/lib/nova/instances/_base/IMAGEID.part
    (2) converted to RAW format base /var/lib/nova/instances/_base/IMAGEID.converted using qemu-img
    The step (1) is performed in nova/image/glance.py, GlanceImageService.download using buffered IO, which does not guarantee the resulting data to be written to disk on file close. Consequently, the source image file may not be written completely when qemu-img sub-process starts reading in step (2). Whether the result is good or bad depends on speed of download, file size, and how quickly qemu-image can digest its input.
    Proposed fix: enforce fsync on output File object before returning from download. Patch attached.
  New value:
    Under certain conditions, which I happen to meet often on my Icehouse single node setup, uploaded images or snapshots fail to boot. See also https://ask.openstack.org/en/question/42804/icehouse-how-to-boot-a-snapshot-from-a-running-instance/
    Reason: When first instantiating a QCOW2 image, it's
    (1) downloaded as QCOW2 to /var/lib/nova/instances/_base/IMAGEID.part
    (2) converted to RAW format base /var/lib/nova/instances/_base/IMAGEID.converted using qemu-img
    The step (1) is performed in nova/image/glance.py, GlanceImageService.download using buffered IO, which does not guarantee the resulting data to be written to disk on file close. Consequently, the source image file may not be written completely when qemu-img sub-process starts reading in step (2). Whether the result is good or bad depends on speed of download, file size, and how quickly qemu-image can digest its input.
    Proposed fix: enforce fsync on output File object before returning from download. Patch attached.
    Security considerations:
    * Due to the race between resources shared between users and tenants (compute node network and filesystem IO) a failure can be triggered across tenants, implying the risk of DoS.
    * To make things worse -- with the default setting of not cleaning the image cache -- any corrupted image will remain in cache until replaced with fresh upload using a new image ID. Affected snapshots remain unusable forever, until ex- and re-imported manually under better conditions.
    * Base image corruptions here are not detected and cannot be caught. Theoretically (a bit esoteric, quite unlikely, but not impossible), an attacker might modulate resource usage to precisely create an incompletely written image, that boots and runs, but has access control information stripped.
2014-08-05 11:04:32 Michael Steffens information type Public Public Security
2014-08-05 14:17:23 Tristan Cacqueray bug task added ossa
2014-08-05 14:17:35 Tristan Cacqueray ossa: status New Incomplete
2014-08-06 11:59:10 Tristan Cacqueray bug added subscriber Nova Core security contacts
2014-09-04 19:56:13 melanie witt nova: importance Undecided High
2014-09-04 19:56:13 melanie witt nova: status New Triaged
2014-09-08 18:36:35 OpenStack Infra nova: status Triaged In Progress
2014-09-08 18:36:35 OpenStack Infra nova: assignee Davanum Srinivas (DIMS) (dims-v)
2014-09-08 23:46:45 Jeremy Stanley ossa: status Incomplete Won't Fix
2014-09-12 15:11:08 Sean Dague marked as duplicate 1368815
2014-09-25 07:18:11 OpenStack Infra nova: assignee Davanum Srinivas (DIMS) (dims-v) Tony Breeds (o-tony)
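
Note on the proposed fix: the description asks to enforce fsync on the output file object before returning from download. A minimal sketch of that idea follows, for illustration only. It is not the attached nova-glance.patch and not the code in nova/image/glance.py; the function name `write_chunks_durably` and its parameters are hypothetical, and it assumes the image arrives as an iterable of byte chunks.

```python
import os


def write_chunks_durably(chunks, path):
    """Write byte chunks to `path` and force them to disk before returning.

    Hypothetical helper, not part of nova; it only illustrates the
    flush-then-fsync pattern named in the bug description.
    """
    with open(path, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
        # Push Python's userspace buffer down to the operating system.
        out.flush()
        # Ask the operating system to commit the file's data to storage
        # before the caller hands the file to another process.
        os.fsync(out.fileno())
    return path
```

A caller would run something like `write_chunks_durably(image_chunks, "/var/lib/nova/instances/_base/IMAGEID.part")` and start the qemu-img conversion only after it returns, where `image_chunks` stands in for whatever iterable the download yields.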