Race condition: compute intermittently corrupts base images on download from glance

Bug #1350766 reported by Michael Steffens
Affects                        Status        Importance   Assigned to    Milestone
OpenStack Compute (nova)       In Progress   High         Tony Breeds
OpenStack Security Advisory    Won't Fix     Undecided    Unassigned

Bug Description

Under certain conditions, which I happen to meet often on my Icehouse single-node setup, uploaded images or snapshots fail to boot. See also https://ask.openstack.org/en/question/42804/icehouse-how-to-boot-a-snapshot-from-a-running-instance/

Reason: When a QCOW2 image is first instantiated, it is

(1) downloaded as QCOW2 to /var/lib/nova/instances/_base/IMAGEID.part
(2) converted to a RAW base image /var/lib/nova/instances/_base/IMAGEID.converted using qemu-img

Step (1) is performed in nova/image/glance.py, GlanceImageService.download, using buffered IO, which does not guarantee that the resulting data has been written to disk when the file is closed. Consequently, the source image file may not be completely written when the qemu-img sub-process starts reading it in step (2). Whether the result is good or bad depends on the speed of the download, the file size, and how quickly qemu-img can digest its input.

Proposed fix: enforce an fsync on the output file object before returning from download. Patch attached.
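
For illustration, here is a minimal sketch of what the fix amounts to (the function and variable names are illustrative, not the exact code in the attached patch):

    import os

    def write_image(image_chunks, dst):
        # dst is the open file object for the target, e.g.
        # /var/lib/nova/instances/_base/IMAGEID.part
        for chunk in image_chunks:
            dst.write(chunk)
        # The proposed fix: flush Python's userspace buffer, then fsync the
        # file descriptor, so qemu-img never reads a partially written file.
        dst.flush()
        os.fsync(dst.fileno())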

Security considerations:

 * Due to the race involving resources shared between users and tenants (compute node network and filesystem IO), a failure can be triggered across tenants, implying a risk of DoS.

 * To make things worse -- with the default setting of not cleaning the image cache -- any corrupted image will remain in the cache until it is replaced by a fresh upload under a new image ID. Affected snapshots remain unusable forever, until exported and re-imported manually under better conditions.

 * Base image corruptions here are not detected and cannot be caught. Theoretically (a bit esoteric, quite unlikely, but not impossible), an attacker might modulate resource usage to precisely craft an incompletely written image that boots and runs, but has its access control information stripped.

Revision history for this message
Michael Steffens (michael-steffens-b) wrote :
summary: - Race condition: compute intermittanty corrupts base images on download
+ Race condition: compute intermittently corrupts base images on download
from glance
description: updated
tags: added: compute
tags: added: libvirt security
description: updated
information type: Public → Public Security
Revision history for this message
Tristan Cacqueray (tristan-cacqueray) wrote :

Thanks for the report. The OSSA task is set to incomplete, pending additional security details from nova-coresec.

What is the likelihood of triggering this race in production?

Changed in ossa:
status: New → Incomplete
Revision history for this message
Michael Steffens (michael-steffens-b) wrote :

That is really tough to guess. I don't know of any reason why a production environment would be less susceptible in principle.

During my tests almost all QCOW2 instantiations and launches of snapshots failed until I applied the fsync fix. Notable exceptions:

 * The cirros and ubuntu original images, instantiated right after OpenStack setup with no other load on the system at all.
 * Windows (due to its size! qemu-img can consume the head of the disk file without catching up, while the nova download is still far away writing its tail).

The failures of the others varied, from no boot disk being found at all to failures during boot. Thus, I wouldn't be too surprised if even a certain fraction of running instances in production got clipped unnoticed, simply because the lost chunks are not read that early, or not at all.

How big that fraction is depends on too many boundary conditions.

What's bothering me most with respect to security is the failure's stickiness. Once a base image is broken on a compute node, it takes careful intervention to keep it from being promoted into all subsequent instantiations, including those of other users and tenants.

Revision history for this message
Michael Steffens (michael-steffens-b) wrote :

To answer your question more precisely: the race is always triggered, in production as well. Whether the right runner wins, and whether you notice the damage if not, depends.

Revision history for this message
Tristan Cacqueray (tristan-cacqueray) wrote :

Thank you for the additional details.

There is a clear risk of compute node DoS through base image corruption. This seems to impact releases up to Havana. Can @nova-coresec have a look at this report, please?

Revision history for this message
Thierry Carrez (ttx) wrote :

This feels like a corruption bug. I'm not sure an "attacker" would act differently from a normal user here, so I'm not sure it really qualifies as a vulnerability.

If the user had to do something special to trigger corruption, I would change my mind, but I think most setups do not hit this condition (otherwise this bug would have surfaced earlier) and normal usage triggers the exact same issue as an attack.

It's definitely a bug, though, and it should definitely be fixed. The question is: should we go through the delays of private security bugfixing, or fix it ASAP?

Revision history for this message
Michael Steffens (michael-steffens-b) wrote :

A vulnerability exploited by normal user behavior (such as putting load on a system), and that can be used to cause corruption across different users' instances, concerns me even more than something that needs special actions. On the other hand, I agree that modulating the load in a manner that causes a specific corruption (such as selectively dropping chunks) would require very sophisticated actions.

Nevertheless, yesterday, after a regular Ubuntu nova-compute update reverted my local fix back to the defective behavior, I observed a new variant of corruption: a new snapshot booted fine, but then exposed filesystem errors. After redoing the whole exercise with the same image once the fsync patch was reapplied, everything was fine.

I wouldn't be surprised if such issues already surface in production now and then (less frequently than in my environment, though), but are then blamed on guest OS issues instead. Let me illustrate:

This is how it looks to the end user: take a snapshot, launch it, it fails. Launch the same snapshot again, it fails the same way. Looks like the snapshot itself is defective, doesn't it? The most likely suspicion: the filesystem was in an inconsistent state when the snapshot was taken. So let's do a new snapshot. And indeed, that either works, or fails consistently in a different way than the first.

Who wouldn't conclude that it's the guest OS or the way the snapshot is done (nothing OpenStack could do anything about) that is at fault, rather than the image being corrupted after download from glance, and then cached?

Is there anything I can provide to get this ticket out of the incomplete and unassigned state?

Revision history for this message
Jeremy Stanley (fungi) wrote :

If you want to propose a patch to nova addressing the issue, you can assign the nova bugtask to yourself; otherwise it will need to wait for the nova bug supervisors to triage it and an interested developer to pick it up and start work on a solution.

The vulnerability management team will probably end up removing/invalidating the security advisory task and switching this to a normal public bug, but that really shouldn't affect the nova triage and development work on it.

Revision history for this message
Michael Steffens (michael-steffens-b) wrote :

The patch was proposed at the very beginning; see https://bugs.launchpad.net/nova/+bug/1350766/+attachment/4166489/+files/nova-glance.patch

It copies a snippet of utils code from swift, in order not to introduce a cross-service dependency. The actual fix is then a one-liner: fsync the image file before returning from download.

The modification itself is quite self-contained. Whether the style is acceptable needs review, but it should be low-hanging fruit.
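
For reference, the helper is along these lines (a sketch modeled on the swift-style fsync utility, not the verbatim patch):

    import fcntl
    import os

    def fsync(fd):
        # Force buffered writes for fd down to stable storage. Platforms
        # that provide F_FULLFSYNC (e.g. OS X) get the stronger variant,
        # everything else falls back to plain os.fsync().
        if hasattr(fcntl, 'F_FULLFSYNC'):
            fcntl.fcntl(fd, fcntl.F_FULLFSYNC)
        else:
            os.fsync(fd)

    # The one-liner at the end of GlanceImageService.download is then roughly:
    #     data.flush()
    #     fsync(data.fileno())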

melanie witt (melwitt)
Changed in nova:
importance: Undecided → High
status: New → Triaged
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/119866

Changed in nova:
assignee: nobody → Davanum Srinivas (DIMS) (dims-v)
status: Triaged → In Progress
Revision history for this message
Jeremy Stanley (fungi) wrote :

I've marked the OSSA task as "won't fix" to indicate this issue isn't one for which the project vulnerability management team would publish a coordinated security advisory, as the conditions by which it is triggered do not seem to be under the direct control of a malicious actor, but rather a matter of volume and statistical happenstance. This does definitely still sound like an annoying bug, however, and one which the Nova developers will hopefully address in a timely manner.

Changed in ossa:
status: Incomplete → Won't Fix
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Davanum Srinivas (dims) (<email address hidden>) on branch: master
Review: https://review.openstack.org/119866
Reason: Looks like there are major issues with this approach. need to find another way

Revision history for this message
Michael Steffens (michael-steffens-b) wrote :

Filed a bug report for qemu-img: https://bugs.launchpad.net/qemu/+bug/1368815

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/123957

Changed in nova:
assignee: Davanum Srinivas (DIMS) (dims-v) → Tony Breeds (o-tony)