Snapshot creation is racy

Bug #949475 reported by Rick Harris
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: Rick Harris

Bug Description

A race condition exists in snapshot creation: we check the task_state in compute/api but don't set it until compute/manager receives the message.

This can allow two (or more) snapshots to be created at virtually the same time, which, in the case of XenServer, may prevent VHDs from coalescing properly.

Even worse, when the VHD doesn't coalesce, this can produce "bad" images that ultimately corrupt the SR on the machine they are restored to. This can cascade across the cluster, with one bad image corrupting multiple machines.

The two-pronged solution is to (1) prevent "bad" images from being created in the first place and (2) if we detect a bad image, bail on it before corrupting the SR.
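
For illustration, the check-then-set race described above can be reproduced with a minimal sketch. Everything here is hypothetical (the `instance` dict, `message_queue`, `api_snapshot`, and `manager_process_one` are stand-ins, not nova's actual code); it only models an API layer that checks task_state and a manager that sets it later.

```python
# Hypothetical in-memory stand-ins; not nova's real data model or RPC layer.
instance = {"uuid": "inst-1", "task_state": None}
message_queue = []  # stands in for the queue between compute/api and compute/manager


def api_snapshot(instance, name):
    """API side: checks task_state but does NOT set it (the racy pattern)."""
    if instance["task_state"] is not None:
        raise RuntimeError("instance is busy: %s" % instance["task_state"])
    message_queue.append(name)


def manager_process_one(instance, started):
    """Manager side: task_state is only set once the message is received."""
    name = message_queue.pop(0)
    instance["task_state"] = "image_snapshot"
    started.append(name)


started = []
api_snapshot(instance, "snap-a")
api_snapshot(instance, "snap-b")  # also passes the check: task_state is still None
manager_process_one(instance, started)
manager_process_one(instance, started)
print(started)  # both snapshots were started
```

Because neither API call records its intent, the second request cannot see that the first is already in flight.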

Changed in nova:
assignee: nobody → Rick Harris (rconradharris)
importance: Undecided → High
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/5059

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/5059
Committed: http://github.com/openstack/nova/commit/08b4e6c2b808011ea7ae9b367bfb829cb332f4e7
Submitter: Jenkins
Branch: master

commit 08b4e6c2b808011ea7ae9b367bfb829cb332f4e7
Author: Rick Harris <email address hidden>
Date: Thu Mar 8 02:55:04 2012 +0000

    Fix racey snapshots.

    Fixes bug 949475

    Atomically tests and sets the instance task_state before allowing a
    snapshot or backup to be initiated.

    Change-Id: I40671a80f5e75337e176a715837f62d400cc21b6
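
One way to picture an atomic test-and-set of task_state is a conditional UPDATE whose affected-row count tells the caller whether it won. This is only a sketch under assumptions: the `instances` table layout, the `try_begin_snapshot` helper, and the `image_snapshot` state value are hypothetical, and the actual change in the commit above may be implemented differently.

```python
import sqlite3


def try_begin_snapshot(conn, uuid, new_state="image_snapshot"):
    """Set task_state only if it is currently NULL; return True if we won."""
    cur = conn.execute(
        "UPDATE instances SET task_state = ? "
        "WHERE uuid = ? AND task_state IS NULL",
        (new_state, uuid),
    )
    conn.commit()
    # The test and the set happen in one statement, so exactly one caller
    # can ever see rowcount == 1 for an idle instance.
    return cur.rowcount == 1


# Hypothetical schema, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (uuid TEXT PRIMARY KEY, task_state TEXT)")
conn.execute("INSERT INTO instances VALUES ('inst-1', NULL)")
conn.commit()

first = try_begin_snapshot(conn, "inst-1")
second = try_begin_snapshot(conn, "inst-1")
print(first, second)  # True False
```

The second request is rejected at the point where it would previously have passed the check, so only one snapshot proceeds.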

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → essex-rc1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: essex-rc1 → 2012.1