All nova apis relying on Instance.save(expected_*_state) for safety contain a race condition

Bug #1297375 reported by Matthew Booth
22
This bug affects 3 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
High
Matthew Booth

Bug Description

Take, for example, resize_instance(). In manager.py, we assert that the instance is in RESIZE_PREP state with:

  instance.save(expected_task_state=task_states.RESIZE_PREP)

This should mean that the first resize will succeed, and any subsequent will fail. However, the underlying db implementation does not lock the instance during the update, and therefore doesn't guarantee this.

Specifically, _instance_update() in db/sqlalchemy/apy.py starts a session, and reads task_state from the instance. However, it does not use a 'select ... for update', meaning the row is not locked. 2 concurrent calls to this method can both read the same state, then race to the update. The last writer will win. Without 'select ... for update', the db transaction is only ensuring that all writes are atomic, not reads with dependent writes.

SQLAlchemy seems to support select ... for update, as do MySQL and PostgreSQL, although MySQL will fall back to whole table locks for non-InnoDB tables, which would likely be a significant performance hit.

Tags: compute db
Tracy Jones (tjones-i)
tags: added: db
tags: added: compute
melanie witt (melwitt)
Changed in nova:
importance: Undecided → High
status: New → Confirmed
Revision history for this message
jichenjc (jichenjc) wrote :

https://bugs.launchpad.net/nova/+bug/1268569

looks like this was also caused by multiple API call concurrently

Revision history for this message
jichenjc (jichenjc) wrote :

looks to me we have several alternatives
1) use row level lock when we try to do instance status update
2) use table level lock and accept the possible performance impact
3) ignore the issue since we did resize the instance, only 1st caller get error notification

Revision history for this message
Jay Pipes (jaypipes) wrote :

While a SELECT FOR UPDATE would solve this particular race condition, it would introduce a write-intent lock at a potentially very contentious place in the DB API. I'm gonna take this bug on and implement a compare-and-swap loop to issue a lock-free update that does not have a race condition.

Changed in nova:
assignee: nobody → Jay Pipes (jaypipes)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/109837

Changed in nova:
status: Confirmed → In Progress
Changed in nova:
assignee: Jay Pipes (jaypipes) → Matthew Booth (mbooth-9)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/141115
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5b4083f8cbbcca18ea5896f06b371e29b244d0e2
Submitter: Jenkins
Branch: master

commit 5b4083f8cbbcca18ea5896f06b371e29b244d0e2
Author: Matthew Booth <email address hidden>
Date: Thu Dec 11 17:52:08 2014 +0000

    Implement compare-and-swap for instance update

    This patch reworks nova.db.sqlalchemy.api._instance_update to remove a
    race condition that exists in a critical section of code that
    retrieves the instance record from the database, does some checks of
    both the new values and existing values of instance fields, and then
    tries to update the database with the new values.

    We update the value using update_on_match, which does an optimistic,
    atomic, lock-free, compare-and-swap on the object. If the update
    fails, we do some additional checks to determine the specific error,
    which maintains the previous behaviour. There's an exceptionally
    small chance of a race when checking for an error if the expected
    update conditions were not met when we tried the UPDATE, but were met
    when we fetch the instance for error checking. We handle this by
    raising a new error, InstanceUpdateConflict.

    _instance_update() is simplified by having it return only the updated
    instance_ref. instance_update_and_get_original() is updated to fetch
    the old value itself before calling _instance_update().

    Closes-bug: #1297375
    Change-Id: I9cd0f4b620e639b238555983bf6d58deafbaefeb

Changed in nova:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/202593

Thierry Carrez (ttx)
Changed in nova:
milestone: none → liberty-2
status: Fix Committed → Fix Released
Revision history for this message
John Garbutt (johngarbutt) wrote :

This was reverted, but now has a patch open for review again

Changed in nova:
milestone: liberty-2 → none
status: Fix Released → In Progress
Revision history for this message
John Garbutt (johngarbutt) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/202593
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=2a875644ccb4071758234a2a9837fdb6b4ad915e
Submitter: Jenkins
Branch: master

commit 2a875644ccb4071758234a2a9837fdb6b4ad915e
Author: Matthew Booth <email address hidden>
Date: Thu Dec 11 17:52:08 2014 +0000

    Implement compare-and-swap for instance update

    This patch reworks nova.db.sqlalchemy.api._instance_update to remove a
    race condition that exists in a critical section of code that
    retrieves the instance record from the database, does some checks of
    both the new values and existing values of instance fields, and then
    tries to update the database with the new values.

    We update the value using update_on_match, which does an optimistic,
    atomic, lock-free, compare-and-swap on the object. If the update
    fails, we do some additional checks to determine the specific error,
    which maintains the previous behaviour.

    _instance_update() is simplified by having it return only the updated
    instance_ref. instance_update_and_get_original() is updated to fetch
    the old value itself before calling _instance_update().

    This patch was originally submitted as change
    I9cd0f4b620e639b238555983bf6d58deafbaefeb, but that was reverted. The
    original version was optimistic about not racing in
    instance_update_and_get_original between fetching an instance and
    updating it, in which case it raised a different exception to the
    previous behaviour. In practice we were hit that race frequently in
    the gate, and the unexpectedly different exception was causing gate
    failures. This new version adds a retry in this case, and raises the
    expected exception. The frequency of hitting that race highlights the
    importance of this patch, as it is otherwise unhandled.

    This new version also updates the exception hierarchy so that all
    exceptions raised by _instance_update inherit from a single exception:
    InstanceUpdateConflict. UnexpectedVMStateError is removed, as it had no
    users. An InstanceUpdateConflict is now raised instead.

    The new code trivially allows the caller to check any property when
    updating an instance, not just the two prescribed. This new version
    enables this by allowing the 'expected' dict to be passed in, as well
    as maintaining the behaviour of pulling expected values from the
    update dict. This has numerous potential applications, for example
    atomically changing the host of an instance during evacuation.

    The new version adds numerous additional tests.

    Closes-bug: #1297375
    Change-Id: I293da6f320dd8f3474ce2a9c907298e1fb348181

Changed in nova:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Michael Still (<email address hidden>) on branch: master
Review: https://review.openstack.org/109837
Reason: Matt's patch merged, so I am going to abandon this one.

Thierry Carrez (ttx)
Changed in nova:
milestone: none → liberty-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: liberty-3 → 12.0.0
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.