Deadlock creating >5 servers at once using xenapi driver

Bug #924918 reported by Johannes Erdfelt
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Undecided
Assigned to: Johannes Erdfelt

Bug Description

A deadlock can occur when a compute node is working on more than 5 servers at once. This only affects systems using the xenapi driver.

The problem is caused by nested use of _get_session(). If multiple greenthreads attempt to communicate with dom0 at once, the outer calls to _get_session() can acquire every available session, leaving the inner calls to _get_session() waiting for one of the outer calls to release a session. Since an outer call cannot finish until its inner call gets a session, none of them ever releases.
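
The failure mode can be reproduced in miniature. The sketch below is not nova code: it uses standard-library threads instead of greenthreads, a hypothetical SessionPool class in place of the driver's session handling, and an assumed pool size of 5 to match the threshold in the title. The inner acquisition has a timeout only so the demo terminates; without it the workers would wait forever.

```python
import threading
from contextlib import contextmanager

POOL_SIZE = 5  # assumption: number of sessions the node may hold at once


class SessionPool:
    """Toy bounded pool; a hypothetical stand-in for the driver's sessions."""

    def __init__(self, size):
        self._slots = threading.BoundedSemaphore(size)

    @contextmanager
    def get_session(self, timeout=None):
        # Block until a slot is free, or give up after `timeout` seconds.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("no free session")
        try:
            yield "session"
        finally:
            self._slots.release()


def run_nested_demo(pool_size=POOL_SIZE, inner_timeout=0.2):
    pool = SessionPool(pool_size)
    barrier = threading.Barrier(pool_size)
    outcomes = []

    def worker():
        with pool.get_session():          # outer acquisition
            barrier.wait()                # every slot is now held by a worker
            try:
                with pool.get_session(timeout=inner_timeout):  # nested acquisition
                    outcome = "ok"
            except TimeoutError:
                outcome = "stuck"
            barrier.wait()                # keep the outer slot until all have tried
        outcomes.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(pool_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcomes


if __name__ == "__main__":
    print(run_nested_demo())
```

Every worker reports "stuck": each one holds a slot and waits for a second one that only another waiting worker could free.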

Changed in nova:
assignee: nobody → Johannes Erdfelt (johannes.erdfelt)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/3631

Changed in nova:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/3631
Committed: http://github.com/openstack/nova/commit/093e4d38d511c7bb8d48fceebbfd8e350b533075
Submitter: Jenkins
Branch: master

commit 093e4d38d511c7bb8d48fceebbfd8e350b533075
Author: Johannes Erdfelt <email address hidden>
Date: Wed Feb 1 17:07:16 2012 +0000

    Make sure multiple calls to _get_session() aren't nested

    Fixes bug 924918

    async_call_plugin() acquires a xenapi session as does the nested call to
    get_xenapi_host(). This can cause a deadlock if multiple greenthreads
    all block waiting for the outer sessions to be freed to allocate the
    inner session. This change moves the call to get_xenapi_host() to outside
    the with statement to ensure calls to _get_session() aren't nested.

    Change-Id: I8f5490f40a9ccaf74a276187f66519a5d5f52b2e
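
The shape of the change described in the commit message can be sketched as follows. This is an illustration under stated assumptions, not the merged patch: the class name, the semaphore-backed pool, the return values, and the pool size are all hypothetical, and standard-library threads stand in for greenthreads. Only the method names and the idea of calling get_xenapi_host() before entering the with statement come from the commit message.

```python
import threading
from contextlib import contextmanager


class XenAPISessionSketch:
    """Hypothetical stand-in for the driver's session wrapper."""

    def __init__(self, pool_size):
        self._slots = threading.BoundedSemaphore(pool_size)

    @contextmanager
    def _get_session(self):
        self._slots.acquire()
        try:
            yield "session"
        finally:
            self._slots.release()

    def get_xenapi_host(self):
        # Acquires a session of its own and returns it before returning.
        with self._get_session():
            return "host-ref"  # placeholder value

    def async_call_plugin(self, plugin, fn, args):
        # Resolve the host first, outside the with statement, so this
        # caller never holds two sessions at the same time.
        host = self.get_xenapi_host()
        with self._get_session() as session:
            return (session, host, plugin, fn, args)  # placeholder result


def run_concurrently(workers=20, pool_size=5):
    api = XenAPISessionSketch(pool_size)
    results = []

    def work(i):
        results.append(api.async_call_plugin("plugin", "fn", {"n": i}))

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)
    return results


if __name__ == "__main__":
    print(len(run_concurrently()))
```

Because each caller holds at most one session at a time, some caller can always finish and release, so more workers than sessions simply queue instead of deadlocking.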

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → essex-4
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: essex-4 → 2012.1