tpool implementation in sqlalchemy is wrong

Bug #1128605 reported by Chris Behrens
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Medium
Chris Behrens
oslo-incubator
Fix Released
Undecided
Chris Behrens
Grizzly
Fix Released
Undecided
Chris Behrens

Bug Description

The eventlet tpool implementation wraps every SQL query individually such that deadlock can occur. When using transactions (and autocommit is False anyway, such that we're always using transactions), all SQL queries must submit a 'COMMIT' query to end the transaction. It's possible that there are no threads available to submit the 'COMMIT' query because they are all blocking in SQL queries that grab a lock.

Ie, imagine the thread pool size is 20. Imagine that you have 20 queries in progress that grab the same lock within mysql... and all are in a state where none of them have done a COMMIT yet. You now have 20 queries stuck waiting for the lock timeout.

Eventually you'll get a 'Lock wait timeout' error from mysql.

When we implement tpool, we'll need to wrap each DB API call as a whole.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo-incubator (master)

Fix proposed to branch: master
Review: https://review.openstack.org/22158

Changed in oslo:
assignee: nobody → Chris Behrens (cbehrens)
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/22163

Changed in nova:
assignee: nobody → Chris Behrens (cbehrens)
status: New → In Progress
Chris Behrens (cbehrens)
Changed in nova:
milestone: none → grizzly-3
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo-incubator (master)

Reviewed: https://review.openstack.org/22158
Committed: http://github.com/openstack/oslo-incubator/commit/02c12aade7a0c28c66cb45b54786c90c0ae8fb09
Submitter: Jenkins
Branch: master

commit 02c12aade7a0c28c66cb45b54786c90c0ae8fb09
Author: Chris Behrens <email address hidden>
Date: Mon Feb 18 02:35:34 2013 +0000

    Move DB thread pooling to DB API loader

    Fixes bug 1128605

    The dbpool code in sqlalchemy session is the wrong place to implement
    thread pooling as it wraps each individual SQL call to run in its own
    thread. When combined with SQL server locking, all threads can be eaten
    waiting on locks with none available to run a 'COMMIT'.

    The correct place to do thread pooling is around each DB API call.

    This patch removes dbpool from sqlalchemy and creates a common DB API
    loader for all openstack projects which implements the following
    configuration options:

    db_backend: Full path to DB API backend module (or a known short name if
                a project chooses to implement a mapping)
    dbapi_use_tpool: True or False whether to use thread pooling around all
                     DB API calls.

    DB backend modules must implement a 'get_backend()' method.

    Example usage for nova/db/api.py would be:

    """
    from nova.openstack.common.db import api as db_api

    _KNOWN_BACKENDS = {'sqlalchemy': 'nova.db.sqlalchemy.api'}

    IMPL = db_api.DBAPI(backend_mapping=_KNOWN_BACKENDS)
    """

    NOTE: Enabling thread pooling will be broken until this issue is
    resolved in eventlet _OR_ until we modify our eventlet.monkey_patch()
    calls to include 'thread=False':

    https://bitbucket.org/eventlet/eventlet/issue/137/

    Change-Id: Idf14563ea07cf8ccf2a77b3f53659d8528927fc7

Changed in oslo:
status: In Progress → Fix Committed
Changed in nova:
importance: Undecided → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/22163
Committed: http://github.com/openstack/nova/commit/de09aa254f647dc0dab0cd67c0ed26e093df180a
Submitter: Jenkins
Branch: master

commit de09aa254f647dc0dab0cd67c0ed26e093df180a
Author: Chris Behrens <email address hidden>
Date: Mon Feb 18 04:00:38 2013 +0000

    Move DB thread pooling to DB API.

    The eventlet db_pool wrapping done in sqlalchemy is wrong. It needs to
    be around the whole DB API call.

    Fixes bug 1128605

    Syncs nova with the fix in oslo. Moves nova to use the oslo version of
    db/api.py to import the DB API implementation.

    Renames configuration option sql_dbpool_enable to dbapi_use_tpool

    NOTE: tpool will not work without a fix to eventlet for Threads + locking.

    DocImpact

    Change-Id: I6c75b6138d38d12261d133f2cb2f5e59c667f837

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in oslo:
milestone: none → grizzly-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: grizzly-3 → 2013.1
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.