Incorrect exception handling in DB code involving rollbacked transactions.

Bug #1501686 reported by Ann Taraday
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
Fix Released
Medium
Ann Taraday
Kilo
Fix Released
Undecided
Unassigned

Bug Description

I found out that some methods like add_ha_port https://github.com/openstack/neutron/blob/master/neutron/db/l3_hamode_db.py#L312-L328 contain the following logic:

def create():
      create_something()
      try:
            _do_other_thing()
      except Exception:
             with excutils.save_and_reraise_exception():
                     delete_something()

def _do_other_thing():
     with context.session.begin(subtransactions=True):
         ....

The problem is that when exception is raised in _do_other_thing it is caught in except block, but the object cannot be deleted in except section because internal transaction has been rolled back. We have tests on these methods, but they also are not correct (for example https://github.com/openstack/neutron/blob/master/neutron/tests/unit/db/test_l3_hamode_db.py#L360-L377) as methods _do_other_thing() are mocked so inside transaction is never created and aborted.

The possible solution is to use nested transaction in such places like this:

def create()
       with context.session.begin(subtransactions=True):
               create_something()
               try:
                       _do_other_thing()
               except Exception:
                       with excutils.save_and_reraise_exception():
                           delete_something()

def _do_other_thing():
     with context.session.begin(nested=True):
         ....

description: updated
Changed in neutron:
importance: Undecided → Medium
Assaf Muller (amuller)
Changed in neutron:
status: New → Confirmed
tags: added: l3-ha
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/230481

Changed in neutron:
status: Confirmed → In Progress
Changed in neutron:
assignee: Ann Kamyshnikova (akamyshnikova) → Cedric Brandily (cbrandily)
Changed in neutron:
assignee: Cedric Brandily (cbrandily) → Ann Kamyshnikova (akamyshnikova)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/230481
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=924f19e8f16aebf103a3b70f2bd236afc846933b
Submitter: Jenkins
Branch: master

commit 924f19e8f16aebf103a3b70f2bd236afc846933b
Author: Ann Kamyshnikova <email address hidden>
Date: Sat Dec 12 22:11:37 2015 +0300

    Make object creation methods in l3_hamode_db atomic

    Methods _create_ha_network, add_ha_port don't have wrapping
    transaction in them, so they are prone to race conditions.
    This commit adds a transaction to them. To avoid problem with
    rolling back outmost transaction during exception handling,
    internal methods have been wrapped in nested transaction.

    Nested transaction is required in places like this:

    def create():
          create_something()
          try:
                _do_other_thing()
          except Exception:
                 with excutils.save_and_reraise_exception():
                         delete_something()

    def _do_other_thing():
         with context.session.begin(subtransactions=True):
             ....

    When exception is raised in _do_other_thing it
    is caught in except block, but the object cannot be deleted in
    except section because internal transaction has been rolled back.

    A new method safe_creation and has been added
    that provides a common way of handling such situations.

    Closes-bug: #1501686

    Change-Id: I952f6f7f8684743aa7f829bd92b1dc41b2c6aecf

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/270866

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/273110

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/liberty)

Reviewed: https://review.openstack.org/270866
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=229856666adf729172be0c4cd0231934bcbc8d23
Submitter: Jenkins
Branch: stable/liberty

commit 229856666adf729172be0c4cd0231934bcbc8d23
Author: Ann Kamyshnikova <email address hidden>
Date: Sat Dec 12 22:11:37 2015 +0300

    Make object creation methods in l3_hamode_db atomic

    Methods _create_ha_network, add_ha_port don't have wrapping
    transaction in them, so they are prone to race conditions.
    This commit adds a transaction to them. To avoid problem with
    rolling back outmost transaction during exception handling,
    internal methods have been wrapped in nested transaction.

    Nested transaction is required in places like this:

    def create():
          create_something()
          try:
                _do_other_thing()
          except Exception:
                 with excutils.save_and_reraise_exception():
                         delete_something()

    def _do_other_thing():
         with context.session.begin(subtransactions=True):
             ....

    When exception is raised in _do_other_thing it
    is caught in except block, but the object cannot be deleted in
    except section because internal transaction has been rolled back.

    A new method safe_creation and has been added
    that provides a common way of handling such situations.

    Closes-bug: #1501686

    Change-Id: I952f6f7f8684743aa7f829bd92b1dc41b2c6aecf
    (cherry picked from commit 924f19e8f16aebf103a3b70f2bd236afc846933b)

tags: added: in-stable-liberty
Revision history for this message
LIU Yulong (dragon889) wrote :

Hi all, HA router create failed:
http://paste.openstack.org/show/486954/
seems still got the transaction rollback issue.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/kilo)

Reviewed: https://review.openstack.org/273110
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=c1a6fdb9f64e3436dbbe97603df8beb7ddd6a501
Submitter: Jenkins
Branch: stable/kilo

commit c1a6fdb9f64e3436dbbe97603df8beb7ddd6a501
Author: Ann Kamyshnikova <email address hidden>
Date: Sat Dec 12 22:11:37 2015 +0300

    Make object creation methods in l3_hamode_db atomic

    Methods _create_ha_network, add_ha_port don't have wrapping
    transaction in them, so they are prone to race conditions.
    This commit adds a transaction to them. To avoid problem with
    rolling back outmost transaction during exception handling,
    internal methods have been wrapped in nested transaction.

    Nested transaction is required in places like this:

    def create():
          create_something()
          try:
                _do_other_thing()
          except Exception:
                 with excutils.save_and_reraise_exception():
                         delete_something()

    def _do_other_thing():
         with context.session.begin(subtransactions=True):
             ....

    When exception is raised in _do_other_thing it
    is caught in except block, but the object cannot be deleted in
    except section because internal transaction has been rolled back.

    A new method safe_creation and has been added
    that provides a common way of handling such situations.

    Closes-bug: #1501686

    Conflicts:
     doc/source/devref/effective_neutron.rst
     neutron/db/common_db_mixin.py
     neutron/db/l3_hamode_db.py

    Change-Id: I952f6f7f8684743aa7f829bd92b1dc41b2c6aecf
    (cherry picked from commit 924f19e8f16aebf103a3b70f2bd236afc846933b)
    (cherry picked from commit 229856666adf729172be0c4cd0231934bcbc8d23)

tags: added: in-stable-kilo
Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 2015.1.4

This issue was fixed in the openstack/neutron 2015.1.4 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.openstack.org/346954
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=22a341a204e84b83094899730602975baa469af9
Submitter: Jenkins
Branch: master

commit 22a341a204e84b83094899730602975baa469af9
Author: Sreekumar S <email address hidden>
Date: Mon Jul 25 22:43:24 2016 +0530

    Fixes the midonet test_l3 unit test failures

    Failures for midonet unit tests seems to be caused due
    to nested rollback.
    Referred https://review.openstack.org/#/c/230481/
    for this fix.

    Change-Id: Ic6b37bf3f799055c93e9eeb9b7f51758ceecbeb5
    Closes-Bug: #1605894
    Related-Bug: #1501686

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/mitaka)

Related fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/352917

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (stable/mitaka)

Change abandoned by Sreekumar S (<email address hidden>) on branch: stable/mitaka
Review: https://review.openstack.org/352917
Reason: The merging got all wrong. Abandoning.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/mitaka)

Reviewed: https://review.openstack.org/346376
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=51ebedb2c654efd69adde83bcd26578c79720c51
Submitter: Jenkins
Branch: stable/mitaka

commit 51ebedb2c654efd69adde83bcd26578c79720c51
Author: Sreekumar S <email address hidden>
Date: Mon Jun 20 21:06:19 2016 +0530

    Fixes port device_id/device_owner change in failed operation

    If a router is a attached to a neutron port and that operation
    fails at a late stage in the processing then the 'device_id' and
    'device_owner' attributes of the port are still changed. The
    failed operation should not change them.

    The reason is, as of commit 0947458018725b241603139f4ec6f92e84b2f29b,
    the update of those attributes happens outside of the DB transaction.
    For failed operation, the values are not reset back to their original
    values.

    (cherry picked from commit cb8af9bd50659daca4c1dea4070a011190ef3461)

    Also fixes: Fixes the midonet test_l3 unit test failures

    Failures for midonet unit tests seems to be caused due
    to nested rollback.
    Referred https://review.openstack.org/#/c/230481/
    for this fix.

    Conflicts:
     neutron/db/l3_db.py
     neutron/tests/unit/extensions/test_l3.py

    (cherry picked from commit 22a341a204e84b83094899730602975baa469af9)

    NOTE: This commit merged two patches from master branch.

    Change-Id: I797df266dafc41843408dc95a6ce9f986db5c21c
    Closes-Bug: #1523859
    Closes-Bug: #1605894
    Related-Bug: #1501686

tags: added: in-stable-mitaka
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 2015.1.4

This issue was fixed in the openstack/neutron 2015.1.4 release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.