Bug #1714898: Some DB operations aren't handling errors like Dead Locks correctly (networking-odl)

Michel Peterson (mpeterson) on 2017-09-04

Changed in networking-odl:
assignee:	nobody → Michel Peterson (mpeterson)
summary:	- Some DB operations aren't handling things like Dead Locks correctly
	+ Some DB operations aren't handling errors like Dead Locks correctly

Michel Peterson (mpeterson) wrote on 2017-09-04:

#1

screen-q-svc-498881-d2fcd3e.txt.gz (13.7 MiB, text/plain)

Attaching the linked log so that it is not lost when it is deleted from the CI machines.

OpenStack Infra (hudson-openstack) wrote on 2017-09-04: Fix proposed to networking-odl (master)

#2

Fix proposed to branch: master
Review: https://review.openstack.org/500584

Changed in networking-odl:
status:	New → In Progress

Isaku Yamahata (yamahata) on 2017-09-06

Changed in networking-odl:
importance:	Undecided → Critical

OpenStack Infra (hudson-openstack) wrote on 2017-10-26: Fix merged to networking-odl (master)

#3

Reviewed: https://review.openstack.org/500584
Committed: https://git.openstack.org/cgit/openstack/networking-odl/commit/?id=75c8962918d47978ca3fcf7fbfcfcb40a156d802
Submitter: Zuul
Branch: master

commit 75c8962918d47978ca3fcf7fbfcfcb40a156d802
Author: Michel Peterson <email address hidden>
Date: Mon Sep 4 17:45:38 2017 +0300

Fixes error handling of DB calls

    Some functions are wrapped with `wrap_db_retry` because they are part
    of a transaction happening at a higher level inside Neutron. The
    problem with that decorator is that, without an `exception_checker`,
    it only handles `RetryRequest` and not the other exceptions that are
    expected (e.g. `DBDeadlock`). This oversight allowed the related bug
    to occur and let errors reach Neutron when they could have been
    handled at a lower level. In addition, some of the functions tried to
    retry without a proper `SAVEPOINT` or transaction that would allow
    the retry. Moreover, the retries were often made at a very low level,
    when they should have been made at a higher level where the operation
    is retried as a whole rather than as separate "atomic" operations.

    This patch adds an `exception_checker` based on the Neutron library
    that handles these DB exceptions, plus a MySQL issue that is
    triggered when `DBDeadlock`s happen (see neutron bug #1590298). It
    also keeps the handling of these exceptions at the networking_odl
    level, because we need to make a best effort to let the operation
    succeed without triggering a rollback of the Neutron action, which
    would be much more costly performance-wise. The retry is, however,
    done whenever possible at the highest level that makes sense, using
    `SAVEPOINT`s or transactions depending on the need.

    It also fixes some cases where operations performed by the driver
    could not be retried because of incorrect exception handling.

Change-Id: I31085cf73618df48f55f3169e071d2cb64c9b018
Closes-Bug: #1714898
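
The role of the `exception_checker` described in the commit message can be illustrated with a minimal, standard-library-only sketch. It is not the oslo.db or networking-odl code: `retry_db_call`, `is_retriable`, `make_flaky_operation` and the two exception classes are hypothetical stand-ins that only mimic the behaviour the message describes (an explicit retry request is always retried, anything else only if the checker accepts it).

```python
import functools
import time


class RetryRequest(Exception):
    """Stand-in for an explicit 'please retry' signal (assumption)."""


class DBDeadlock(Exception):
    """Stand-in for a deadlock error raised by the DB layer (assumption)."""


def retry_db_call(max_retries=3, retry_interval=0.0,
                  exception_checker=lambda exc: False):
    """Hypothetical, simplified analogue of a DB retry decorator.

    RetryRequest is always retried. Any other exception is retried only
    when exception_checker(exc) returns True.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            attempt = 0
            while True:
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    retriable = (isinstance(exc, RetryRequest)
                                 or exception_checker(exc))
                    if not retriable or attempt >= max_retries:
                        raise
                    attempt += 1
                    time.sleep(retry_interval)
        return wrapper
    return decorator


def is_retriable(exc):
    """Example checker: treat deadlocks as safe to retry."""
    return isinstance(exc, DBDeadlock)


def make_flaky_operation(failures):
    """Return a callable that raises DBDeadlock `failures` times, then
    returns the number of calls it took to succeed."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise DBDeadlock("simulated deadlock")
        return state["calls"]
    return operation


if __name__ == "__main__":
    # Without a checker, the first deadlock propagates to the caller.
    unchecked = retry_db_call()(make_flaky_operation(2))
    try:
        unchecked()
    except DBDeadlock as exc:
        print("not retried:", exc)

    # With a checker, the deadlocks are absorbed and the call succeeds.
    checked = retry_db_call(exception_checker=is_retriable)(
        make_flaky_operation(2))
    print("succeeded after", checked(), "calls")
```

In this sketch the only difference between the failing and the succeeding call is the checker passed to the decorator, which is the gap the commit message attributes to the original code.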

Changed in networking-odl:
status:	In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2017-10-28: Fix proposed to networking-odl (stable/pike)

#4

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/516001

OpenStack Infra (hudson-openstack) wrote on 2017-11-14: Fix merged to networking-odl (stable/pike)

#5

Reviewed: https://review.openstack.org/516001
Committed: https://git.openstack.org/cgit/openstack/networking-odl/commit/?id=7f9a439aea15f02c2c2822231f6604c5f7435bc8
Submitter: Zuul
Branch: stable/pike

commit 7f9a439aea15f02c2c2822231f6604c5f7435bc8
Author: Michel Peterson <email address hidden>
Date: Mon Sep 4 17:45:38 2017 +0300

Fixes error handling of DB calls

    Some functions are wrapped with `wrap_db_retry` because they are part
    of a transaction happening at a higher level inside Neutron. The
    problem with that decorator is that, without an `exception_checker`,
    it only handles `RetryRequest` and not the other exceptions that are
    expected (e.g. `DBDeadlock`). This oversight allowed the related bug
    to occur and let errors reach Neutron when they could have been
    handled at a lower level. In addition, some of the functions tried to
    retry without a proper `SAVEPOINT` or transaction that would allow
    the retry. Moreover, the retries were often made at a very low level,
    when they should have been made at a higher level where the operation
    is retried as a whole rather than as separate "atomic" operations.

    This patch adds an `exception_checker` based on the Neutron library
    that handles these DB exceptions, plus a MySQL issue that is
    triggered when `DBDeadlock`s happen (see neutron bug #1590298). It
    also keeps the handling of these exceptions at the networking_odl
    level, because we need to make a best effort to let the operation
    succeed without triggering a rollback of the Neutron action, which
    would be much more costly performance-wise. The retry is, however,
    done whenever possible at the highest level that makes sense, using
    `SAVEPOINT`s or transactions depending on the need.

    It also fixes some cases where operations performed by the driver
    could not be retried because of incorrect exception handling.

    Change-Id: I31085cf73618df48f55f3169e071d2cb64c9b018
    Closes-Bug: #1714898
    (cherry picked from commit 75c8962918d47978ca3fcf7fbfcfcb40a156d802)
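
The point about retrying a whole operation inside a `SAVEPOINT`, rather than retrying individual statements, can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the `journal` table, `SimulatedDeadlock`, `run_in_savepoint` and `make_two_step_operation` are hypothetical names, and the real driver works through Neutron's DB layer rather than raw SQL.

```python
import sqlite3


class SimulatedDeadlock(Exception):
    """Stand-in for a transient DB error such as a deadlock (assumption)."""


def run_in_savepoint(conn, operation, max_retries=3, name="retry_sp"):
    """Run operation(conn) inside a SAVEPOINT and retry it as a whole.

    On failure only the work done since the SAVEPOINT is undone, so the
    enclosing transaction stays open and does not need to be rolled back.
    """
    for attempt in range(max_retries + 1):
        conn.execute(f"SAVEPOINT {name}")
        try:
            result = operation(conn)
        except Exception as exc:
            # Discard the partial work, then drop the savepoint itself.
            conn.execute(f"ROLLBACK TO {name}")
            conn.execute(f"RELEASE {name}")
            if (not isinstance(exc, SimulatedDeadlock)
                    or attempt == max_retries):
                raise
        else:
            conn.execute(f"RELEASE {name}")
            return result


def make_two_step_operation(failures):
    """Return a two-statement operation that fails between its steps
    `failures` times before succeeding."""
    state = {"calls": 0}

    def operation(conn):
        state["calls"] += 1
        conn.execute("INSERT INTO journal (op) VALUES ('step-1')")
        if state["calls"] <= failures:
            raise SimulatedDeadlock("simulated deadlock between steps")
        conn.execute("INSERT INTO journal (op) VALUES ('step-2')")
        return state["calls"]
    return operation


if __name__ == "__main__":
    # isolation_level=None lets us issue BEGIN/COMMIT explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE journal (id INTEGER PRIMARY KEY, op TEXT)")
    conn.execute("BEGIN")  # plays the role of the outer transaction
    conn.execute("INSERT INTO journal (op) VALUES ('outer')")
    run_in_savepoint(conn, make_two_step_operation(failures=1))
    conn.execute("COMMIT")
    for row in conn.execute("SELECT op FROM journal ORDER BY id"):
        print(row[0])
```

Because the rollback targets the savepoint, the half-finished first attempt leaves no stray `step-1` row behind, and the row written by the outer transaction survives the retry.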

tags:	added: in-stable-pike

OpenStack Infra (hudson-openstack) wrote on 2017-12-07: Fix included in openstack/networking-odl 12.0.0.0b2

#6

This issue was fixed in the openstack/networking-odl 12.0.0.0b2 development milestone.

OpenStack Infra (hudson-openstack) wrote on 2019-05-10: Fix included in openstack/networking-odl 11.0.1

#7

This issue was fixed in the openstack/networking-odl 11.0.1 release.
