OpenSSL.SSL.SysCallError: (111, 'ECONNREFUSED') and Connection thread stops

Bug #1895727 reported by Terry Wilson
This bug affects 2 people
Affects                     Status        Importance  Assigned to  Milestone
Ubuntu Cloud Archive        Fix Released  Undecided   Unassigned
  Ussuri                    Fix Released  Undecided   Unassigned
  Victoria                  Fix Released  Undecided   Unassigned
ovsdbapp                    Fix Released  Undecided   Unassigned
python-ovsdbapp (Ubuntu)    Fix Released  Undecided   Unassigned
  Focal                     Fix Released  Undecided   Unassigned
  Groovy                    Fix Released  Undecided   Unassigned
  Hirsute                   Fix Released  Undecided   Unassigned

Bug Description

If ovsdb-server is down for a while and we are connecting via SSL, python-ovs will raise

OpenSSL.SSL.SysCallError: (111, 'ECONNREFUSED')

instead of just returning an error type. If this goes on for a bit, then the Connection thread will exit and be unrecoverable without restarting neutron-server.
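
The difference between raising and "returning an error type" can be illustrated with a small sketch. It uses neither python-ovs nor pyOpenSSL: `FakeSysCallError`, `failing_connect` and `try_connect` are hypothetical names, and the stand-in exception only mimics the `(errno, name)` argument shape shown in the error above.

```python
import errno


class FakeSysCallError(Exception):
    """Hypothetical stand-in for an SSL-layer error carrying (errno, name)."""


def failing_connect():
    # Simulate the reported behaviour: the SSL path raises
    # instead of reporting the failure as an error code.
    raise FakeSysCallError(errno.ECONNREFUSED, "ECONNREFUSED")


def try_connect(connect):
    """Return 0 on success or an errno value on failure; never raise."""
    try:
        connect()
    except FakeSysCallError as exc:
        # Translate the exception into the error code the caller expects.
        return exc.args[0]
    except OSError as exc:
        # Plain socket errors already carry an errno; fall back to EIO if unset.
        return exc.errno or errno.EIO
    return 0


error = try_connect(failing_connect)
print(errno.errorcode[error])  # ECONNREFUSED
```

A caller that only ever sees a return value can treat "connection refused" as a normal, retryable condition, whereas an uncaught exception propagates up and can end the thread that made the call.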

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

SRU:

[Impact]
Any intermittent connection issue between neutron-server and the ovsdb nb/sb databases left neutron-server unable to handle any further ovsdb transactions, because exceptions raised during reconnection were not handled properly. This in turn causes failures in post-commit updates of resources and results in neutron/ovn db inconsistencies.
The fix catches these exceptions and retries the connection to ovsdb.
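
The catch-and-retry idea can be sketched roughly as follows. This is not the ovsdbapp patch itself: `run_with_retries`, `run_once` and `should_stop` are hypothetical names, and the 60-second default delay is only an assumption taken from the "1 min sleep" mentioned further down.

```python
import time


def run_with_retries(run_once, should_stop, sleep=time.sleep, delay=60.0, log=print):
    """Keep calling run_once() until should_stop() returns true.

    Exceptions from run_once() are logged and followed by a pause,
    rather than being allowed to end the loop (and with it the thread).
    """
    failures = 0
    while not should_stop():
        try:
            run_once()
        except Exception as exc:  # broad on purpose: any failure should be retried
            failures += 1
            log(f"run failed ({exc!r}); retrying in {delay}s")
            sleep(delay)
    return failures


# Demo: the first two calls fail as if the server were down, the third succeeds.
calls = {"n": 0, "done": False}


def flaky_run():
    calls["n"] += 1
    if calls["n"] <= 2:
        raise ConnectionRefusedError(111, "ECONNREFUSED")
    calls["done"] = True


# The sleep is replaced with a no-op so the demo finishes immediately.
failures = run_with_retries(flaky_run, lambda: calls["done"], sleep=lambda s: None)
print(failures, calls["n"])  # 2 3
```

Catching the broad `Exception` is deliberate in this sketch: the point is that an unexpected exception type from a lower layer should not be able to terminate the loop.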

[Test plan]
* Deploy bionic-ussuri with neutron-server and ovn-central as HA using juju charms.
* Launch a few instances and check that they are in the active state.
* Simulate network communication issues by adding iptables rules for ports 6641, 6643, 6644 and 16642:

  - On ovn-central/0, drop packets from ovn-central/2 and neutron-server/2
  - On ovn-central/1, drop packets from ovn-central/2 and neutron-server/2
  - On ovn-central/2, drop packets from ovn-central/0, ovn-central/1, neutron-server/0, neutron-server/1

# Space-separated source IPs to reject on this unit (left empty in the report)
DROP_PKTS_FROM_OVN_CENTRAL=
DROP_PKTS_FROM_NEUTRON_SERVER=
for ip in $DROP_PKTS_FROM_OVN_CENTRAL; do for port in 6641 6643 6644 16642; do iptables -I ufw-before-input 1 -s "$ip" -p tcp --dport "$port" -j REJECT; done; done
for ip in $DROP_PKTS_FROM_NEUTRON_SERVER; do for port in 6641 16642; do iptables -I ufw-before-input 1 -s "$ip" -p tcp --dport "$port" -j REJECT; done; done

* After a minute, remove the REJECT rules added above.
* Launch around 5 new VMs (5 so that some post-creation updates land on neutron-server/2) and look for Timeout exceptions on neutron-server/2.
  If there are Timeout exceptions, the neutron-server ovsdb connections are stale and are no longer handling ovsdb transactions.
  If there are no Timeout exceptions and port status updates arrive from ovsdb, neutron-server has reconnected successfully and is handling updates again.

[Where problems could occur]

The fix passed the upstream zuul gates (tempest tests etc.) and the patch only adds reconnection retries to ovsdbapp. With the fix, a reconnection is attempted every 4 minutes (3 min connection timeout + 1 min sleep) until the connection is successful. I don't see how any regressions could happen with this change.
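
The 4-minute figure is just the sum of the two intervals quoted above. A small sketch of the resulting retry timeline, assuming (hypothetically) that every failed attempt uses up the full connection timeout; `attempt_start_times` is an illustrative helper, not part of ovsdbapp:

```python
CONNECTION_TIMEOUT_S = 3 * 60  # "3 min connection timeout" from the text
RETRY_SLEEP_S = 1 * 60         # "1 min sleep" from the text


def attempt_start_times(outage_s, timeout_s=CONNECTION_TIMEOUT_S, sleep_s=RETRY_SLEEP_S):
    """Start times (in seconds) of attempts that begin while the server is still down."""
    cycle = timeout_s + sleep_s
    return list(range(0, outage_s, cycle))


print(CONNECTION_TIMEOUT_S + RETRY_SLEEP_S)  # 240
print(attempt_start_times(10 * 60))          # [0, 240, 480]
```

Under these assumptions a 10-minute outage sees three failed attempts, and the next attempt starts at the 12-minute mark.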

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovsdbapp (master)

Reviewed: https://review.opendev.org/752092
Committed: https://git.openstack.org/cgit/openstack/ovsdbapp/commit/?id=83cf7aa6c81f1b2341b2bba1fe156047fa5d29f6
Submitter: Zuul
Branch: master

commit 83cf7aa6c81f1b2341b2bba1fe156047fa5d29f6
Author: Terry Wilson <email address hidden>
Date: Tue Sep 15 13:42:08 2020 -0500

    Don't give up when an Exception happens in idl.run

    It's possible that idl.run() could have a bug where it raises an
    Exception for an extended period of time while ovsdb-server is
    down, but recover once ovsdb-server comes back up. Specifically,
    python-ovs currently doesn't properly catch an exception when the
    socket type is 'ssl' that it catches for other protocols.

    Change-Id: Ia068650d2db3d5d8642771a6df5a260d692aea20
    Closes-Bug: #1895727

Changed in ovsdbapp:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovsdbapp (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/759665

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovsdbapp (stable/victoria)

Reviewed: https://review.opendev.org/759665
Committed: https://git.openstack.org/cgit/openstack/ovsdbapp/commit/?id=4807809ba7ec09a7e9cf533e334de282e0d373cd
Submitter: Zuul
Branch: stable/victoria

commit 4807809ba7ec09a7e9cf533e334de282e0d373cd
Author: Terry Wilson <email address hidden>
Date: Tue Sep 15 13:42:08 2020 -0500

    Don't give up when an Exception happens in idl.run

    It's possible that idl.run() could have a bug where it raises an
    Exception for an extended period of time while ovsdb-server is
    down, but recover once ovsdb-server comes back up. Specifically,
    python-ovs currently doesn't properly catch an exception when the
    socket type is 'ssl' that it catches for other protocols.

    Change-Id: Ia068650d2db3d5d8642771a6df5a260d692aea20
    Closes-Bug: #1895727
    (cherry picked from commit 83cf7aa6c81f1b2341b2bba1fe156047fa5d29f6)

tags: added: in-stable-victoria
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovsdbapp (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/760414

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ovsdbapp (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/760435

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovsdbapp (stable/train)

Reviewed: https://review.opendev.org/760435
Committed: https://git.openstack.org/cgit/openstack/ovsdbapp/commit/?id=2184d8e1e5bd3acb4073b7f1d4439f1d3bf658e6
Submitter: Zuul
Branch: stable/train

commit 2184d8e1e5bd3acb4073b7f1d4439f1d3bf658e6
Author: Terry Wilson <email address hidden>
Date: Tue Sep 15 13:42:08 2020 -0500

    Don't give up when an Exception happens in idl.run

    It's possible that idl.run() could have a bug where it raises an
    Exception for an extended period of time while ovsdb-server is
    down, but recover once ovsdb-server comes back up. Specifically,
    python-ovs currently doesn't properly catch an exception when the
    socket type is 'ssl' that it catches for other protocols.

    Conflicts:
      ovsdbapp/backend/ovs_idl/connection.py

    Change-Id: Ia068650d2db3d5d8642771a6df5a260d692aea20
    Closes-Bug: #1895727
    (cherry picked from commit 83cf7aa6c81f1b2341b2bba1fe156047fa5d29f6)
    (cherry picked from commit 4807809ba7ec09a7e9cf533e334de282e0d373cd)
    (cherry picked from commit b49239d02408065e7e16ae5085c27df77ec4ac57)

tags: added: in-stable-train
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ovsdbapp (stable/ussuri)

Reviewed: https://review.opendev.org/760414
Committed: https://git.openstack.org/cgit/openstack/ovsdbapp/commit/?id=b49239d02408065e7e16ae5085c27df77ec4ac57
Submitter: Zuul
Branch: stable/ussuri

commit b49239d02408065e7e16ae5085c27df77ec4ac57
Author: Terry Wilson <email address hidden>
Date: Tue Sep 15 13:42:08 2020 -0500

    Don't give up when an Exception happens in idl.run

    It's possible that idl.run() could have a bug where it raises an
    Exception for an extended period of time while ovsdb-server is
    down, but recover once ovsdb-server comes back up. Specifically,
    python-ovs currently doesn't properly catch an exception when the
    socket type is 'ssl' that it catches for other protocols.

    Conflicts:
      ovsdbapp/backend/ovs_idl/connection.py

    Change-Id: Ia068650d2db3d5d8642771a6df5a260d692aea20
    Closes-Bug: #1895727
    (cherry picked from commit 83cf7aa6c81f1b2341b2bba1fe156047fa5d29f6)
    (cherry picked from commit 4807809ba7ec09a7e9cf533e334de282e0d373cd)

tags: added: in-stable-ussuri
description: updated
tags: added: sts
Changed in python-ovsdbapp (Ubuntu Hirsute):
status: New → Fix Released
Revision history for this message
Hemanth Nakkina (hemanth-n) wrote :

Debdiffs for Groovy and Focal uploaded

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in python-ovsdbapp (Ubuntu Focal):
status: New → Confirmed
Changed in python-ovsdbapp (Ubuntu Groovy):
status: New → Confirmed
Revision history for this message
Corey Bryant (corey.bryant) wrote :
description: updated
Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Terry, or anyone else affected,

Accepted python-ovsdbapp into groovy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python-ovsdbapp/1.5.0-0ubuntu2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-groovy to verification-done-groovy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-groovy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in python-ovsdbapp (Ubuntu Groovy):
status: Confirmed → Fix Committed
tags: added: verification-needed verification-needed-groovy
Changed in python-ovsdbapp (Ubuntu Focal):
status: Confirmed → Fix Committed
tags: added: verification-needed-focal
Revision history for this message
Brian Murray (brian-murray) wrote :

Hello Terry, or anyone else affected,

Accepted python-ovsdbapp into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python-ovsdbapp/1.1.0-0ubuntu2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

Hello Terry, or anyone else affected,

Accepted python-ovsdbapp into ussuri-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:ussuri-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-ussuri-needed to verification-ussuri-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-ussuri-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-ussuri-needed
Revision history for this message
Corey Bryant (corey.bryant) wrote :

Hello Terry, or anyone else affected,

Accepted python-ovsdbapp into victoria-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:victoria-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-victoria-needed to verification-victoria-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-victoria-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-victoria-needed
Changed in cloud-archive:
status: New → Fix Released
Revision history for this message
Hemanth Nakkina (hemanth-n) wrote :

Verified the test case on 4 environments: installed the new packages and restarted the neutron-server service. The connections to OVSDB are re-established and VMs launch without any issues.

* bionic ussuri (ussuri-proposed)
* focal (focal-proposed)
* focal victoria (victoria-proposed)
* groovy (groovy-proposed)

tags: added: verification-done verification-done-focal verification-done-groovy verification-ussuri-done verification-victoria-done
removed: verification-needed verification-needed-focal verification-needed-groovy verification-ussuri-needed verification-victoria-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package python-ovsdbapp - 1.5.0-0ubuntu2

---------------
python-ovsdbapp (1.5.0-0ubuntu2) groovy; urgency=medium

  [ Chris MacNaughton ]
  * d/control: Update VCS paths for move to lp:~ubuntu-openstack-dev.

  [ Corey Bryant ]
  * d/gbp.conf: Create stable/victoria branch.

  [ Hemanth Nakkina ]
  * Don't give up when an Exception happens in idl.run (LP: #1895727)
    - d/p/0001-Don-t-give-up-when-an-Exception-happens-in-idl.run.patch

 -- Chris MacNaughton <email address hidden> Thu, 08 Oct 2020 14:45:23 +0000

Changed in python-ovsdbapp (Ubuntu Groovy):
status: Fix Committed → Fix Released
Revision history for this message
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for python-ovsdbapp has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package python-ovsdbapp - 1.1.0-0ubuntu2

---------------
python-ovsdbapp (1.1.0-0ubuntu2) focal; urgency=medium

  [ Corey Bryant ]
  * d/gbp.conf: Create stable/ussuri branch.

  [ Chris MacNaughton ]
  * d/control: Update VCS paths for move to lp:~ubuntu-openstack-dev.

  [ Hemanth Nakkina ]
  * Don't give up when an Exception happens in idl.run (LP: #1895727)
    - d/p/0001-Don-t-give-up-when-an-Exception-happens-in-idl.run.patch

 -- Corey Bryant <email address hidden> Mon, 12 Apr 2021 17:12:59 -0400

Changed in python-ovsdbapp (Ubuntu Focal):
status: Fix Committed → Fix Released
Revision history for this message
Corey Bryant (corey.bryant) wrote :

The verification of the Stable Release Update for python-ovsdbapp has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

This bug was fixed in the package python-ovsdbapp - 1.5.0-0ubuntu2~cloud0
---------------

 python-ovsdbapp (1.5.0-0ubuntu2~cloud0) focal-victoria; urgency=medium
 .
   * New update for the Ubuntu Cloud Archive.
 .
 python-ovsdbapp (1.5.0-0ubuntu2) groovy; urgency=medium
 .
   [ Chris MacNaughton ]
   * d/control: Update VCS paths for move to lp:~ubuntu-openstack-dev.
 .
   [ Corey Bryant ]
   * d/gbp.conf: Create stable/victoria branch.
 .
   [ Hemanth Nakkina ]
   * Don't give up when an Exception happens in idl.run (LP: #1895727)
     - d/p/0001-Don-t-give-up-when-an-Exception-happens-in-idl.run.patch

Revision history for this message
Corey Bryant (corey.bryant) wrote :

The verification of the Stable Release Update for python-ovsdbapp has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

This bug was fixed in the package python-ovsdbapp - 1.1.0-0ubuntu2~cloud0
---------------

 python-ovsdbapp (1.1.0-0ubuntu2~cloud0) bionic-ussuri; urgency=medium
 .
   * New update for the Ubuntu Cloud Archive.
 .
 python-ovsdbapp (1.1.0-0ubuntu2) focal; urgency=medium
 .
   [ Corey Bryant ]
   * d/gbp.conf: Create stable/ussuri branch.
 .
   [ Chris MacNaughton ]
   * d/control: Update VCS paths for move to lp:~ubuntu-openstack-dev.
 .
   [ Hemanth Nakkina ]
   * Don't give up when an Exception happens in idl.run (LP: #1895727)
     - d/p/0001-Don-t-give-up-when-an-Exception-happens-in-idl.run.patch

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ovsdbapp 1.6.1

This issue was fixed in the openstack/ovsdbapp 1.6.1 release.
