ovn-ctl: ovsdb-server startup might sometimes get stuck and not upgrade clustered database

Bug #1937075 reported by Frode Nordahl
Affects         Status         Importance   Assigned to   Milestone
ovn (Ubuntu)    Fix Released   Undecided    Unassigned
  Focal         Fix Released   High         Unassigned
  Groovy        Fix Released   Undecided    Unassigned
  Hirsute       Fix Released   Undecided    Unassigned
  Impish        Fix Released   Undecided    Unassigned

Bug Description

[Impact]
On upgrades that require schema changes, a clustered database may never be upgraded. This can be problematic and might, in the worst case, lead to a data plane outage.

Having said that, it is not very likely that we will SRU changes to Focal that include schema changes, but it is good to be on the safe side so that we are ready in the event it happens.

[Test Plan]
1. Based on the ovn-central charm gate test, deploy an OVN database cluster.
2. Restart services so that the DB leader transitions away from the machine that created the cluster.
3. Restart services again on the original leader and confirm whether the bug is fixed.
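
To verify steps 2 and 3, it helps to know which Raft role the local ovsdb-server currently holds. The following is a minimal sketch, not part of the original test plan: `raft_role` is a hypothetical helper that reads the output of `ovs-appctl ... cluster/status` on stdin and prints the value of its `Role:` line. The control socket path in the usage comment is an assumption and may differ between installations.

```shell
# Hypothetical helper: print the Raft role ("leader", "follower", ...) found
# in `cluster/status` output supplied on stdin.
raft_role() {
    # Split each line on ": " and print the value of the first "Role" field.
    awk -F': ' '$1 == "Role" { print $2; exit }'
}

# Example usage on a cluster member (socket path is an assumption):
#   ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound | raft_role
```

After step 2, the machine that created the cluster should no longer report the leader role; that is the state in which step 3 exercises the bug.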

[Regression Potential]
The patch in question has been in upstream branches and releases for a long time. It is also a very minor change that adds an argument allowing the control tool to not hang when the local unit is not the leader.

[Original Bug Description]
As reported and fixed in [0], if the node that is configured with an empty ``--db-*-cluster-remote-addr`` is no longer the leader, the ovn-ctl script will hang on startup.

Prior to the hang, the script will successfully start the ovsdb-server, but it will never get to the point where a clustered DB is upgraded in the event of a schema change.

0: https://github.com/ovn-org/ovn/commit/65cc0e225b5922a72f0d40c2c39da0210669c21a
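
The shape of the fix can be sketched as follows. This is an illustration under assumptions, not the text of the upstream patch: it assumes the hang occurs in the control tool's `init` call, which by default waits for a connection to the cluster leader, and that the added argument is the `--no-leader-only` option of `ovn-nbctl` and `ovn-sbctl`. The function name `init_clustered_db` is hypothetical.

```shell
# Sketch only: run the database init step without insisting on a connection
# to the Raft leader, so the call can complete on a follower.
# Assumption: --no-leader-only is the argument added by the fix.
init_clustered_db() {
    ctl=$1    # control tool to run, e.g. ovn-nbctl or ovn-sbctl
    "$ctl" --no-leader-only init
}

# Example usage:
#   init_clustered_db ovn-nbctl
#   init_clustered_db ovn-sbctl
```

With a call of this shape, startup on a non-leader node can proceed past initialization to the schema upgrade step instead of blocking.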

Frode Nordahl (fnordahl)
Changed in ovn (Ubuntu Impish):
status: New → Fix Released
Changed in ovn (Ubuntu Hirsute):
status: New → Fix Released
Changed in ovn (Ubuntu Groovy):
status: New → Fix Released
Changed in ovn (Ubuntu Focal):
status: New → Triaged
importance: Undecided → High
Frode Nordahl (fnordahl)
description: updated
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Frode, or anyone else affected,

Accepted ovn into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ovn/20.03.2-0ubuntu0.20.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in ovn (Ubuntu Focal):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-focal
Luciano Lo Giudice (lmlogiudice) wrote :

Tested with the new packages and the db server did not get stuck on startup for a clustered database. Marking it as 'verification-done'.

tags: added: verification-done-focal
removed: verification-needed verification-needed-focal
Chris Halse Rogers (raof) wrote : Update Released

The verification of the Stable Release Update for ovn has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ovn - 20.03.2-0ubuntu0.20.04.2

---------------
ovn (20.03.2-0ubuntu0.20.04.2) focal; urgency=medium

  * Add RBAC rules for IGMP_Group table (LP: #1914988):
    - d/p/lp-1914988-Add-IGMP_Group-to-ovn-controller-RBAC.patch
    - d/p/lp-1914988-tests-Use-ovn_start-in-tests-ovn-controller.at.patch
    - d/p/lp-1914988-tests-Make-certificate-generation-extendable.patch
    - d/p/lp-1914988-tests-Test-with-SSL-and-RBAC-for-controller-by-defau.patch
  * d/p/lp-1943266-physical-do-not-forward-traffic-from-localport-to-a-.patch:
    Do not forward traffic from localport to localnet ports (LP: #1943266).
  * d/p/lp-1937075-ovn-ctl-Fix-stucked-while-do-cluster-db-init.patch:
    Fix issue where clustered database might not be upgraded (LP: #1937075).

 -- Frode Nordahl <email address hidden> Fri, 01 Oct 2021 09:42:00 +0200

Changed in ovn (Ubuntu Focal):
status: Fix Committed → Fix Released