nova.db.sqlalchemy.migration.db_version is racy

Bug #1804652 reported by Matthew Booth
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Low
Assigned to: Matthew Booth

Bug Description

db_version() attempts to initialise versioning if the db is not versioned. However, it doesn't account for concurrency, so we can get errors if multiple watchers try to get the db version before the db is initialised. We are seeing this in practice during tripleo deployments, in a script which waits on multiple controller nodes for db sync to complete.

Tags: db nova-manage
Matthew Booth (mbooth-9) wrote :
Matt Riedemann (mriedem) wrote :

I'm not really sure what you would expect nova to do with this. It's handling the case where the migrate_version table doesn't exist (no schema) by putting the db under version control, and as noted in the bz, that can race and fail if 2 workers do it concurrently. I guess you want to handle db_version_control failing with:

"InternalError: (1050, u\"Table 'migrate_version' already exists\")",

And then just call versioning_api.db_version again?

Changed in nova:
importance: Undecided → Low
tags: added: db
Changed in nova:
status: New → Triaged
Matthew Booth (mbooth-9) wrote :

Just noticed I hadn't tagged the patch I knocked up with this bug.

Yes, basically just try again, except there are a bunch of different failure modes which aren't part of an API, so we can't rely on what they are.

Changed in nova:
assignee: nobody → Matthew Booth (mbooth-9)
status: Triaged → In Progress
Matthew Booth (mbooth-9) wrote :

Incidentally this race isn't theoretical: see the BZ linked in comment 1. We're currently hitting this reliably on install in a script which runs on all controllers to wait for the schema to be loaded before continuing. We're also working on another workaround in the installer, but this remains a thing.

Matthew Booth (mbooth-9) wrote :
Matt Riedemann (mriedem)
tags: added: nova-manage
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/619622
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9c0d188988eb86f9186bca98e544ba15105072e8
Submitter: Zuul
Branch: master

commit 9c0d188988eb86f9186bca98e544ba15105072e8
Author: Matthew Booth <email address hidden>
Date: Thu Nov 22 14:06:53 2018 +0000

    Workaround a race initialising version control in db_version()

    Closes-Bug: #1804652

    Change-Id: I18e682d7aa5ebbddec7e22e1a504888a83b9be12

Changed in nova:
status: In Progress → Fix Released