InternalError during nova-manage db sync when upgrading to Rocky

Bug #1761775 reported by Matt Riedemann
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
grenade
Fix Released
Undecided
Surya Seetharaman
openstack-ansible
Fix Released
Medium
Jesse Pretorius

Bug Description

Seeing this in a grenade run upgrading to Rocky:

http://logs.openstack.org/38/544238/4/check/neutron-grenade/d101bba/logs/grenade.sh.txt.gz#_2018-04-05_09_41_21_078

2018-04-05 09:41:21.078 | + /opt/stack/new/grenade/projects/60_nova/upgrade.sh:main:74 : /usr/local/bin/nova-manage --config-file /etc/nova/nova.conf db sync
2018-04-05 09:41:21.528 | /usr/local/lib/python2.7/dist-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
2018-04-05 09:41:21.528 | """)
2018-04-05 09:41:25.030 | ERROR oslo_db.sqlalchemy.exc_filters [None req-955c4479-2900-4425-bb3e-2161ed85d6ed None None] DBAPIError exception wrapped from (pymysql.err.InternalError) (1054, u"Unknown column 'cell_mappings.disabled' in 'field list'") [SQL: u'SELECT cell_mappings.created_at AS cell_mappings_created_at, cell_mappings.updated_at AS cell_mappings_updated_at, cell_mappings.id AS cell_mappings_id, cell_mappings.uuid AS cell_mappings_uuid, cell_mappings.name AS cell_mappings_name, cell_mappings.transport_url AS cell_mappings_transport_url, cell_mappings.database_connection AS cell_mappings_database_connection, cell_mappings.disabled AS cell_mappings_disabled \nFROM cell_mappings \nWHERE cell_mappings.uuid = %(uuid_1)s \n LIMIT %(param_1)s'] [parameters: {u'uuid_1': '00000000-0000-0000-0000-000000000000', u'param_1': 1}] (Background on this error at: http://sqlalche.me/e/2j85): InternalError: (1054, u"Unknown column 'cell_mappings.disabled' in 'field list'")
2018-04-05 09:41:25.030 | ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last):
2018-04-05 09:41:25.030 | ERROR oslo_db.sqlalchemy.exc_filters  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context
2018-04-05 09:41:25.030 | ERROR oslo_db.sqlalchemy.exc_filters  context)
2018-04-05 09:41:25.030 | ERROR oslo_db.sqlalchemy.exc_filters  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 507, in do_execute
2018-04-05 09:41:25.030 | ERROR oslo_db.sqlalchemy.exc_filters  cursor.execute(statement, parameters)
2018-04-05 09:41:25.030 | ERROR oslo_db.sqlalchemy.exc_filters  File "/usr/local/lib/python2.7/dist-packages/pymysql/cursors.py", line 165, in execute
2018-04-05 09:41:25.030 | ERROR oslo_db.sqlalchemy.exc_filters  result = self._query(query)
2018-04-05 09:41:25.030 | ERROR oslo_db.sqlalchemy.exc_filters  File "/usr/local/lib/python2.7/dist-packages/pymysql/cursors.py", line 321, in _query
2018-04-05 09:41:25.030 | ERROR oslo_db.sqlalchemy.exc_filters  conn.query(q)
2018-04-05 09:41:25.030 | ERROR oslo_db.sqlalchemy.exc_filters  File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 860, in query
2018-04-05 09:41:25.030 | ERROR oslo_db.sqlalchemy.exc_filters  self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2018-04-05 09:41:25.030 | ERROR oslo_db.sqlalchemy.exc_filters  File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 1061, in _read_query_result
2018-04-05 09:41:25.031 | ERROR oslo_db.sqlalchemy.exc_filters  result.read()
2018-04-05 09:41:25.031 | ERROR oslo_db.sqlalchemy.exc_filters  File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 1349, in read
2018-04-05 09:41:25.031 | ERROR oslo_db.sqlalchemy.exc_filters  first_packet = self.connection._read_packet()
2018-04-05 09:41:25.031 | ERROR oslo_db.sqlalchemy.exc_filters  File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 1018, in _read_packet
2018-04-05 09:41:25.031 | ERROR oslo_db.sqlalchemy.exc_filters  packet.check_error()
2018-04-05 09:41:25.031 | ERROR oslo_db.sqlalchemy.exc_filters  File "/usr/local/lib/python2.7/dist-packages/pymysql/connections.py", line 384, in check_error
2018-04-05 09:41:25.031 | ERROR oslo_db.sqlalchemy.exc_filters  err.raise_mysql_exception(self._data)
2018-04-05 09:41:25.031 | ERROR oslo_db.sqlalchemy.exc_filters  File "/usr/local/lib/python2.7/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception
2018-04-05 09:41:25.031 | ERROR oslo_db.sqlalchemy.exc_filters  raise errorclass(errno, errval)
2018-04-05 09:41:25.031 | ERROR oslo_db.sqlalchemy.exc_filters InternalError: (1054, u"Unknown column 'cell_mappings.disabled' in 'field list'")
2018-04-05 09:41:25.031 | ERROR oslo_db.sqlalchemy.exc_filters 
2018-04-05 09:41:25.032 | ERROR: Could not access cell0.
2018-04-05 09:41:25.032 | Has the nova_api database been created?
2018-04-05 09:41:25.032 | Has the nova_cell0 database been created?
2018-04-05 09:41:25.032 | Has "nova-manage api_db sync" been run?
2018-04-05 09:41:25.032 | Has "nova-manage cell_v2 map_cell0" been run?
2018-04-05 09:41:25.032 | Is [api_database]/connection set in nova.conf?
2018-04-05 09:41:25.032 | Is the cell0 database connection URL correct?

That's from this call:

https://github.com/openstack-dev/grenade/blob/b7dbc632b6a0fe2b625096b3a53576e746150b0a/projects/60_nova/upgrade.sh#L74

That doesn't fail because the nova-manage db sync command catches the exception, logs the error but doesn't return a non-zero exit code:

https://github.com/openstack/nova/blob/887dc23a1775a8e64eee62ac3f94a04cba0052c8/nova/cmd/manage.py#L468

This is failing in Rocky because the cell_mappings.disabled column is new, but the nova_api db sync hasn't happened yet since grenade syncs the nova db before the nova_api db, which follows the nova upgrade docs:

https://docs.openstack.org/nova/latest/user/upgrade.html#rolling-upgrade-process

So we could fix this in one of two ways:

1. Swap the order in the grenade upgrade.sh script to sync the nova_api db first (and update the nova docs to flip that order also).

2. Add a from-queens script to grenade for upgrading nova to sync the API DB first.

Option #1 seems a bit shady, but option #2 seems excessive, especially if we're going to be changing things in the API DB in future releases.

Revision history for this message
Matt Riedemann (mriedem) wrote :

Alternatively, we could pass the --local_cell flag to the nova-manage db sync command in grenade to only sync the cell (nova) database, not cell0. But that's another behavior change.

tags: added: cells
Changed in grenade:
status: New → Confirmed
Changed in grenade:
assignee: nobody → Surya Seetharaman (tssurya)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to openstack-ansible-os_nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/559354

Revision history for this message
Matt Riedemann (mriedem) wrote :

The api_db sync was added to grenade back in mitaka, when order didn't really matter, so it was probably not done that way intentionally:

https://review.openstack.org/#/c/190399/

And when upgrading from ocata->pike (cells v2 was required starting in ocata), the old side of grenade in that scenario, ocata, would be installed with devstack, which syncs the API DB first before the cell (nova) DB.

Changed in openstack-ansible:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to grenade (master)

Fix proposed to branch: master
Review: https://review.openstack.org/559358

Changed in grenade:
status: Confirmed → In Progress
Changed in openstack-ansible:
assignee: nobody → Jesse Pretorius (jesse-pretorius)
importance: Undecided → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to grenade (master)

Reviewed: https://review.openstack.org/559358
Committed: https://git.openstack.org/cgit/openstack-dev/grenade/commit/?id=dc7f4a4ba5697d5a73a1e656d4a1717964324eab
Submitter: Zuul
Branch: master

commit dc7f4a4ba5697d5a73a1e656d4a1717964324eab
Author: Surya Seetharaman <email address hidden>
Date: Fri Apr 6 17:26:36 2018 +0200

    Do the api_db sync before db sync

    This patch swaps the order of doing api_db sync with db_sync.
    The API database migration should be run before the migrations for
    the main/cell databases. This is because the former contains
    information about how to find and connect to the latter.

    This was discovered in Rocky since we added a data model change
    (a new column to the cell_mappings table) that depends on db sync
    being run after having added this new column to the api_db (i.e running
    api_db sync before db sync). The data model change was the first time
    where the actual sync order became meaningful.

    This has been correctly done in devstack and the fact that grenade was
    doing the db sync before api_db sync was hidden by the fact that
    devstack had run at least once before reaching this part of grenade,
    which meant that the database was already initialized.

    Change-Id: Ic790ef7c3531c2b672621310524797548246b2ef
    Closes-Bug: #1761775

Changed in grenade:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to openstack-ansible-os_nova (master)

Reviewed: https://review.openstack.org/559354
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-os_nova/commit/?id=68f51e94867081a5c1f8169589f0c88984111cd7
Submitter: Zuul
Branch: master

commit 68f51e94867081a5c1f8169589f0c88984111cd7
Author: Jesse Pretorius <email address hidden>
Date: Fri Apr 6 16:14:03 2018 +0100

    Re-order database migrations

    The database migration sequence currently implemented were
    required for Newton->Ocata as cells v2 was being implemented.

    Now that cells are in place already, we can simplify the
    sequence of tasks. This also now happens to be necessary due
    to changes in the Rocky API database.

    Change-Id: I101e75f70a7c99ceb1129c9c224f6a303dfae771
    Related-Bug: #1761775

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to openstack-ansible-os_nova (stable/queens)

Related fix proposed to branch: stable/queens
Review: https://review.openstack.org/562023

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to openstack-ansible-os_nova (stable/pike)

Related fix proposed to branch: stable/pike
Review: https://review.openstack.org/562024

Changed in openstack-ansible:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to openstack-ansible-os_nova (stable/queens)

Reviewed: https://review.openstack.org/562023
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-os_nova/commit/?id=c0d9a6314ab871a8ec1e29e7cddfce4f57290e52
Submitter: Zuul
Branch: stable/queens

commit c0d9a6314ab871a8ec1e29e7cddfce4f57290e52
Author: Jesse Pretorius <email address hidden>
Date: Fri Apr 6 16:14:03 2018 +0100

    Re-order database migrations

    The database migration sequence currently implemented were
    required for Newton->Ocata as cells v2 was being implemented.

    Now that cells are in place already, we can simplify the
    sequence of tasks. This also now happens to be necessary due
    to changes in the Rocky API database.

    Change-Id: I101e75f70a7c99ceb1129c9c224f6a303dfae771
    Related-Bug: #1761775
    (cherry picked from commit 68f51e94867081a5c1f8169589f0c88984111cd7)

tags: added: in-stable-queens
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to openstack-ansible-os_nova (stable/pike)

Reviewed: https://review.openstack.org/562024
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible-os_nova/commit/?id=a8feb9bbcec9085bd4c2c4fe044711a8bb7e818f
Submitter: Zuul
Branch: stable/pike

commit a8feb9bbcec9085bd4c2c4fe044711a8bb7e818f
Author: Jesse Pretorius <email address hidden>
Date: Fri Apr 6 16:14:03 2018 +0100

    Re-order database migrations

    The database migration sequence currently implemented were
    required for Newton->Ocata as cells v2 was being implemented.

    Now that cells are in place already, we can simplify the
    sequence of tasks. This also now happens to be necessary due
    to changes in the Rocky API database.

    Change-Id: I101e75f70a7c99ceb1129c9c224f6a303dfae771
    Related-Bug: #1761775
    (cherry picked from commit 68f51e94867081a5c1f8169589f0c88984111cd7)

tags: added: in-stable-pike
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers