Inefficient queries inside online_data_migrations

Bug #1822613 reported by Mohammed Naser
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Matt Riedemann
Milestone: (none listed)

Bug Description

The online_data_migrations command should be run after an upgrade; it contains a list of tasks that backfill information after the upgrade. However, some of the queries those tasks issue are extremely inefficient, so the online data migrations take an unacceptably long time. The SQL query in question, which takes a really long time:

> SELECT count(*) AS count_1
> FROM (SELECT instance_extra.created_at AS instance_extra_created_at,
> instance_extra.updated_at AS instance_extra_updated_at,
> instance_extra.deleted_at AS instance_extra_deleted_at,
> instance_extra.deleted AS instance_extra_deleted, instance_extra.id AS
> instance_extra_id, instance_extra.instance_uuid AS
> instance_extra_instance_uuid
> FROM instance_extra
> WHERE instance_extra.keypairs IS NULL AND instance_extra.deleted = 0) AS anon_1
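
To make it easier to see what this query actually counts, here is a minimal sketch that runs a trimmed version of it next to a direct count over the same filter. The table layout is a simplified assumption (only the columns the filter needs), the sample rows are made up, and in-memory SQLite stands in for the real cell database, so this says nothing about timing on a production MySQL or PostgreSQL deployment.

```python
import sqlite3

# Simplified, hypothetical stand-in for the instance_extra table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instance_extra ("
    "id INTEGER PRIMARY KEY, instance_uuid TEXT, "
    "keypairs TEXT, deleted INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO instance_extra (instance_uuid, keypairs, deleted) "
    "VALUES (?, ?, ?)",
    [
        ("uuid-1", None, 0),   # matches: no keypairs, not deleted
        ("uuid-2", "{}", 0),   # keypairs already populated
        ("uuid-3", None, 1),   # soft-deleted row
        ("uuid-4", None, 0),   # matches
    ],
)

# Shape of the reported query: count(*) over a subquery selecting columns.
WRAPPED = (
    "SELECT count(*) FROM ("
    "SELECT id, instance_uuid FROM instance_extra "
    "WHERE keypairs IS NULL AND deleted = 0) AS anon_1"
)
# The same count expressed directly against the table.
DIRECT = (
    "SELECT count(*) FROM instance_extra "
    "WHERE keypairs IS NULL AND deleted = 0"
)

wrapped_count = conn.execute(WRAPPED).fetchone()[0]
direct_count = conn.execute(DIRECT).fetchone()[0]
print(wrapped_count, direct_count)
```

Both statements return the same number; whether one is cheaper than the other depends on the database engine and on which indexes exist for the keypairs and deleted columns.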

It would also be good for us to *not* run a data migration again if we know it has already returned found=0 while online_data_migrations is running in "forever-until-complete" mode. Also, the value of 50 rows per run in that mode is quite small.
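
The "don't run it again once found=0" idea could look roughly like the sketch below. This is not nova's code: run_until_complete and make_fake_migration are hypothetical names, and each migration is assumed to be a callable that takes a batch size and returns a (found, done) tuple.

```python
def run_until_complete(migrations, batch_size=50):
    """Run migrations in batches, dropping each one once it finds nothing.

    `migrations` maps a name to a callable(batch_size) -> (found, done).
    Returns (totals, calls): rows migrated and invocations per migration.
    """
    pending = dict(migrations)
    totals = {name: 0 for name in migrations}
    calls = {name: 0 for name in migrations}
    while pending:
        progressed = False
        for name in list(pending):
            found, done = pending[name](batch_size)
            calls[name] += 1
            totals[name] += done
            if found == 0:
                # Nothing left for this migration: never query it again.
                del pending[name]
            elif done > 0:
                progressed = True
        if not progressed:
            # Remaining migrations found rows but could not migrate any,
            # so stop instead of looping forever.
            break
    return totals, calls


def make_fake_migration(total_rows):
    """Build a fake migration with `total_rows` rows left to process."""
    remaining = [total_rows]

    def migrate(batch_size):
        found = min(batch_size, remaining[0])
        remaining[0] -= found
        return found, found

    return migrate


totals, calls = run_until_complete({
    "already_done": make_fake_migration(0),
    "backfill": make_fake_migration(120),
})
print(totals, calls)
```

In this toy run the migration with nothing to do is queried once and then skipped, while the other is called until its backlog is empty.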

ref: http://lists.openstack.org/pipermail/openstack-discuss/2019-April/004397.html

Tags: db nova-manage
tags: added: db nova-manage
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/649648

Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: New → In Progress
Matt Riedemann (mriedem) wrote :

There are several things in this bug. The patch I've proposed solves the immediate pain, which is the migration that moves keypairs from the nova cell DB to the API DB; we shouldn't need that migration anymore.

As discussed in the ML, we don't have a generic marker to keep track of which migrations don't need to be run anymore (unlike the migrations table in the DB for the sqlalchemy-migrate schema migrations), but some online data migrations do use markers for the longer-running migrations to perform paging.

As for the default batch size of 50, it's hard to say what that should be by default since it totally depends on the cloud. I seem to remember another bug related to that batch size as well (it being either too big or too small). Maybe a better solution would be to just add a configuration option for the default batch size so large public clouds can make it big (1000?) and smaller private clouds can leave it smaller (50 by default). Anyway, that's likely a separate issue/bug.

Changed in nova:
importance: Undecided → Medium
Matt Riedemann (mriedem) wrote :

Bug 1742649 was the one I was thinking about regarding --max-count. It was for the map_instances command, but it's a similar issue.

Matt Riedemann (mriedem) wrote :

For the --max-count batch size issue, I think you might be looking for something like this:

https://review.openstack.org/#/c/378718/

There, archive_deleted_rows takes a batch size and an option to run until complete. The online_data_migrations command doesn't really work that way: it either runs in batches of 50 until complete, or, if you specify your own --max-count, it only processes that many and stops.
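
The two behaviours described above can be sketched as follows. This is only an illustration of the description in this comment, not the nova-manage implementation; run_migrations, Backlog, and the (found, done) return shape are assumptions made for the example.

```python
DEFAULT_BATCH = 50  # the batch size mentioned in this report


def run_migrations(migrate_batch, max_count=None):
    """Return the total number of rows migrated.

    `migrate_batch` is a callable(count) -> (found, done).
    """
    if max_count is not None:
        # Explicit --max-count: process one batch of that size and stop.
        _found, done = migrate_batch(max_count)
        return done
    # No --max-count: keep running batches of 50 until nothing is left.
    total = 0
    while True:
        found, done = migrate_batch(DEFAULT_BATCH)
        total += done
        if found == 0 or done == 0:
            return total


class Backlog:
    """Fake migration holding a fixed number of rows still to migrate."""

    def __init__(self, rows):
        self.rows = rows
        self.calls = 0

    def __call__(self, count):
        self.calls += 1
        found = min(count, self.rows)
        self.rows -= found
        return found, found


until_complete = Backlog(120)
print(run_migrations(until_complete), until_complete.calls)

single_batch = Backlog(120)
print(run_migrations(single_batch, max_count=100), single_batch.rows)
```

With no max_count the fake backlog is drained in 50-row batches; with max_count=100 a single batch runs and the rest is left for a later invocation.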

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/649648
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=cec1808050495aa43a2b67058077063bf3b6f4ed
Submitter: Zuul
Branch: master

commit cec1808050495aa43a2b67058077063bf3b6f4ed
Author: Matt Riedemann <email address hidden>
Date: Wed Apr 3 11:35:37 2019 -0400

    Drop migrate_keypairs_to_api_db data migration

    This was added in Newton:

      I97b72ae3e7e8ea3d6b596870d8da3aaa689fd6b5

    And was meant to migrate keypairs from the cell
    (nova) DB to the API DB. Before that though, the
    keypairs per instance would be migrated to the
    instance_extra table in the cell DB. The migration
    to instance_extra was dropped in Queens with change:

      Ie83e7bd807c2c79e5cbe1337292c2d1989d4ac03

    As the commit message on ^ mentions, the 345 cell
    DB schema migration required that the cell DB keypairs
    table was empty before you could upgrade to Ocata.

    The migrate_keypairs_to_api_db routine only migrates
    any keypairs to the API DB if there are entries in the
    keypairs table in the cell DB, but because of that blocker
    migration in Ocata that cannot be the case anymore, so
    really migrate_keypairs_to_api_db is just wasting time
    querying the database during the online_data_migrations
    routine without it actually migrating anything, so we
    should just remove it.

    Change-Id: Ie56bc411880c6d1c04599cf9521e12e8b4878e1e
    Closes-Bug: #1822613

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 20.0.0.0rc1

This issue was fixed in the openstack/nova 20.0.0.0rc1 release candidate.
