ncc upgrade to N or O release fail with db migration error

Bug #1711209 reported by Danny Hammo
This bug affects 1 person
Affects: OpenStack Nova Cloud Controller Charm
Status: Fix Released
Importance: Critical
Assigned to: Felipe Reyes

Bug Description

While testing an OpenStack upgrade from the M release to N and then O, the ncc gets stuck at "db migration" after the upgrade and eventually (after about 20 minutes) errors out with:

nova-cloud-controller/1* error idle 1/lxd/9 10.230.40.26 8774/tcp hook failed: "shared-db-relation-changed" for mysql:shared-db

I also noticed that the non-leader ncc unit fails to start haproxy and memcached, which blocks the service:
nova-cloud-controller/0 blocked idle 0/lxd/9 10.230.40.39 8774/tcp Services not running that should be: haproxy, memcached

I have hit this issue on every attempt. All charms were upgraded to the 17.02 release prior to the OpenStack upgrade.

The juju unit log shows the following traceback:

2017-08-11 00:25:41 INFO juju-log shared-db:14: Migrating the nova database.
2017-08-11 00:25:45 DEBUG shared-db-relation-changed Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".
2017-08-11 00:25:45 DEBUG shared-db-relation-changed Option "verbose" from group "DEFAULT" is deprecated for removal. Its value may be silently ignored in the future.
2017-08-11 00:25:46 DEBUG shared-db-relation-changed Traceback (most recent call last):
2017-08-11 00:25:46 DEBUG shared-db-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-1/charm/hooks/shared-db-relation-changed", line 1190, in <module>
2017-08-11 00:25:46 DEBUG shared-db-relation-changed main()
2017-08-11 00:25:46 DEBUG shared-db-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-1/charm/hooks/shared-db-relation-changed", line 1184, in main
2017-08-11 00:25:46 DEBUG shared-db-relation-changed hooks.execute(sys.argv)
2017-08-11 00:25:46 DEBUG shared-db-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-1/charm/hooks/charmhelpers/core/hookenv.py", line 731, in execute
2017-08-11 00:25:46 DEBUG shared-db-relation-changed self._hooks[hook_name]()
2017-08-11 00:25:46 DEBUG shared-db-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-1/charm/hooks/nova_cc_utils.py", line 1102, in wrapped_f
2017-08-11 00:25:46 DEBUG shared-db-relation-changed f(*args)
2017-08-11 00:25:46 DEBUG shared-db-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-1/charm/hooks/charmhelpers/contrib/openstack/utils.py", line 1864, in wrapped_f
2017-08-11 00:25:46 DEBUG shared-db-relation-changed restart_functions)
2017-08-11 00:25:46 DEBUG shared-db-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-1/charm/hooks/charmhelpers/core/host.py", line 655, in restart_on_change_helper
2017-08-11 00:25:46 DEBUG shared-db-relation-changed r = lambda_f()
2017-08-11 00:25:46 DEBUG shared-db-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-1/charm/hooks/charmhelpers/contrib/openstack/utils.py", line 1863, in <lambda>
2017-08-11 00:25:46 DEBUG shared-db-relation-changed (lambda: f(*args, **kwargs)), restart_map, stopstart,
2017-08-11 00:25:46 DEBUG shared-db-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-1/charm/hooks/shared-db-relation-changed", line 475, in db_changed
2017-08-11 00:25:46 DEBUG shared-db-relation-changed leader_init_db_if_ready()
2017-08-11 00:25:46 DEBUG shared-db-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-1/charm/hooks/shared-db-relation-changed", line 189, in leader_init_db_if_ready
2017-08-11 00:25:46 DEBUG shared-db-relation-changed migrate_nova_databases()
2017-08-11 00:25:46 DEBUG shared-db-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-1/charm/hooks/charmhelpers/core/decorators.py", line 40, in _retry_on_exception_inner_2
2017-08-11 00:25:46 DEBUG shared-db-relation-changed return f(*args, **kwargs)
2017-08-11 00:25:46 DEBUG shared-db-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-1/charm/hooks/nova_cc_utils.py", line 790, in migrate_nova_databases
2017-08-11 00:25:46 DEBUG shared-db-relation-changed migrate_nova_database()
2017-08-11 00:25:46 DEBUG shared-db-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-1/charm/hooks/nova_cc_utils.py", line 700, in migrate_nova_database
2017-08-11 00:25:46 DEBUG shared-db-relation-changed subprocess.check_output(cmd)
2017-08-11 00:25:46 DEBUG shared-db-relation-changed File "/usr/lib/python2.7/subprocess.py", line 574, in check_output
2017-08-11 00:25:46 DEBUG shared-db-relation-changed raise CalledProcessError(retcode, cmd, output=output)
2017-08-11 00:25:46 DEBUG shared-db-relation-changed subprocess.CalledProcessError: Command '['nova-manage', 'db', 'sync']' returned non-zero exit status 1
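
The traceback only records that 'nova-manage db sync' exited with status 1; the command's own error output is not part of it. Below is a minimal sketch of one way to capture that output when re-running the command by hand on the unit. The helper name run_and_report is hypothetical and is not part of the charm; it only wraps the standard subprocess module.

```python
import subprocess


def run_and_report(cmd):
    """Run cmd and return (exit_status, combined stdout/stderr text).

    Hypothetical helper for manual debugging, not charm code.
    """
    try:
        # Merge stderr into stdout so the real failure message is kept.
        out = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
        return 0, out.decode("utf-8", "replace")
    except subprocess.CalledProcessError as exc:
        # check_output attaches whatever the command printed to exc.output.
        return exc.returncode, exc.output.decode("utf-8", "replace")
```

For example, run_and_report(["nova-manage", "db", "sync"]) on the affected unit would return the exit status together with whatever nova-manage printed before failing.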

Charm release version:
Stdout: |
    commit-sha-1: 8c0168c9c37b97f7e85684b740c9014a12a5e3a0
    commit-short: 8c0168c
    branch: HEAD
    remote: https://github.com/openstack/charm-nova-cloud-controller
    info-generated: Thu Jul 6 19:28:20 UTC 2017
    note: This file should exist only in a built or released charm artifact (not in the charm source code tree).
  UnitId: nova-cloud-controller/1

sosreport from the affected unit is attached

Tags: sts
Danny Hammo (dan-hammo) wrote :
Danny Hammo (dan-hammo)
description: updated
Danny Hammo (dan-hammo) wrote :

In a recent test, nova-scheduler.log contained the following error:

2017-08-23 17:07:22.616 1151495 ERROR nova
2017-08-23 17:07:27.726 1151526 WARNING oslo_reports.guru_meditation_report [-] Guru meditation now registers SIGUSR1 and SIGUSR2 by default for backward compatibility. SIGUSR1 will no longer be registered in a future release, so please use SIGUSR2 to generate reports.
2017-08-23 17:07:27.746 1151526 WARNING oslo_config.cfg [-] Option "scheduler_default_filters" from group "DEFAULT" is deprecated. Use option "enabled_filters" from group "filter_scheduler".
2017-08-23 17:07:28.070 1151526 CRITICAL nova [req-cf2e56c6-f4d2-49b5-9b77-2fbd62ff7a45 - - - - -] ProgrammingError: (pymysql.err.ProgrammingError) (1146, u"Table 'nova_api.aggregates' doesn't exist") [SQL: u'SELECT aggregates.created_at AS aggregates_created_at, aggregates.updated_at AS aggregates_updated_at, aggregates.id AS aggregates_id, aggregates.uuid AS aggregates_uuid, aggregates.name AS aggregates_name, aggregate_hosts_1.created_at AS aggregate_hosts_1_created_at, aggregate_hosts_1.updated_at AS aggregate_hosts_1_updated_at, aggregate_hosts_1.id AS aggregate_hosts_1_id, aggregate_hosts_1.host AS aggregate_hosts_1_host, aggregate_hosts_1.aggregate_id AS aggregate_hosts_1_aggregate_id, aggregate_metadata_1.created_at AS aggregate_metadata_1_created_at, aggregate_metadata_1.updated_at AS aggregate_metadata_1_updated_at, aggregate_metadata_1.id AS aggregate_metadata_1_id, aggregate_metadata_1.`key` AS aggregate_metadata_1_key, aggregate_metadata_1.value AS aggregate_metadata_1_value, aggregate_metadata_1.aggregate_id AS aggregate_metadata_1_aggregate_id \nFROM aggregates LEFT OUTER JOIN aggregate_hosts AS aggregate_hosts_1 ON aggregates.id = aggregate_hosts_1.aggregate_id LEFT OUTER JOIN aggregate_metadata AS aggregate_metadata_1 ON aggregates.id = aggregate_metadata_1.aggregate_id']
2017-08-23 17:07:28.070 1151526 ERROR nova Traceback (most recent call last):
2017-08-23 17:07:28.070 1151526 ERROR nova File "/usr/bin/nova-scheduler", line 10, in <module>
2017-08-23 17:07:28.070 1151526 ERROR nova sys.exit(main())
2017-08-23 17:07:28.070 1151526 ERROR nova File "/usr/lib/python2.7/dist-packages/nova/cmd/scheduler.py", line 44, in main
2017-08-23 17:07:28.070 1151526 ERROR nova topic=CONF.scheduler_topic)
2017-08-23 17:07:28.070 1151526 ERROR nova File "/usr/lib/python2.7/dist-packages/nova/service.py", line 241, in create
2017-08-23 17:07:28.070 1151526 ERROR nova periodic_interval_max=periodic_interval_max)
2017-08-23 17:07:28.070 1151526 ERROR nova File "/usr/lib/python2.7/dist-packages/nova/service.py", line 117, in __init__
2017-08-23 17:07:28.070 1151526 ERROR nova self.manager = manager_class(host=self.host, *args, **kwargs)
2017-08-23 17:07:28.070 1151526 ERROR nova File "/usr/lib/python2.7/dist-packages/nova/scheduler/manager.py", line 57, in __init__
2017-08-23 17:07:28.070 1151526 ERROR nova invoke_on_load=True).driver
2017-08-23 17:07:28.070 1151526 ERROR nova File "/usr/lib/python2.7/dist-packages/stevedore/driver.py", line 61, in __init__
2017-08-23 17:07:28.070 1151526 ERROR nova warn_on_missing_entrypoint=warn_on_missing_entrypoi...
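
The key detail in the log above is MySQL error 1146 for table 'nova_api.aggregates'. When scanning a long log for this kind of failure, a small filter can list the tables reported as missing. This is only a sketch: the function name missing_tables and the regular expression are assumptions based on the single message format shown here, and other message formats would not match.

```python
import re

# Matches the pymysql 1146 message as it appears in the log above, e.g.
# (1146, u"Table 'nova_api.aggregates' doesn't exist")
MISSING_TABLE = re.compile(r"\(1146, u?\"Table '([\w.]+)' doesn't exist\"")


def missing_tables(lines):
    """Return the distinct table names reported missing, in first-seen order."""
    found = []
    for line in lines:
        for name in MISSING_TABLE.findall(line):
            if name not in found:
                found.append(name)
    return found
```

It could be fed the open log file directly, for instance missing_tables(open(path_to_log)), where path_to_log is whichever nova log is being inspected.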

Felipe Reyes (freyes) wrote : Re: [Bug 1711209] Re: ncc upgrade to N or O release fail with db migration error

I can reproduce the "nova-manage db sync" error using the latest stable charms:

0) juju-deployer bundle -> http://paste.ubuntu.com/25382733/
1) juju deployer -c bundle.yaml -d -v -s -e my-controller:my-model xenial-mitaka
2) bzr branch lp:openstack-charm-testing
3) source ~/novarc
4) cd openstack-charm-testing
5) ./configure SOME-PROFILE
6) git clone https://github.com/openstack-charmers/openstack-charms-tools
7) cd openstack-charms-tools ; ./os-upgrade.py -o cloud:xenial-newton
8) let hooks finish
9) ./os-upgrade.py -o cloud:xenial-ocata
  - Once this step finishes the nova-cloud-controller is in error state because "nova-manage db online_data_migrations" was not run.

I'll try the master branch to check whether the problem is still there.

References: https://docs.openstack.org/nova/latest/user/upgrade.html

Felipe Reyes (freyes)
Changed in charm-nova-cloud-controller:
assignee: nobody → Felipe Reyes (freyes)
Changed in charm-nova-cloud-controller:
importance: Undecided → High
status: New → Triaged
milestone: none → 17.08
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-nova-cloud-controller (master)

Fix proposed to branch: master
Review: https://review.openstack.org/498019

Changed in charm-nova-cloud-controller:
status: Triaged → In Progress
Changed in charm-nova-cloud-controller:
importance: High → Critical
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-nova-cloud-controller (master)

Reviewed: https://review.openstack.org/498019
Committed: https://git.openstack.org/cgit/openstack/charm-nova-cloud-controller/commit/?id=eacd234ed5b544e7a83487312807c63aa1c7bbb5
Submitter: Jenkins
Branch: master

commit eacd234ed5b544e7a83487312807c63aa1c7bbb5
Author: Felipe Reyes <email address hidden>
Date: Fri Aug 25 15:11:39 2017 -0300

    Run nova-manage db online_data_migrations on upgrades

    In mitaka an extra operation called online_data_migrations was added to
    assist rolling upgrades.

    This patch runs this before and after, this is to make sure the old
    upgrades where this migration was not being performed are run before
    installing a newer release.

    References:
    - https://git.openstack.org/cgit/openstack/nova/commit/?id=7d5069ce20ec0d792a3974b2c53d01ae005cde98
    - https://docs.openstack.org/nova/latest/user/upgrade.html

    Change-Id: Ic4bc2c5ae86c99c3af2a1d0ee08badb09835d932
    Closes-Bug: 1711209
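
The ordering described in the commit message can be sketched roughly as follows. This is an illustration only, not the charm's actual code: the function upgrade_with_migrations and the install_new_release callback are hypothetical, and the exact point at which the charm runs each command is an assumption. Only the two nova-manage commands themselves come from this report.

```python
import subprocess

ONLINE_MIGRATIONS = ["nova-manage", "db", "online_data_migrations"]
DB_SYNC = ["nova-manage", "db", "sync"]


def upgrade_with_migrations(install_new_release, run=subprocess.check_call):
    """Run online data migrations before and after an upgrade.

    install_new_release is a hypothetical callback that installs the newer
    packages; run executes one command and raises on failure.
    """
    # Before: finish migrations that older upgrades never performed.
    run(ONLINE_MIGRATIONS)
    install_new_release()
    # After: apply the new schema, then migrate data for the new release.
    run(DB_SYNC)
    run(ONLINE_MIGRATIONS)
```

Passing run as a parameter keeps the ordering testable without a real nova installation; by default each command is executed with subprocess.check_call, so a failing step stops the sequence.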

Changed in charm-nova-cloud-controller:
status: In Progress → Fix Committed
James Page (james-page)
Changed in charm-nova-cloud-controller:
status: Fix Committed → Fix Released