Newton - Ocata Undercloud Upgrade fails

Bug #1699515 reported by Simon Wright
This bug affects 3 people
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Brad P. Crochet

Bug Description

Following the TripleO instructions at http://tripleo.org/installation/installation.html#installing-the-undercloud
I successfully deployed the undercloud and then the overcloud on a fresh install of CentOS 7.3 using the latest RDO Newton Delorean repos:
https://trunk.rdoproject.org/centos7-newton/current/delorean.repo

When I try to upgrade the undercloud to the Ocata release with the following repo and commands:
https://trunk.rdoproject.org/centos7-ocata/current/delorean.repo

sudo rm /etc/yum.repos.d/delo*
sudo curl -L -o /etc/yum.repos.d/delorean-ocata.repo https://trunk.rdoproject.org/centos7-ocata/current/delorean.repo
sudo curl -L -o /etc/yum.repos.d/delorean-deps-ocata.repo https://trunk.rdoproject.org/centos7-ocata/delorean-deps.repo
sudo yum -y install --enablerepo=extras centos-release-ceph-jewel
sudo sed -i -e 's%gpgcheck=.*%gpgcheck=0%' /etc/yum.repos.d/CentOS-Ceph-Jewel.repo
sudo yum clean all
sudo systemctl stop openstack-*
sudo systemctl stop neutron-*
sudo systemctl stop openvswitch
sudo systemctl stop httpd
sudo yum -y update python-tripleoclient
openstack undercloud upgrade

I get the following error:

2017-06-21 13:16:48,817 DEBUG: POST call to compute for http://10.13.13.1:8774/v2.1/flavors used request id req-484aa2da-f620-4bd8-846b-56626a3ecc39
2017-06-21 13:16:48,817 INFO: Not creating flavor "swift-storage" because it already exists.
2017-06-21 13:16:48,819 DEBUG: found extension EntryPoint.parse('keystone = mistralclient.auth.keystone:KeystoneAuthHandler')
2017-06-21 13:16:48,819 DEBUG: found extension EntryPoint.parse('keycloak-oidc = mistralclient.auth.keycloak:KeycloakAuthHandler')
2017-06-21 13:16:48,820 DEBUG: Making authentication request to http://10.13.13.1:5000/v2.0/tokens
2017-06-21 13:16:48,822 INFO: Starting new HTTP connection (1): 10.13.13.1
2017-06-21 13:16:49,059 DEBUG: "POST /v2.0/tokens HTTP/1.1" 200 1259
2017-06-21 13:16:49,062 INFO: Starting new HTTP connection (1): 10.13.13.1
2017-06-21 13:16:49,301 DEBUG: "GET /v2/environments/tripleo.undercloud-config HTTP/1.1" 200 286
2017-06-21 13:16:49,304 DEBUG: HTTP GET http://10.13.13.1:8989/v2/environments/tripleo.undercloud-config 200
2017-06-21 13:16:49,306 INFO: Starting new HTTP connection (1): 10.13.13.1
2017-06-21 13:16:49,322 DEBUG: "GET /v2/environments HTTP/1.1" 200 7148
2017-06-21 13:16:49,323 DEBUG: HTTP GET http://10.13.13.1:8989/v2/environments 200
2017-06-21 13:16:49,324 INFO: Not creating default plan "overcloud" because it already exists.
2017-06-21 13:16:49,325 INFO: Starting new HTTP connection (1): 10.13.13.1
2017-06-21 13:16:49,911 DEBUG: "POST /v2/executions HTTP/1.1" 400 2037
2017-06-21 13:16:49,912 DEBUG: HTTP POST http://10.13.13.1:8989/v2/executions 400
2017-06-21 13:16:49,913 ERROR:
#############################################################################
Undercloud upgrade failed.

Reason: Failed when querying database, error type: DBError, error message: (pymysql.err.InternalError) (1054, u"Unknown column 'task_executions_v2.type' in 'field list'") [SQL: u'SELECT task_executions_v2.scope AS task_executions_v2_scope, task_executions_v2.project_id AS task_executions_v2_project_id, task_executions_v2.created_at AS task_executions_v2_created_at, task_executions_v2.updated_at AS task_executions_v2_updated_at, task_executions_v2.id AS task_executions_v2_id, task_executions_v2.name AS task_executions_v2_name, task_executions_v2.description AS task_executions_v2_description, task_executions_v2.workflow_name AS task_executions_v2_workflow_name, task_executions_v2.workflow_id AS task_executions_v2_workflow_id, task_executions_v2.spec AS task_executions_v2_spec, task_executions_v2.state AS task_executions_v2_state, task_executions_v2.state_info AS task_executions_v2_state_info, task_executions_v2.tags AS task_executions_v2_tags, task_executions_v2.runtime_context AS task_executions_v2_runtime_context, task_executions_v2.action_spec AS task_executions_v2_action_spec, task_executions_v2.unique_key AS task_executions_v2_unique_key, task_executions_v2.type AS task_executions_v2_type, task_executions_v2.processed AS task_executions_v2_processed, task_executions_v2.in_context AS task_executions_v2_in_context, task_executions_v2.published AS task_executions_v2_published, task_executions_v2.workflow_execution_id AS task_executions_v2_workflow_execution_id \nFROM task_executions_v2 \nWHERE (task_executions_v2.project_id = %(project_id_1)s OR task_executions_v2.scope = %(scope_1)s) AND task_executions_v2.workflow_execution_id = %(workflow_execution_id_1)s AND task_executions_v2.state = %(state_1)s ORDER BY task_executions_v2.created_at ASC, task_executions_v2.id ASC'] [parameters: {u'state_1': 'IDLE', u'project_id_1': u'14cdab35471444148ed4bc826ce33fbd', u'workflow_execution_id_1': '728ba3f5-8f2e-4528-ad40-9a367670f797', u'scope_1': 'public'}]

The tail of /var/log/mistral/engine.log is at http://paste.openstack.org/show/613290/

This error is reproducible. The same upgrade completed successfully on 19-Jun-17 at 10:25 +01:00. Since then I have rebuilt the Newton undercloud and overcloud and retried the upgrade 4 times, and received the same error each time.

Expected result - successfully upgraded undercloud

Actual result - undercloud upgrade failed

Tags: upgrade
Simon Wright (simon-ocf)
description: updated
Changed in tripleo:
status: New → Triaged
importance: Undecided → High
milestone: none → pike-3
tags: added: upgrade
Changed in tripleo:
importance: High → Critical
Brad P. Crochet (brad-9)
Changed in tripleo:
assignee: nobody → Brad P. Crochet (brad-9)
Brad P. Crochet (brad-9) wrote :

For info:

Fresh Newton install:
[centos@undercloud-newton versions]$ sudo mistral-db-manage --config-file /etc/mistral/mistral.conf history
019 -> 020 (head), Increase environments_v2 column size from JsonDictType to JsonLongDictType
018 -> 019, Change scheduler schema.
017 -> 018, increate_task_execution_unique_key_size
016 -> 017, Add named lock table
015 -> 016, Increase size of task_executions_v2.unique_key
014 -> 015, add_unique_keys_for_non_locking_model
013 -> 014, fix_past_scripts_discrepancies
012 -> 013, split_execution_table_increase_names
011 -> 012, add event triggers table
010 -> 011, add workflow id for execution
009 -> 010, add_resource_members_v2_table
008 -> 009, Add database indices
007 -> 008, Increase size of state_info column from String to Text
006 -> 007, Move system flag to base definition
005 -> 006, add a Boolean column 'processed' to the table delayed_calls_v2
004 -> 005, Increase executions_v2 column size from JsonDictType to JsonLongDictType
003 -> 004, add description for execution
002 -> 003, cron_trigger_constraints
001 -> 002, Kilo
<base> -> 001, Kilo release

After upgrade:
[stack@undercloud versions]$ sudo mistral-db-manage --config-file /etc/mistral/mistral.conf history
020 -> 021 (head), Increase environments_v2 column size from JsonDictType to JsonLongDictType
019 -> 020, add type to task execution
018 -> 019, Change scheduler schema.
017 -> 018, increate_task_execution_unique_key_size
016 -> 017, Add named lock table
015 -> 016, Increase size of task_executions_v2.unique_key
014 -> 015, add_unique_keys_for_non_locking_model
013 -> 014, fix_past_scripts_discrepancies
012 -> 013, split_execution_table_increase_names
011 -> 012, add event triggers table
010 -> 011, add workflow id for execution
009 -> 010, add_resource_members_v2_table
008 -> 009, Add database indices
007 -> 008, Increase size of state_info column from String to Text
006 -> 007, Move system flag to base definition
005 -> 006, add a Boolean column 'processed' to the table delayed_calls_v2
004 -> 005, Increase executions_v2 column size from JsonDictType to JsonLongDictType
003 -> 004, add description for execution
002 -> 003, cron_trigger_constraints
001 -> 002, Kilo
<base> -> 001, Kilo release
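
The two listings differ only near the head, which is easy to miss by eye. As a rough sketch (not part of mistral-db-manage), the following Python compares two saved `history` outputs and reports any revision number whose description changed. The function names `parse_history` and `renumbered` are hypothetical, and the sample strings are excerpts of the output above.

```python
import re

# Matches lines such as "019 -> 020 (head), Some description".
HISTORY_LINE = re.compile(r"^(\S+) -> (\d+)(?: \(head\))?, (.*)$")


def parse_history(text):
    """Map each target revision number to its description."""
    revisions = {}
    for line in text.splitlines():
        match = HISTORY_LINE.match(line.strip())
        if match:
            revisions[match.group(2)] = match.group(3)
    return revisions


def renumbered(before, after):
    """Return revisions present in both histories whose description changed."""
    return {
        rev: (before[rev], after[rev])
        for rev in sorted(before)
        if rev in after and before[rev] != after[rev]
    }


# Excerpts of the Newton and post-upgrade (Ocata) histories shown above.
newton = """
019 -> 020 (head), Increase environments_v2 column size from JsonDictType to JsonLongDictType
018 -> 019, Change scheduler schema.
"""
ocata = """
020 -> 021 (head), Increase environments_v2 column size from JsonDictType to JsonLongDictType
019 -> 020, add type to task execution
018 -> 019, Change scheduler schema.
"""

conflicts = renumbered(parse_history(newton), parse_history(ocata))
for rev, (old, new) in conflicts.items():
    print(f"{rev}: {old!r} became {new!r}")
```

Any revision reported here is one that a database already stamped at that number would treat as applied, even though its content has changed.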

Brad P. Crochet (brad-9) wrote :

So this is due to a bad cherry-pick. DB migrations can't be cherry-picked out of order, or they will break upgrades. In this case, the DB is already at version 20 in Newton ('Increase environments_v2 column size from JsonDictType to JsonLongDictType'). Ocata then comes along and uses 'add type to task execution' as migration 20, and makes the 'Increase environments_v2' migration number 21. So the upgrade only applies the column-size migration again as 21, without ever applying the 'add type to task execution' migration. That is why the task_executions_v2.type column is missing when Mistral queries it.
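
To make the failure mode concrete, here is a toy model of a number-keyed migration chain. It is only an illustration under simplifying assumptions: it is not Mistral's or Alembic's actual code, and the names `NEWTON`, `OCATA`, and `upgrade` are made up for this sketch. The migration descriptions are shortened from the history output above.

```python
# Toy model: each release ships a chain of migrations keyed by revision number.
NEWTON = {
    19: "change scheduler schema",
    20: "increase environments_v2 column size",
}
OCATA = {
    19: "change scheduler schema",
    20: "add type to task execution",            # new migration inserted at 20
    21: "increase environments_v2 column size",  # Newton's 20, renumbered
}


def upgrade(db_version, applied, chain):
    """Apply every migration in `chain` numbered above `db_version`."""
    for rev in sorted(chain):
        if rev > db_version:
            applied.append(chain[rev])
            db_version = rev
    return db_version, applied


# A fresh Newton install ends up at revision 20.
db_version, applied = upgrade(0, [], NEWTON)

# Upgrading with the Ocata chain only looks at revisions above 20.
db_version, applied = upgrade(db_version, applied, OCATA)

missing = set(OCATA.values()) - set(applied)
print("db version:", db_version)
print("never applied:", missing)
```

In this model the database reports revision 21 (head) after the upgrade, yet 'add type to task execution' was never run, which matches the symptom in the report.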

Brad P. Crochet (brad-9) wrote :
Changed in tripleo:
status: Triaged → Fix Committed
Changed in tripleo:
status: Fix Committed → Fix Released