tripleo-quickstart undercloud deployment fails due to neutron db sync

Bug #1712901 reported by Luke Short
This bug affects 1 person
Affects   Status    Importance  Assigned to  Milestone
tripleo   Invalid   Medium      Unassigned

Bug Description

Description
===========
Using tripleo-quickstart, the Undercloud fails to deploy because the Neutron MySQL tables are not created properly.

Steps to reproduce
==================
# git clone https://github.com/openstack/tripleo-quickstart.git
# cd tripleo-quickstart
# bash quickstart.sh -v --clean --teardown all --release stable/newton 127.0.0.2

Expected result
===============
An all-in-one deployment of OpenStack should be created.

Actual result
=============
The Playbook fails at the "Install the undercloud" step.

The exact problem is shown in undercloud_install.log: neutron-db-sync cannot properly create the Neutron tables, which causes the rest of the installation to fail. I also tried deploying Ocata and hit a similar problem creating the Neutron tables.

2017-08-24 17:41:07 | 2017-08-24 17:41:07 - Notice: /Stage[main]/Neutron::Db::Sync/Exec[neutron-db-sync]/returns: oslo_db.exception.DBError: (pymysql.err.InternalError) (1054, u"Unknown column 'networks.shared' in 'field list'") [SQL: u'SELECT networks.id AS networks_id, networks.tenant_id AS networks_tenant_id, networks.shared AS networks_shared \nFROM networks \nWHERE networks.shared = 1']
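
For triage, the database errors can be pulled out of a long install log automatically. The following is a minimal sketch, not part of tripleo-quickstart: the function names `find_db_errors` and `scan_log` are hypothetical, and the pattern only assumes the `(pymysql.err.<Class>) (<code>, u"<message>")` fragment visible in the log line above. Messages that Python happens to print in single quotes would not match.

```python
import re

# Matches the '(pymysql.err.<Class>) (<code>, u"<message>")' fragment that
# appears inside the DBError text in the log line above.
DB_ERROR = re.compile(
    r'\(pymysql\.err\.(?P<cls>\w+)\)\s+\((?P<code>\d+),\s*u?"(?P<msg>[^"]*)"\)'
)


def find_db_errors(lines):
    """Yield (line_number, error_class, mysql_code, message) per matching line."""
    for number, line in enumerate(lines, start=1):
        match = DB_ERROR.search(line)
        if match:
            yield number, match["cls"], int(match["code"]), match["msg"]


def scan_log(path="undercloud_install.log"):
    """Return all database errors found in the log file at `path` (assumed name)."""
    with open(path, errors="replace") as handle:
        return list(find_db_errors(handle))
```

Calling `scan_log()` in the directory that holds the log returns one tuple per failing line, which makes it easier to see whether the first failure was a missing column (MySQL error 1054) or an already existing table (1050).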

Environment
===========
Computer:
x1 KVM virtual machine with nested virtualization enabled

OS:
CentOS 7.3

Specs:
20 CPU cores
30GB RAM
200GB space

Logs & Configs
==============

The Undercloud installation log is attached for reference.

Tags: quickstart
Luke Short (ekultails) wrote :
tags: added: quickstart
Changed in tripleo:
status: New → Triaged
milestone: none → queens-1
importance: Undecided → Medium
Luke Short (ekultails) wrote :

When I try the same process with the trunk/ocata release, it also fails with an error related to the Neutron database.

# bash quickstart.sh -v --clean --teardown all --release trunk/ocata 127.0.0.2

Here's the main error from this run:

2017-08-27 19:26:01 | 2017-08-27 19:26:01,197 INFO: Notice: /Stage[main]/Neutron::Db::Sync/Exec[neutron-db-sync]/returns: oslo_db.exception.DBError: (pymysql.err.InternalError) (1050, u"Table 'ml2_geneve_allocations' already exists") [SQL: u'\nCREATE TABLE ml2_geneve_allocations (\n\tgeneve_vni INTEGER NOT NULL, \n\tallocated BOOL NOT NULL DEFAULT false, \n\tPRIMARY KEY (geneve_vni), \n\tCHECK (allocated IN (0, 1))\n)ENGINE=InnoDB\n\n']

Luke Short (ekultails) wrote :

For the trunk/ocata deploy, here are all of the databases that were created, the tables created for the "neutron" database, and the structure of the neutron.ml2_geneve_allocations table: http://paste.openstack.org/show/619572/

Luke Short (ekultails) wrote :

Okay, I went back digging through the stable/newton undercloud_install.log that is attached to this report and I noticed a few things that I've pasted here: http://paste.openstack.org/show/619959/

(1) The Nova database fails to be fully created due to a timeout.

(2) The Neutron database also fails to be fully created due to a timeout. It looks like Puppet attempts to create the database again but then it fails because most of those tables already exist.

Because of (2), I am thinking that perhaps neutron-db-manage should be modified to run "CREATE TABLE IF NOT EXISTS" instead of a plain "CREATE TABLE" for tables. That's outside of the scope of tripleo, though. As for tripleo-quickstart, is there any way I can manually increase these timeouts? It seems that if the commands were allowed to run longer, they might complete. I am running it inside of a virtual machine, so there is likely a slight hit to performance.
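
The difference between the two statement forms can be shown with a small sketch. It makes several assumptions: it uses Python's built-in SQLite driver rather than the MySQL/MariaDB server from this report, the table definition is a simplified copy of the one in the Ocata error, and `run_twice` is a hypothetical helper. It only illustrates why a retried plain CREATE TABLE fails. Because neutron-db-manage applies schema migrations rather than single statements, this change alone would probably not repair a migration that was interrupted part-way, such as the missing `networks.shared` column in the Newton log.

```python
import sqlite3

# Simplified copy of the table from the Ocata error message.
CREATE = (
    "CREATE TABLE ml2_geneve_allocations ("
    "geneve_vni INTEGER NOT NULL PRIMARY KEY, "
    "allocated BOOLEAN NOT NULL DEFAULT 0)"
)
# Same statement, made safe to repeat.
CREATE_IF_MISSING = CREATE.replace("CREATE TABLE", "CREATE TABLE IF NOT EXISTS", 1)


def run_twice(statement):
    """Run a DDL statement twice on a fresh in-memory database.

    Returns None if both runs succeed, otherwise the database error message.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(statement)  # first sync attempt
        conn.execute(statement)  # retry after the assumed timeout
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        conn.close()


if __name__ == "__main__":
    print("plain CREATE TABLE:", run_twice(CREATE))
    print("IF NOT EXISTS:", run_twice(CREATE_IF_MISSING))
```

The plain form reports that the table already exists on the second run, which mirrors the 1050 error above, while the IF NOT EXISTS form completes both runs.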

Changed in tripleo:
milestone: queens-1 → queens-2
Changed in tripleo:
milestone: queens-2 → queens-3
Luke Short (ekultails) wrote :

The server hardware I have been testing on is 8 years old. The overhead of trying to use nested virtualization made this impossible to deploy. Even when deploying to bare-metal the installation was too slow to complete successfully.

This bug can be closed as invalid.

Changed in tripleo:
status: Triaged → Invalid