tripleo

Failure when deploying the overcloud on a predeployed server

Bug #1742237 reported by emanoel on 2018-01-09

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
tripleo	Incomplete	Medium	Unassigned	victoria-3

Bug Description

Description
===========
When following the steps documented at http://tripleo.org/install/advanced_deployment/deployed_server.html, the overcloud deployment gets stuck and eventually times out during stack creation.

Steps to reproduce
==================
* Deployed the undercloud using the steps here https://docs.openstack.org/tripleo-docs/latest/install/installation/installation.html#installing-the-undercloud
* The repo used for the undercloud was current-tripleo-dev
* Followed the steps here http://tripleo.org/install/advanced_deployment/deployed_server.html to deploy the overcloud. All the documented steps passed except for the final overcloud deploy command (see logs and config used below)
* The VM used for the undercloud ran CentOS 7, and all the needed packages were installed as documented above.

Expected result
===============
Overcloud deployment completes successfully and the overcloud services are present in the controller node as containers.

Actual result
=============
Overcloud deployment is stuck at this step `[overcloud.AllNodesDeploySteps.ControllerDeployedServerDeployment_Step1.0]: CREATE_IN_PROGRESS state change`

Logs & Configs
==============

Overcloud deploy command and output: http://paste.openstack.org/show/641415/
os-collect-config log: http://paste.openstack.org/show/641442/
deployed-server-ctrlr-data.yaml: http://paste.openstack.org/show/641433/
deployed-server-ips.yaml: http://paste.openstack.org/show/641438/
deployment-swift-data-map.yaml: http://paste.openstack.org/show/641439/

Emilien Macchi (emilienm) on 2018-01-09

Changed in tripleo:
milestone:	none → rocky-1
importance:	Undecided → Medium
status:	New → Triaged

Bogdan Dobrelya (bogdando) wrote on 2018-01-10:

#1

It seems that
OS::TripleO::DeployedServer::ControlPlanePort: ../deployed-server/deployed-neutron-port.yaml

needs to have an absolute path instead?
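To illustrate the suggestion, here is a minimal sketch that rewrites relative template paths in a `resource_registry` mapping as absolute paths, resolved against the directory of the environment file. The function name `absolutize_registry` and the example paths under `/home/stack/templates` are hypothetical and do not come from this report.

```python
import os.path


def absolutize_registry(env_file, registry):
    """Return a copy of `registry` with relative template paths made absolute.

    Paths are resolved against the directory containing `env_file`.
    Values that look like resource types (contain "::") or URLs
    (contain "://") are left untouched.
    """
    base = os.path.dirname(os.path.abspath(env_file))
    resolved = {}
    for name, value in registry.items():
        if "::" in value or "://" in value or os.path.isabs(value):
            resolved[name] = value
        else:
            resolved[name] = os.path.normpath(os.path.join(base, value))
    return resolved


# Hypothetical environment file location, for illustration only.
example = absolutize_registry(
    "/home/stack/templates/environments/deployed-server-environment.yaml",
    {
        "OS::TripleO::DeployedServer::ControlPlanePort":
            "../deployed-server/deployed-neutron-port.yaml",
    },
)
print(example)
```

Printing the absolute path makes it easy to check that the referenced template actually exists on the undercloud before retrying the deployment.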

emanoel (emanoelxavier) wrote on 2018-01-10:

#2

Could be. After changing that, deleting the failed overcloud stack, and retrying the deployment, I got a different error. The overcloud deploy command, now run with the --debug option, produced this output: http://paste.openstack.org/show/642463/

Bogdan Dobrelya (bogdando) wrote on 2018-01-11:

#3

The failed software deployment's resources.NetworkDeployment should be analyzed then,
perhaps with commands like:

openstack software deployment show <id>
openstack stack resource list --nested-depth 5 <stack>
openstack --os-cloud rdo-cloud stack resource show <stack> <resource>

though, I'm not too good with navigating nested heat entities :/
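One way to narrow down the nested resources is to have the resource list emitted as JSON and keep only the failed entries. The sketch below assumes the `-f json` output formatter is available and that rows carry `resource_name`, `physical_resource_id`, and `resource_status` keys. Those key names are an assumption and may differ between client versions.

```python
import json
import subprocess


def list_resources(stack, depth=5):
    """Run the CLI and return the parsed rows.

    Needs undercloud credentials in the environment; not called here.
    """
    cmd = ["openstack", "stack", "resource", "list",
           "--nested-depth", str(depth), "-f", "json", stack]
    return json.loads(subprocess.check_output(cmd))


def failed_resources(rows):
    """Return (name, physical id) pairs for rows whose status ends in FAILED."""
    return [
        (row.get("resource_name"), row.get("physical_resource_id"))
        for row in rows
        if str(row.get("resource_status", "")).endswith("FAILED")
    ]


# Made-up sample rows that mimic the assumed JSON layout.
sample = [
    {"resource_name": "NetworkDeployment",
     "physical_resource_id": "hypothetical-id-1",
     "resource_status": "CREATE_FAILED"},
    {"resource_name": "ControlPlanePort",
     "physical_resource_id": "hypothetical-id-2",
     "resource_status": "CREATE_COMPLETE"},
]
print(failed_resources(sample))
```

For a software deployment resource, the physical ID returned here is probably the value to pass to `openstack software deployment show`, though that is worth verifying against the actual output.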

emanoel (emanoelxavier) wrote on 2018-01-16:

#4

I followed steps similar to the above. It looks like the issue may be related to discovering or pinging the $METADATA_IP (see http://paste.openstack.org/show/645737/)?

Bogdan Dobrelya (bogdando) wrote on 2018-01-18:

#5

It seems that you should double-check the EC2MetadataIp value that Heat takes from quickstart inventory vars, see network_environment_args and use_resource_registry_nic_configs that enables the former. If you configure networking via custom tht templates, look there instead.

emanoel (emanoelxavier) wrote on 2018-01-18:

#6

The value I currently have for EC2MetadataIp is the undercloud API IP. From the overcloud VM:
curl 192.168.24.1:8000
{"versions": [{"status": "CURRENT", "id": "v1.0", "links": [{"href": "http://192.168.24.1:8000/v1/", "rel":
curl 192.168.24.1:8080
<html><h1>Not Found</h1><p>The resource could not be found.</p></html>

There is a route to that IP from the overcloud VM, and the behavior above is the expected one according to http://tripleo.org/install/advanced_deployment/deployed_server.html#testing-connectivity. The configuration I am currently using is based on http://tripleo.org/install/advanced_deployment/deployed_server.html#testing-connectivity. Should the value of the EC2MetadataIp be something else?
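The connectivity check described above can also be scripted. This is a minimal sketch that only tests whether a TCP connection can be opened; it does not validate the HTTP response. The helper name `port_open` and the 3-second timeout are arbitrary choices, not taken from the TripleO documentation.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Run from the overcloud VM, `port_open("192.168.24.1", 8000)` and `port_open("192.168.24.1", 8080)` would mirror the two curl calls above. A True result only shows that the port accepts connections, so it complements the curl test rather than replacing it.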

Sanjay Upadhyay (saneax) wrote on 2018-01-24:

#7

EC2MetadataIp should be your undercloud IP, so the variable may well be set correctly. However, the failure is at the NetworkConfig stage, so the error could be caused by another network configuration parameter. We might need to check whether the ping packets are being received on the undercloud node. Most likely this is a network setup issue.

emanoel (emanoelxavier) wrote on 2018-01-25:

#8

Is the ping happening from inside of one of the containers deployed in the overcloud guest VM? I did the steps here http://tripleo.org/install/advanced_deployment/deployed_server.html#testing-connectivity (ping, curl the undercloud IP 192.168.24.1) and everything worked as mentioned above. See traces and ip config here http://paste.openstack.org/show/653451/

Alex Schultz (alex-schultz) on 2018-04-20

Changed in tripleo:
milestone:	rocky-1 → rocky-2

Emilien Macchi (emilienm) on 2018-06-05

Changed in tripleo:
milestone:	rocky-2 → rocky-3

Emilien Macchi (emilienm) on 2018-07-26

Changed in tripleo:
milestone:	rocky-3 → rocky-rc1

Emilien Macchi (emilienm) on 2018-07-26

Changed in tripleo:
milestone:	rocky-rc1 → stein-1

Juan Antonio Osorio Robles (juan-osorio-robles) on 2018-10-30

Changed in tripleo:
milestone:	stein-1 → stein-2

Emilien Macchi (emilienm) on 2019-01-13

Changed in tripleo:
milestone:	stein-2 → stein-3

Alex Schultz (alex-schultz) on 2019-03-13

Changed in tripleo:
milestone:	stein-3 → train-1

Alex Schultz (alex-schultz) on 2019-06-07

Changed in tripleo:
milestone:	train-1 → train-2

Alex Schultz (alex-schultz) on 2019-07-29

Changed in tripleo:
milestone:	train-2 → train-3

Alex Schultz (alex-schultz) on 2019-09-12

Changed in tripleo:
milestone:	train-3 → ussuri-1

Emilien Macchi (emilienm) on 2019-12-19

Changed in tripleo:
milestone:	ussuri-1 → ussuri-2

wes hayutin (weshayutin) on 2020-02-10

Changed in tripleo:
milestone:	ussuri-2 → ussuri-3

wes hayutin (weshayutin) on 2020-04-13

Changed in tripleo:
milestone:	ussuri-3 → ussuri-rc3

wes hayutin (weshayutin) on 2020-05-11

Changed in tripleo:
status:	Triaged → Incomplete

wes hayutin (weshayutin) on 2020-05-26

Changed in tripleo:
milestone:	ussuri-rc3 → victoria-1

Emilien Macchi (emilienm) on 2020-07-28

Changed in tripleo:
milestone:	victoria-1 → victoria-3
