tripleo-ci-centos-7-scenario002-multinode-oooq-container failing with timeout in deploy overcloud

Bug #1741445 reported by Arx Cruz
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Arx Cruz

Bug Description

Please note that this is not the job timeout; it is the overcloud deploy timeout, which is set to 69 minutes according to [1].

I noticed some details:

1 - This is in the stable/pike branch
2 - Some warnings in the overcloud deploy command output:
2018-01-04 23:32:43 | The disable_upgrade_deployment flag is not set in the roles file. This flag is expected when you have a nova-compute or swift-storage role. Please check the contents of the roles file: [{'networks': ['External', 'InternalApi', 'Storage', 'StorageMgmt', 'Tenant'], 'CountDefault': 1, 'name': 'Controller', 'tags': ['primary', 'controller']}]
2018-01-04 23:32:52 | Waiting for messages on queue '3c627656-2e8b-4ad7-b9e1-995bd731f598' with no timeout.
2018-01-04 23:33:13 | Configuration has 8 errors, fix them before proceeding. Ignoring these errors is likely to lead to a failed deploy.
2018-01-04 23:35:39 | Started Mistral Workflow tripleo.validations.v1.check_pre_deployment_validations. Execution ID: d080e219-57a0-4d01-93eb-34da35a2ce71
2018-01-04 23:35:39 | {u'kernel_id': None, u'ramdisk_id': None, u'errors': [u"No image with the name 'bm-deploy-kernel' found - make sure you have uploaded boot images.", u"No image with the name 'bm-deploy-ramdisk' found - make sure you have uploaded boot images."], u'warnings': []}
2018-01-04 23:35:39 | {u'errors': [u'Error: There are no nodes in an available or active state and with maintenance mode off.'], u'warnings': []}
2018-01-04 23:35:39 | {u'errors': [u'Not enough baremetal nodes - available: 0, requested: 1'], u'result': {u'enough_nodes': False, u'statistics': {u'count': 0, u'vcpus_used': 0, u'local_gb_used': 0, u'manager': {u'api': {u'server_groups': None, u'keypairs': None, u'servers': None, u'server_external_events': None, u'server_migrations': None, u'agents': None, u'instance_action': None, u'glance': None, u'hypervisor_stats': None, u'virtual_interfaces': None, u'flavors': None, u'availability_zones': None, u'user_id': None, u'cloudpipe': None, u'os_cache': False, u'quotas': None, u'migrations': None, u'usage': None, u'logger': None, u'project_id': None, u'neutron': None, u'quota_classes': None, u'project_name': None, u'aggregates': None, u'flavor_access': None, u'services': None, u'list_extensions': None, u'limits': None, u'hypervisors': None, u'cells': None, u'versions': None, u'client': None, u'hosts': None, u'volumes': None, u'assisted_volume_snapshots': None, u'certs': None}}, u'x_openstack_request_ids': [u'req-6a673d6b-0f4a-4c6f-812e-1085d586156c'], u'memory_mb': 0, u'current_workload': 0, u'vcpus': 0, u'running_vms': 0, u'free_disk_gb': 0, u'disk_available_least': 0, u'_info': {u'count': 0, u'vcpus_used': 0, u'local_gb_used': 0, u'memory_mb': 0, u'current_workload': 0, u'vcpus': 0, u'running_vms': 0, u'free_disk_gb': 0, u'disk_available_least': 0, u'local_gb': 0, u'free_ram_mb': 0, u'memory_mb_used': 0}, u'local_gb': 0, u'free_ram_mb': 0, u'memory_mb_used': 0, u'_loaded': True}, u'requested_count': 1, u'available_count': 0}, u'warnings': []}

3 - Errors:
2018-01-04 23:35:39 | ERRORS
2018-01-04 23:35:39 | [u"No image with the name 'bm-deploy-kernel' found - make sure you have uploaded boot images.", u"No image with the name 'bm-deploy-ramdisk' found - make sure you have uploaded boot images.", u'Error: There are no nodes in an available or active state and with maintenance mode off.', u'Not enough baremetal nodes - available: 0, requested: 1', u"No image with the name 'bm-deploy-kernel' found - make sure you have uploaded boot images.", u"No image with the name 'bm-deploy-ramdisk' found - make sure you have uploaded boot images.", u'Error: There are no nodes in an available or active state and with maintenance mode off.', u'Not enough baremetal nodes - available: 0, requested: 1']
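The ERRORS list above contains each message twice, so the 8 reported errors reduce to 4 distinct ones. A minimal sketch of how that line could be de-duplicated when reading the log, assuming the text after the ` | ` separator is a valid Python list literal (the helper name `unique_errors` is hypothetical, not part of any TripleO tool):

```python
import ast

def unique_errors(log_line):
    """Return the distinct error messages from one 'timestamp | [list]' log line."""
    # Everything after the first ' | ' is treated as a Python list literal.
    _, _, payload = log_line.partition(" | ")
    messages = ast.literal_eval(payload)
    distinct = []
    for message in messages:
        # Preserve first-seen order while dropping repeats.
        if message not in distinct:
            distinct.append(message)
    return distinct

# The four messages from the ERRORS line, exactly as logged.
HALF = (
    '''u"No image with the name 'bm-deploy-kernel' found - make sure you have uploaded boot images.", '''
    '''u"No image with the name 'bm-deploy-ramdisk' found - make sure you have uploaded boot images.", '''
    "u'Error: There are no nodes in an available or active state and with maintenance mode off.', "
    "u'Not enough baremetal nodes - available: 0, requested: 1'"
)
# The log repeats the same four messages back to back.
ERRORS_LINE = "2018-01-04 23:35:39 | [" + HALF + ", " + HALF + "]"

for message in unique_errors(ERRORS_LINE):
    print(message)
```

Parsing with `ast.literal_eval` avoids executing anything from the log; it only accepts literals, including the `u'...'` strings printed here.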

After this, the deployment continues until it hits this error:

2018-01-05 00:44:35 | 2018-01-05 00:44:29Z [overcloud.AllNodesDeploySteps]: CREATE_FAILED CREATE aborted
2018-01-05 00:44:35 | 2018-01-05 00:44:29Z [overcloud]: CREATE_FAILED Create timed out
2018-01-05 00:44:35 |
2018-01-05 00:44:35 | Stack overcloud CREATE_FAILED
2018-01-05 00:44:35 |
2018-01-05 00:44:35 | overcloud.AllNodesDeploySteps.ControllerDeployment_Step3.0:
2018-01-05 00:44:35 | resource_type: OS::Heat::StructuredDeployment
2018-01-05 00:44:35 | physical_resource_id: 39df029a-f457-4661-8fa4-ef3702e5ccff
2018-01-05 00:44:35 | status: CREATE_FAILED
2018-01-05 00:44:35 | status_reason: |
2018-01-05 00:44:35 | CREATE aborted
2018-01-05 00:44:35 | deploy_stdout: |
2018-01-05 00:44:35 | None
2018-01-05 00:44:35 | deploy_stderr: |
2018-01-05 00:44:35 | None
2018-01-05 00:44:35 | + status_code=1
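To put the timestamps in perspective, the wall time between the first deploy warning quoted above (23:32:43) and the "Create timed out" event (00:44:29Z) can be computed as follows. This is only a sketch; it assumes both timestamps are in the same time zone, and the excerpt does not show when stack creation itself started, so the result is only an upper bound on how long the stack ran.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def minutes_between(start, end):
    """Return the elapsed minutes between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60

# First warning in the deploy output vs. the Heat "Create timed out" event.
elapsed = minutes_between("2018-01-04 23:32:43", "2018-01-05 00:44:29")
print("%.1f minutes" % elapsed)  # prints "71.8 minutes"
```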

IMHO, if the output already says that these errors must be fixed or the deployment is likely to fail, the deployment should not continue.
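The behaviour suggested here could look roughly like the sketch below. It is not the actual tripleoclient code; the function and exception names are hypothetical, and the input is assumed to be a list of dicts shaped like the validation results printed in the log (each with `errors` and `warnings` lists).

```python
class PreDeploymentValidationError(Exception):
    """Raised when pre-deployment validation reports at least one error."""

def ensure_no_validation_errors(results):
    """Stop before stack creation if any validation result contains errors."""
    errors = []
    for result in results:
        errors.extend(result.get("errors", []))
    if errors:
        raise PreDeploymentValidationError(
            "Configuration has %d errors, aborting before deploy:\n%s"
            % (len(errors), "\n".join(errors))
        )

# Two results modelled on the log output: one with an error, one clean.
results = [
    {"errors": ["Not enough baremetal nodes - available: 0, requested: 1"],
     "warnings": []},
    {"errors": [], "warnings": []},
]

try:
    ensure_no_validation_errors(results)
except PreDeploymentValidationError as exc:
    print(exc)
```

With a check like this, the run would end within minutes of the validation step instead of waiting for the stack create timeout.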

[1] http://logs.openstack.org/78/528078/7/gate/tripleo-ci-centos-7-scenario002-multinode-oooq-container/4bf4351/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz#_2018-01-04_23_32_38

Tags: ci
Arx Cruz (arxcruz) wrote :
Changed in tripleo:
importance: Undecided → High
status: New → Triaged
milestone: none → rocky-3
assignee: nobody → Arx Cruz (arxcruz)
Alex Schultz (alex-schultz) wrote :
Changed in tripleo:
milestone: rocky-3 → queens-3
Arx Cruz (arxcruz)
Changed in tripleo:
status: Triaged → Fix Released