queens-uc-newton-oc undercloud install fails on creating default plan

Bug #1764777 reported by Jiří Stránský on 2018-04-17
This bug affects 1 person
Affects: tripleo
Importance: High
Assigned to: Unassigned

Bug Description

The Newton templates do not include plan-environment.yaml, but the default plan creation workflow on the Queens undercloud expects it.
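A quick way to confirm the mismatch before running the install is to check the templates directory for the file. The following is a minimal sketch, not part of instack-undercloud or tripleo-common: the function name `missing_plan_files` and the `REQUIRED_PLAN_FILES` tuple are hypothetical, and it assumes plan-environment.yaml is the only file worth checking.

```python
import os

# Assumption: only plan-environment.yaml is checked here; the real
# workflow may expect additional files.
REQUIRED_PLAN_FILES = ("plan-environment.yaml",)


def missing_plan_files(templates_dir, required=REQUIRED_PLAN_FILES):
    """Return the names in `required` that are not files in templates_dir."""
    return [
        name for name in required
        if not os.path.isfile(os.path.join(templates_dir, name))
    ]
```

Calling `missing_plan_files()` with the path of the Newton templates checkout should return `["plan-environment.yaml"]` if the file is absent, which matches the 404 seen in the executor log further down.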

Undercloud install fails with:

2018-04-17 14:15:38 | 2018-04-17 14:15:38,890 ERROR: ERROR error creating the default Deployment Plan overcloud Check the create_default_deployment_plan execution in Mistral with openstack workflow execution list Mistral execution ID: 82b17444-aafb-4e09-b4a6-fd5787d1beb4
2018-04-17 14:15:38 | 2018-04-17 14:15:38,891 DEBUG: An exception occurred
2018-04-17 14:15:38 | Traceback (most recent call last):
2018-04-17 14:15:38 | File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 2336, in install
2018-04-17 14:15:38 | _post_config(instack_env, upgrade)
2018-04-17 14:15:38 | File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 2028, in _post_config
2018-04-17 14:15:38 | _post_config_mistral(instack_env, mistral, swift)
2018-04-17 14:15:38 | File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 1964, in _post_config_mistral
2018-04-17 14:15:38 | _create_default_plan(mistral, plans)
2018-04-17 14:15:38 | File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 1906, in _create_default_plan
2018-04-17 14:15:38 | fail_on_error=True)
2018-04-17 14:15:38 | File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 1843, in _wait_for_mistral_execution
2018-04-17 14:15:38 | raise RuntimeError(error_message)
2018-04-17 14:15:38 | RuntimeError: ERROR error creating the default Deployment Plan overcloud Check the create_default_deployment_plan execution in Mistral with openstack workflow execution list Mistral execution ID: 82b17444-aafb-4e09-b4a6-fd5787d1beb4
2018-04-17 14:15:38 | 2018-04-17 14:15:38,891 ERROR:
2018-04-17 14:15:38 | #############################################################################
2018-04-17 14:15:38 | Undercloud install failed.
2018-04-17 14:15:38 |
2018-04-17 14:15:38 | Reason: ERROR error creating the default Deployment Plan overcloud Check the create_default_deployment_plan execution in Mistral with openstack workflow execution list Mistral execution ID: 82b17444-aafb-4e09-b4a6-fd5787d1beb4
2018-04-17 14:15:38 |
2018-04-17 14:15:38 | See the previous output for details about what went wrong. The full install
2018-04-17 14:15:38 | log can be found at /home/stack/.instack/install-undercloud.log.
2018-04-17 14:15:38 |
2018-04-17 14:15:38 | #############################################################################

And in mistral executor.log:

2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters [req-e7909a32-15d2-42ba-8c04-a170ef2f9ef0 9920b73f4a9c4aff8eb661da2e5818a9 23fb21a23c594c1fb4737a86fe46f6e9 - default default] Error retrieving environment for plan overcloud: Object GET failed: https://192.168.24.2:13808/v1/AUTH_23fb21a23c594c1fb4737a86fe46f6e9/overcloud/plan-environment.yaml 404 Not Found [first 60 chars of response] <html><h1>Not Found</h1><p>The resource could not be found.<: ClientException: Object GET failed: https://192.168.24.2:13808/v1/AUTH_23fb21a23c594c1fb4737a86fe46f6e9/overcloud/plan-environment.yaml 404 Not Found [first 60 chars of response] <html><h1>Not Found</h1><p>The resource could not be found.<
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters Traceback (most recent call last):
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters File "/usr/lib/python2.7/site-packages/tripleo_common/actions/parameters.py", line 249, in run
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters env = plan_utils.get_env(swift, self.container)
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters File "/usr/lib/python2.7/site-packages/tripleo_common/utils/plan.py", line 41, in get_env
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters swift.get_object(name, constants.PLAN_ENVIRONMENT)[1]
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1799, in get_object
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters headers=headers)
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1691, in _retry
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters service_token=self.service_token, **kwargs)
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1179, in get_object
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters raise ClientException.from_response(resp, 'Object GET failed', body)
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters ClientException: Object GET failed: https://192.168.24.2:13808/v1/AUTH_23fb21a23c594c1fb4737a86fe46f6e9/overcloud/plan-environment.yaml 404 Not Found [first 60 chars of response] <html><h1>Not Found</h1><p>The resource could not be found.<
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters

Changed in tripleo:
milestone: rocky-1 → rocky-2
Changed in tripleo:
assignee: nobody → Quique Llorente (quiquell)
Matt Young (halcyondude) on 2018-04-30
tags: removed: quickstart
Changed in tripleo:
assignee: Quique Llorente (quiquell) → nobody
Changed in tripleo:
milestone: rocky-2 → rocky-3
Changed in tripleo:
milestone: rocky-3 → rocky-rc1
Changed in tripleo:
milestone: rocky-rc1 → stein-1
Martin Kopec (mkopec) on 2018-10-10
tags: added: alert promotion-blocker
Dougal Matthews (d0ugal) wrote :

In the most recent report, I found this error in the Mistral logs...

http://logs.openstack.org/24/608324/1/gate/tripleo-ci-centos-7-undercloud-oooq/0843431/logs/undercloud/var/log/mistral/executor.log.txt.gz#_2018-10-10_08_03_55_473

2018-10-10 08:03:55.473 26208 ERROR tripleo_common.actions.templates [req-bed99655-45c0-4455-9628-00e97accb2d7 7fff713b28d647d4bb0564dae6a00d32 c5a85f06ef4f47468d7054f618c0febd - default default] Error storing file network/service_net_map.yaml in container overcloud: ClientException: put_object(u'overcloud', u'network/service_net_map.yaml', ...) failure and no ability to reset contents for reupload.
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates [req-bed99655-45c0-4455-9628-00e97accb2d7 7fff713b28d647d4bb0564dae6a00d32 c5a85f06ef4f47468d7054f618c0febd - default default] Error occurred while processing custom roles.: Exception: Error storing file network/service_net_map.yaml in container overcloud
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates Traceback (most recent call last):
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates File "/usr/lib/python2.7/site-packages/tripleo_common/actions/templates.py", line 368, in run
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates self._process_custom_roles(context)
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates File "/usr/lib/python2.7/site-packages/tripleo_common/actions/templates.py", line 346, in _process_custom_roles
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates context=context)
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates File "/usr/lib/python2.7/site-packages/tripleo_common/actions/templates.py", line 157, in _j2_render_and_put
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates raise Exception(error_msg)
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates Exception: Error storing file network/service_net_map.yaml in container overcloud
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates

It looks like the upload to swift failed, but this code isn't written to retry on failure. Looking at the swift logs it happened due to a timeout.

http://logs.openstack.org/24/608324/1/gate/tripleo-ci-centos-7-undercloud-oooq/0843431/logs/undercloud/var/log/swift/swift.log.txt.gz#_Oct_10_08_03_52

Oct 10 08:03:52 centos-7-inap-mtl01-0002810891 proxy-server: ERROR with Object server 192.168.24.1:6000/1 re: Trying to get final status of PUT to /v1/AUTH_c5a85f06ef4f47468d7054f618c0febd/overcloud/network/service_net_map.yaml: Timeout (60.0s) (txn: tx4931d10773dc4044b0baf-005bbdb22c)
Oct 10 08:03:52 centos-7-inap-mtl01-0002810891 proxy-server: Object PUT returning 503 for [503] (txn: tx4931d10773dc4044b0baf-005bbdb22c) (client_ip: 192.168.24.1)
Oct 10 08:03:52 centos-7-inap-mtl01-0002810891 proxy-server: 192.168.24.1 192.168.24.1 10/Oct/2018/08/03/52 PUT /v1/AUTH_c5a85f06ef4f47468d7054f618c0febd/overcloud/network/service_net_map.yaml HTTP/1.0 503 - python-swiftclient-3.5.0 gAAAAABbvbHJ-yJb... 6147 118 - tx4931d10773dc4044b0baf-005bbdb22c - 60.0126 - - 1539158572.454325914...
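For illustration, the missing retry mentioned above could look roughly like the sketch below. This is not the tripleo-common code and not a proposed patch: `call_with_retry` and its parameters are hypothetical names, and the exponential backoff is one possible policy among several.

```python
import time


def call_with_retry(operation, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts and
    re-raises the last exception if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            # Out of attempts: let the caller see the original error.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

The upload would be passed in as a zero-argument callable, for example a lambda wrapping the Swift PUT. Two caveats apply: a real wrapper should catch only the client's timeout or 5xx error rather than every `Exception`, and each attempt has to send the full content again, which is what the "no ability to reset contents for reupload" message in the executor log is complaining about.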

Martin Kopec (mkopec) wrote :

Moved the failure from comment #2 to https://bugs.launchpad.net/tripleo/+bug/1797167, as it might be a different issue. Removing the alert and promotion-blocker tags.

tags: removed: alert promotion-blocker
Changed in tripleo:
milestone: stein-1 → stein-2
Changed in tripleo:
milestone: stein-2 → stein-3

Is this still an issue?

Jiří Stránský (jistr) wrote :

I think it probably is, but since nobody cared for so long, let's close as wontfix and reopen if we see fit.

Changed in tripleo:
status: Triaged → Won't Fix