queens-uc-newton-oc undercloud install fails on creating default plan

Bug #1764777 reported by Jiří Stránský
This bug affects 1 person
Affects   Status      Importance   Assigned to   Milestone
tripleo   Won't Fix   High         Unassigned

Bug Description

We're missing plan-environment.yaml in newton templates but the default plan creation workflow expects it.

Undercloud install fails with:

2018-04-17 14:15:38 | 2018-04-17 14:15:38,890 ERROR: ERROR error creating the default Deployment Plan overcloud Check the create_default_deployment_plan execution in Mistral with openstack workflow execution list Mistral execution ID: 82b17444-aafb-4e09-b4a6-fd5787d1beb4
2018-04-17 14:15:38 | 2018-04-17 14:15:38,891 DEBUG: An exception occurred
2018-04-17 14:15:38 | Traceback (most recent call last):
2018-04-17 14:15:38 | File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 2336, in install
2018-04-17 14:15:38 | _post_config(instack_env, upgrade)
2018-04-17 14:15:38 | File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 2028, in _post_config
2018-04-17 14:15:38 | _post_config_mistral(instack_env, mistral, swift)
2018-04-17 14:15:38 | File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 1964, in _post_config_mistral
2018-04-17 14:15:38 | _create_default_plan(mistral, plans)
2018-04-17 14:15:38 | File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 1906, in _create_default_plan
2018-04-17 14:15:38 | fail_on_error=True)
2018-04-17 14:15:38 | File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 1843, in _wait_for_mistral_execution
2018-04-17 14:15:38 | raise RuntimeError(error_message)
2018-04-17 14:15:38 | RuntimeError: ERROR error creating the default Deployment Plan overcloud Check the create_default_deployment_plan execution in Mistral with openstack workflow execution list Mistral execution ID: 82b17444-aafb-4e09-b4a6-fd5787d1beb4
2018-04-17 14:15:38 | 2018-04-17 14:15:38,891 ERROR:
2018-04-17 14:15:38 | #############################################################################
2018-04-17 14:15:38 | Undercloud install failed.
2018-04-17 14:15:38 |
2018-04-17 14:15:38 | Reason: ERROR error creating the default Deployment Plan overcloud Check the create_default_deployment_plan execution in Mistral with openstack workflow execution list Mistral execution ID: 82b17444-aafb-4e09-b4a6-fd5787d1beb4
2018-04-17 14:15:38 |
2018-04-17 14:15:38 | See the previous output for details about what went wrong. The full install
2018-04-17 14:15:38 | log can be found at /home/stack/.instack/install-undercloud.log.
2018-04-17 14:15:38 |
2018-04-17 14:15:38 | #############################################################################

And in mistral executor.log:

2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters [req-e7909a32-15d2-42ba-8c04-a170ef2f9ef0 9920b73f4a9c4aff8eb661da2e5818a9 23fb21a23c594c1fb4737a86fe46f6e9 - default default] Error retrieving environment for plan overcloud: Object GET failed: https://192.168.24.2:13808/v1/AUTH_23fb21a23c594c1fb4737a86fe46f6e9/overcloud/plan-environment.yaml 404 Not Found [first 60 chars of response] <html><h1>Not Found</h1><p>The resource could not be found.<: ClientException: Object GET failed: https://192.168.24.2:13808/v1/AUTH_23fb21a23c594c1fb4737a86fe46f6e9/overcloud/plan-environment.yaml 404 Not Found [first 60 chars of response] <html><h1>Not Found</h1><p>The resource could not be found.<
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters Traceback (most recent call last):
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters File "/usr/lib/python2.7/site-packages/tripleo_common/actions/parameters.py", line 249, in run
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters env = plan_utils.get_env(swift, self.container)
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters File "/usr/lib/python2.7/site-packages/tripleo_common/utils/plan.py", line 41, in get_env
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters swift.get_object(name, constants.PLAN_ENVIRONMENT)[1]
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1799, in get_object
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters headers=headers)
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1691, in _retry
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters service_token=self.service_token, **kwargs)
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1179, in get_object
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters raise ClientException.from_response(resp, 'Object GET failed', body)
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters ClientException: Object GET failed: https://192.168.24.2:13808/v1/AUTH_23fb21a23c594c1fb4737a86fe46f6e9/overcloud/plan-environment.yaml 404 Not Found [first 60 chars of response] <html><h1>Not Found</h1><p>The resource could not be found.<
2018-04-17 14:15:32.373 19850 ERROR tripleo_common.actions.parameters
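
For illustration, the missing file could be detected before the plan is created. This is only a minimal sketch: `missing_plan_files` is a hypothetical helper, not part of tripleo-common or instack-undercloud, and the file name is simply taken from the 404 URL in the executor log above.

```python
import os
import tempfile

# Object name taken from the 404 URL in the executor log above.
PLAN_ENVIRONMENT = "plan-environment.yaml"


def missing_plan_files(templates_dir, required=(PLAN_ENVIRONMENT,)):
    """Return the required file names that are absent from templates_dir.

    Hypothetical helper, not part of tripleo-common or instack-undercloud.
    """
    return [name for name in required
            if not os.path.isfile(os.path.join(templates_dir, name))]


if __name__ == "__main__":
    # Simulate a newton-style template tree without plan-environment.yaml.
    with tempfile.TemporaryDirectory() as newton_like:
        print(missing_plan_files(newton_like))
```

A check like this would only surface the problem earlier with a clearer message; it does not supply the missing file for newton templates.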

Tags: upgrade
Changed in tripleo:
milestone: rocky-1 → rocky-2
Changed in tripleo:
assignee: nobody → Quique Llorente (quiquell)
Matt Young (halcyondude)
tags: removed: quickstart
Changed in tripleo:
assignee: Quique Llorente (quiquell) → nobody
Changed in tripleo:
milestone: rocky-2 → rocky-3
Changed in tripleo:
milestone: rocky-3 → rocky-rc1
Changed in tripleo:
milestone: rocky-rc1 → stein-1
Martin Kopec (mkopec) wrote :
Martin Kopec (mkopec)
tags: added: alert promotion-blocker
Dougal Matthews (d0ugal) wrote :

In the most recent report, I found this error in the Mistral logs...

http://logs.openstack.org/24/608324/1/gate/tripleo-ci-centos-7-undercloud-oooq/0843431/logs/undercloud/var/log/mistral/executor.log.txt.gz#_2018-10-10_08_03_55_473

2018-10-10 08:03:55.473 26208 ERROR tripleo_common.actions.templates [req-bed99655-45c0-4455-9628-00e97accb2d7 7fff713b28d647d4bb0564dae6a00d32 c5a85f06ef4f47468d7054f618c0febd - default default] Error storing file network/service_net_map.yaml in container overcloud: ClientException: put_object(u'overcloud', u'network/service_net_map.yaml', ...) failure and no ability to reset contents for reupload.
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates [req-bed99655-45c0-4455-9628-00e97accb2d7 7fff713b28d647d4bb0564dae6a00d32 c5a85f06ef4f47468d7054f618c0febd - default default] Error occurred while processing custom roles.: Exception: Error storing file network/service_net_map.yaml in container overcloud
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates Traceback (most recent call last):
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates File "/usr/lib/python2.7/site-packages/tripleo_common/actions/templates.py", line 368, in run
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates self._process_custom_roles(context)
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates File "/usr/lib/python2.7/site-packages/tripleo_common/actions/templates.py", line 346, in _process_custom_roles
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates context=context)
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates File "/usr/lib/python2.7/site-packages/tripleo_common/actions/templates.py", line 157, in _j2_render_and_put
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates raise Exception(error_msg)
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates Exception: Error storing file network/service_net_map.yaml in container overcloud
2018-10-10 08:03:55.474 26208 ERROR tripleo_common.actions.templates

It looks like the upload to swift failed, but this code isn't written to retry on failure. Looking at the swift logs it happened due to a timeout.

http://logs.openstack.org/24/608324/1/gate/tripleo-ci-centos-7-undercloud-oooq/0843431/logs/undercloud/var/log/swift/swift.log.txt.gz#_Oct_10_08_03_52

Oct 10 08:03:52 centos-7-inap-mtl01-0002810891 proxy-server: ERROR with Object server 192.168.24.1:6000/1 re: Trying to get final status of PUT to /v1/AUTH_c5a85f06ef4f47468d7054f618c0febd/overcloud/network/service_net_map.yaml: Timeout (60.0s) (txn: tx4931d10773dc4044b0baf-005bbdb22c)
Oct 10 08:03:52 centos-7-inap-mtl01-0002810891 proxy-server: Object PUT returning 503 for [503] (txn: tx4931d10773dc4044b0baf-005bbdb22c) (client_ip: 192.168.24.1)
Oct 10 08:03:52 centos-7-inap-mtl01-0002810891 proxy-server: 192.168.24.1 192.168.24.1 10/Oct/2018/08/03/52 PUT /v1/AUTH_c5a85f06ef4f47468d7054f618c0febd/overcloud/network/service_net_map.yaml HTTP/1.0 503 - python-swiftclient-3.5.0 gAAAAABbvbHJ-yJb... 6147 118 - tx4931d10773dc4044b0baf-005bbdb22c - 60.0126 - - 1539158572.454325914...
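
As a rough sketch of what retrying the upload could look like (this is not the tripleo-common code; `call_with_retry` and all of its parameters are hypothetical), a generic wrapper with exponential backoff might be:

```python
import time


def call_with_retry(operation, attempts=3, delay=1.0, backoff=2.0,
                    retry_on=(Exception,), sleep=time.sleep):
    """Call operation(); on a retry_on exception, wait and try again.

    Hypothetical helper, not tripleo-common code. Re-raises the last
    exception once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise
            # Wait before the next attempt, growing the delay each time.
            sleep(delay)
            delay *= backoff
```

Under the assumption that the rendered template is held as a string or bytes (so every attempt sends the same data), the upload could be wrapped as `call_with_retry(lambda: swift.put_object(container, name, contents), retry_on=(ClientException,))`. Whether retrying would help with a 60-second object server timeout is a separate question.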


Martin Kopec (mkopec) wrote :

Moved the failure from comment #2 to https://bugs.launchpad.net/tripleo/+bug/1797167, as it might be a different issue. Removing the alert and promotion-blocker tags.

tags: removed: alert promotion-blocker
Adriano Petrich (apetrich) wrote :
Changed in tripleo:
milestone: stein-1 → stein-2
Changed in tripleo:
milestone: stein-2 → stein-3
Juan Antonio Osorio Robles (juan-osorio-robles) wrote :

Is this still an issue?

Jiří Stránský (jistr) wrote :

I think it probably is, but since nobody cared for so long, let's close as wontfix and reopen if we see fit.

Changed in tripleo:
status: Triaged → Won't Fix