[stable/queens] The tripleo-ci-centos-7-undercloud-upgrades job is failing on queens with No such file or directory: '/etc/heat/policy.json'

Bug #1796639 reported by Marios Andreou
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Sorin Sbarnea

Bug Description

The tripleo-ci-centos-7-undercloud-upgrades job is failing on queens with "No such file or directory: '/etc/heat/policy.json'". An example is at [1], and the trace looks like:

  2018-10-08 01:12:42 | Traceback (most recent call last):
  2018-10-08 01:12:42 | File "<string>", line 1, in <module>
  2018-10-08 01:12:42 | File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 2412, in pre_upgrade
  2018-10-08 01:12:42 | for stack in heat.stacks.list():
  2018-10-08 01:12:42 | File "/usr/lib/python2.7/site-packages/heatclient/v1/stacks.py", line 136, in paginate
  2018-10-08 01:12:42 | stacks = self._list(url, 'stacks')
  2018-10-08 01:12:42 | File "/usr/lib/python2.7/site-packages/heatclient/common/base.py", line 114, in _list
  2018-10-08 01:12:42 | body = self.client.get(url).json()
  2018-10-08 01:12:42 | File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 304, in get
  2018-10-08 01:12:42 | return self.request(url, 'GET', **kwargs)
  2018-10-08 01:12:42 | File "/usr/lib/python2.7/site-packages/heatclient/common/http.py", line 317, in request
  2018-10-08 01:12:42 | raise exc.from_response(resp)
  2018-10-08 01:12:42 | heatclient.exc.HTTPInternalServerError: ERROR: [Errno 2] No such file or directory: '/etc/heat/policy.json'
  2018-10-08 01:12:42 | Command 'instack-pre-upgrade-undercloud' returned non-zero exit status 1

[1]
http://logs.openstack.org/24/567224/117/check/tripleo-ci-centos-7-undercloud-upgrades/e4ad94e/logs/undercloud/home/zuul/undercloud_upgrade.log.txt.gz#_2018-10-08_01_12_42
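For triaging similar job logs, a small sketch like the following could pull the missing-policy-file error out of a timestamp-prefixed log excerpt. The function name, the regular expression, and the assumed "<date> <time> | <message>" line shape are illustrative assumptions based only on the excerpt above; this is not part of any TripleO tooling.

```python
import re

# Assumed line shape, taken from the excerpt above: "<date> <time> | <message>".
LINE_RE = re.compile(r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \| (.*)$")


def find_missing_policy_errors(log_text):
    """Return (timestamp, message) pairs for lines reporting the missing heat policy file."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not carry the timestamp prefix
        timestamp, message = match.groups()
        if "No such file or directory" in message and "/etc/heat/policy.json" in message:
            hits.append((timestamp, message))
    return hits


# Hypothetical sample built from three lines of the trace above.
sample = """\
2018-10-08 01:12:42 | raise exc.from_response(resp)
2018-10-08 01:12:42 | heatclient.exc.HTTPInternalServerError: ERROR: [Errno 2] No such file or directory: '/etc/heat/policy.json'
2018-10-08 01:12:42 | Command 'instack-pre-upgrade-undercloud' returned non-zero exit status 1
"""

for timestamp, message in find_missing_policy_errors(sample):
    print(timestamp, message)
```

The timestamp returned for each hit can then be matched against the heat API log for the same run.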

Tags: alert ci
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to instack-undercloud (stable/queens)

Related fix proposed to branch: stable/queens
Review: https://review.openstack.org/608639

Marios Andreou (marios-b) wrote :

A check [0] for a deployed overcloud heat stack was added to the instack-undercloud pre_upgrade function by https://review.openstack.org/#/c/570367/1. However, this is a problem for the undercloud jobs, where there is no deployed overcloud.

We don't hit this on master because we no longer run the instack-undercloud code path there. Tags were updated late last week, so I suspect that's why we are now seeing it on queens (pike doesn't have this check; see [2]).

[0] https://github.com/openstack/instack-undercloud/blob/c220b84ef799960b3e4801940f5b5c77522840fa/instack_undercloud/undercloud.py#L2344-L2358
[1] https://github.com/openstack/python-tripleoclient/blob/71f51369a06ab0bf1ebc704ee3217b88bd069c01/tripleoclient/v1/undercloud.py#L159-L170
[2] https://github.com/openstack/instack-undercloud/blob/22320f51a4308398b0c4ab5bc04386d95c42424c/instack_undercloud/undercloud.py#L1937

Rabi Mishra (rabi) wrote :

As mentioned in the review, this is not related to that patch, as it is failing to list the stacks in the undercloud heat.

However, the heat API logs are quite confusing.

It fails around 2018-10-08 01:12:42, and the corresponding heat API log is:

http://logs.openstack.org/24/567224/117/check/tripleo-ci-centos-7-undercloud-upgrades/e4ad94e/logs/undercloud/var/log/heat/heat_api.log.txt.gz#_2018-10-08_01_12_42_343

However, heat-api then seems to be restarted, and it works fine at a later time:

http://logs.openstack.org/24/567224/117/check/tripleo-ci-centos-7-undercloud-upgrades/e4ad94e/logs/undercloud/var/log/heat/heat_api.log.txt.gz#_2018-10-08_01_15_04_869

Marios Andreou (marios-b) wrote :

Adding a note for now, but something else must have fixed this: the latest queens gate check https://review.openstack.org/#/c/567224 is green at [1], but the previous run was not [2].

[1] http://logs.openstack.org/24/567224/117/check/tripleo-ci-centos-7-undercloud-upgrades/9938296/
[2] http://logs.openstack.org/24/567224/117/check/tripleo-ci-centos-7-undercloud-upgrades/15056cb/logs/undercloud/home/zuul/undercloud_upgrade.log.txt.gz

The latest run is OK: tripleo-ci-centos-7-undercloud-upgrades SUCCESS in 1h 18m 31s

Marios Andreou (marios-b) wrote :

Still green at http://logs.openstack.org/24/567224/117/check/tripleo-ci-centos-7-undercloud-upgrades/4ee7d6d/ (queens gate check https://review.openstack.org/#/c/567224), so closing the bug for now. We can re-open it if it is seen again.

Changed in tripleo:
status: Triaged → Invalid
Alex Schultz (alex-schultz) wrote :
Changed in tripleo:
status: Invalid → Triaged
OpenStack Infra (hudson-openstack) wrote : Change abandoned on instack-undercloud (stable/queens)

Change abandoned by Marios Andreou (<email address hidden>) on branch: stable/queens
Review: https://review.openstack.org/608639
Reason: invalid fix

p.cipriano (ctpeter) wrote :

Hi,

After running the openstack undercloud upgrade from pike to queens, we encountered errors. Please see the log below.

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 2411, in pre_upgrade
    user_domain_name='Default')
  File "/usr/lib/python2.7/site-packages/os_client_config/__init__.py", line 86, in make_client
    return cloud.get_legacy_client(service_key, constructor)
  File "/usr/lib/python2.7/site-packages/os_client_config/cloud_config.py", line 370, in get_legacy_client
    service_key, min_version=min_version, max_version=max_version)
  File "/usr/lib/python2.7/site-packages/os_client_config/cloud_config.py", line 309, in get_session_endpoint
    endpoint = self._get_highest_endpoint(service_types, kwargs)
  File "/usr/lib/python2.7/site-packages/os_client_config/cloud_config.py", line 267, in _get_highest_endpoint
    return session.get_endpoint(**kwargs)
  File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 942, in get_endpoint
    return auth.get_endpoint(self, **kwargs)
  File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/base.py", line 379, in get_endpoint
    allow_version_hack=allow_version_hack, **kwargs)
  File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/base.py", line 270, in get_endpoint_data
    service_catalog = self.get_access(session).service_catalog
  File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/base.py", line 134, in get_access
    self.auth_ref = self.get_auth_ref(session)
  File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/generic/base.py", line 199, in get_auth_ref
    self._plugin = self._do_create_plugin(session)
  File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/generic/base.py", line 188, in _do_create_plugin
    'Cannot use v2 authentication with domain scope')
keystoneauth1.exceptions.discovery.DiscoveryFailure: Cannot use v2 authentication with domain scope
Command 'instack-pre-upgrade-undercloud' returned non-zero exit status 1
clean_up UpgradeUndercloud: Command 'instack-pre-upgrade-undercloud' returned non-zero exit status 1
END return value: 1

Sofer Athlan-Guyot (sofer-athlan-guyot) wrote :

OK, starting at the comment above, we actually have a new bug; it is being followed in https://bugs.launchpad.net/tripleo/+bug/1798553

Alex Schultz (alex-schultz) wrote :

This is failing in check now

Changed in tripleo:
importance: High → Critical
milestone: none → stein-2
Rafael Folco (rafaelfolco) wrote :
Changed in tripleo:
assignee: Marios Andreou (marios-b) → nobody
Janki Chhatbar (jankihchhatbar) wrote :
Martin Schuppert (mschuppert) wrote :
tags: added: alert
Sorin Sbarnea (ssbarnea) wrote :

Is anyone working on a fix for this? It blocks merges of other fixes.

Sorin Sbarnea (ssbarnea)
Changed in tripleo:
assignee: nobody → Alex Schultz (alex-schultz)
Alex Schultz (alex-schultz) wrote :

So whatever is happening, python-tripleoclient is not being updated properly during the upgrade script execution.

http://logs.openstack.org/75/620975/1/check/tripleo-ci-centos-7-undercloud-upgrades/6d3151c/logs/undercloud/home/zuul/undercloud_upgrade.log.txt.gz

2018-11-29 22:13:20 | Loaded plugins: fastestmirror, priorities
2018-11-29 22:13:20 | Loading mirror speeds from cached hostfile
2018-11-29 22:13:21 | 122 packages excluded due to repository priority protections
2018-11-29 22:13:22 | No packages marked for update
2018-11-29 22:13:24 | Loaded plugins: fastestmirror, priorities
2018-11-29 22:13:25 | Loading mirror speeds from cached hostfile
2018-11-29 22:13:26 | 122 packages excluded due to repository priority protections
2018-11-29 22:13:26 | No packages marked for update
2018-11-29 22:13:30 | Traceback (most recent call last):
2018-11-29 22:13:30 | File "<string>", line 1, in <module>
2018-11-29 22:13:30 | File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 2410, in pre_upgrade
2018-11-29 22:13:30 | for stack in heat.stacks.list():
2018-11-29 22:13:30 | File "/usr/lib/python2.7/site-packages/heatclient/v1/stacks.py", line 136, in paginate
2018-11-29 22:13:30 | stacks = self._list(url, 'stacks')
2018-11-29 22:13:30 | File "/usr/lib/python2.7/site-packages/heatclient/common/base.py", line 114, in _list
2018-11-29 22:13:30 | body = self.client.get(url).json()
2018-11-29 22:13:30 | File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 304, in get
2018-11-29 22:13:30 | return self.request(url, 'GET', **kwargs)
2018-11-29 22:13:30 | File "/usr/lib/python2.7/site-packages/heatclient/common/http.py", line 317, in request
2018-11-29 22:13:30 | raise exc.from_response(resp)
2018-11-29 22:13:30 | heatclient.exc.HTTPInternalServerError: ERROR: [Errno 2] No such file or directory: '/etc/heat/policy.json'
2018-11-29 22:13:30 | Command 'instack-pre-upgrade-undercloud' returned non-zero exit status 1

The first yum update is an update of python-tripleoclient, so if it is not getting updated, it is either not available for an update or was updated earlier than it was supposed to be. I'm still trying to work out the order of events, because the reproducer is broken and if I manually run through the process it works fine.
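As a rough aid for checking this ordering in other runs, the sketch below counts how many yum invocations reported "No packages marked for update" before the first traceback in a log. The function name and the sample excerpt are hypothetical; the matched strings are copied from the log above, and nothing here is part of the upgrade scripts.

```python
def count_noop_updates_before_traceback(log_text):
    """Count 'No packages marked for update' lines that precede the first traceback."""
    count = 0
    for line in log_text.splitlines():
        if "Traceback (most recent call last)" in line:
            break  # only the yum runs before the failure are of interest
        if "No packages marked for update" in line:
            count += 1
    return count


# Hypothetical excerpt assembled from lines of the log above.
excerpt = """\
2018-11-29 22:13:22 | No packages marked for update
2018-11-29 22:13:26 | No packages marked for update
2018-11-29 22:13:30 | Traceback (most recent call last):
2018-11-29 22:13:30 | heatclient.exc.HTTPInternalServerError: ERROR: [Errno 2] No such file or directory: '/etc/heat/policy.json'
"""

print(count_noop_updates_before_traceback(excerpt))
```

A non-zero count right before the failure is consistent with the package having already been updated, or not being available, by the time the upgrade script runs.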

Alex Schultz (alex-schultz) wrote :
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-quickstart-extras (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/621259

Changed in tripleo:
status: Triaged → In Progress
Changed in tripleo:
assignee: Alex Schultz (alex-schultz) → Sorin Sbarnea (ssbarnea)
Sorin Sbarnea (ssbarnea) wrote :

Change is in gate queue now.

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-quickstart-extras (master)

Reviewed: https://review.openstack.org/621259
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart-extras/commit/?id=05b5b272f38217ce09bc0e9a4a484f97bffcd2e2
Submitter: Zuul
Branch: master

commit 05b5b272f38217ce09bc0e9a4a484f97bffcd2e2
Author: Alex Schultz <email address hidden>
Date: Fri Nov 30 11:46:05 2018 -0700

    Add ability to disable yum update with gating repo

    Previously when the gating repo was enabled, we would blindly yum update
    with the repo. This probably shouldn't be the case as we should just be
    installing or updating the new packages after the repo has been
    installed. However to maintain backwards compatibility, a new variable
    called ib_gating_repo_update has been created with the default of True.
    To disable the yum update after isntalling built repo, set this to
    false.

    Change-Id: I5172134738b5b3d5146e45f1e0a7720b6e1d5a2c
    Closes-Bug: #1796639
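
The behaviour described in the commit message can be modelled roughly as below. Only the variable name ib_gating_repo_update and its default of True come from the commit message; the function name, the gating_repo_enabled parameter and the step labels are hypothetical, and the real change lives in the tripleo-quickstart-extras roles rather than in Python.

```python
def gating_repo_steps(gating_repo_enabled, ib_gating_repo_update=True):
    """Return the ordered steps a run would take under this simplified model."""
    steps = []
    if gating_repo_enabled:
        steps.append("install gating repo")
        # The default of True keeps the previous behaviour: update right after
        # the repo is installed. Setting it to False skips that update.
        if ib_gating_repo_update:
            steps.append("yum update")
    return steps


print(gating_repo_steps(True))
print(gating_repo_steps(True, ib_gating_repo_update=False))
```

With the update disabled, the packages are left for the upgrade script itself to update.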

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-quickstart-extras 2.1.1

This issue was fixed in the openstack/tripleo-quickstart-extras 2.1.1 release.
