periodic undercloud upgrade ussuri fails undercloud install cannot pull containers

Bug #1936825 reported by Marios Andreou
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Unassigned

Bug Description

The periodic-tripleo-ci-centos-8-undercloud-upgrade-ussuri job fails [1][2][3] during the undercloud install while pulling containers, with a trace like:

        * 2021-07-17 13:46:45 | "tripleo_common.image.exception.ImageUploaderException: Pulling image failed: cmd \"buildah --debug pull 192.168.24.1:8787/tripleotraincentos8/centos-binary-keystone:38e34dbdcaeafd31bd7f383c6530aa6a\", stdout \"\", stderr \"level=debug msg=\"[graphdriver] trying provided driver \\\"overlay\\\"\"",

It looks like all containers are failing to pull; the failure isn't tied to a specific container.
This does not affect the other branches, which are green at [4][5][6], and it is not hitting the gate, which is green at [7].

[1] https://logserver.rdoproject.org/openstack-periodic-integration-stable3/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-undercloud-upgrade-ussuri/2697621/logs/undercloud/home/zuul/undercloud_install.log.txt.gz
[2] https://logserver.rdoproject.org/openstack-periodic-integration-stable3/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-undercloud-upgrade-ussuri/278680c/logs/undercloud/home/zuul/undercloud_install.log.txt.
[3] https://logserver.rdoproject.org/openstack-periodic-integration-stable3/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-undercloud-upgrade-ussuri/4ccb44d/logs/undercloud/home/zuul/undercloud_install.log.txt.gz
[4] https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-undercloud-upgrade-master
[5] https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-undercloud-upgrade-wallaby
[6] https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-undercloud-upgrade-victoria
[7] https://zuul.openstack.org/builds?job_name=tripleo-ci-centos-8-undercloud-upgrade-ussuri&pipeline=gate

Marios Andreou (marios-b) wrote :

Did some digging today and found bug [1], which I believe describes the same issue.
There is a GitHub issue at [2]. There was a fix at [3], but it is not currently on train, so I had a go at a cherry-pick with [4], which is being tested at [5].

From the failing job [6] you can see the registries.conf looks like

# insecure registry list
[registries.insecure]
registries = ['undercloud.ctlplane.localdomain', '192.168.24.1', '192.168.24.3']

With the patch at [4] it should end up looking more like [7] (wallaby job)

# insecure registry list
[[registry]]
prefix = "undercloud.ctlplane.localdomain"
insecure = true
location = "undercloud.ctlplane.localdomain"
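The difference between the two formats above can be sketched in Python. This is only an illustration of how a v1 insecure registry list maps onto v2 [[registry]] tables; it is not the tripleo-ansible template from [4], and the function name render_v2_insecure_registries is hypothetical.

```python
def render_v2_insecure_registries(hosts):
    """Render one v2 [[registry]] table per insecure registry host.

    Hypothetical helper for illustration only; the real fix templates
    registries.conf inside tripleo-ansible.
    """
    blocks = []
    for host in hosts:
        blocks.append(
            "[[registry]]\n"
            'prefix = "{0}"\n'
            "insecure = true\n"
            'location = "{0}"\n'.format(host)
        )
    return "# insecure registry list\n" + "\n".join(blocks)


# The v1 list seen in the failing job's registries.conf.
v1_insecure = ["undercloud.ctlplane.localdomain", "192.168.24.1", "192.168.24.3"]
print(render_v2_insecure_registries(v1_insecure))
```

Each host from the v1 list gets its own table with prefix, insecure and location keys, matching the shape of the wallaby output.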

[1] https://bugs.launchpad.net/tripleo/+bug/1909658
[2] https://github.com/containers/podman/issues/5764
[3] https://review.opendev.org/c/openstack/tripleo-ansible/+/719584
[4] https://review.opendev.org/c/openstack/tripleo-ansible/+/801615
[5] https://review.rdoproject.org/r/c/testproject/+/34621
[6] https://logserver.rdoproject.org/openstack-periodic-integration-stable3/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-undercloud-upgrade-ussuri/049bb37/logs/undercloud/etc/containers/registries.conf.txt.gz
[7] https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-undercloud-upgrade-wallaby/ed2950e/logs/undercloud/etc/containers/registries.conf.txt.gz

Marios Andreou (marios-b) wrote :

the test at https://review.rdoproject.org/r/c/testproject/+/34621 fails because the proposed fix in https://review.opendev.org/c/openstack/tripleo-ansible/+/801615 was not included there.

We can't use depends-on there: build-test-packages won't build the depends-on change because it is on a different branch (see https://opendev.org/openstack/tripleo-quickstart-extras/src/commit/e423bb068a96ed919ead08226f9447b2cdfc0332/roles/build-test-packages/tasks/main.yml#L200).
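A rough sketch of the branch check described above, in Python. This is not the build-test-packages role itself (that is an Ansible task); the function name changes_to_build, the dictionary layout, and the job branch value "stable/ussuri" are assumptions made for illustration.

```python
def changes_to_build(depends_on_changes, job_branch):
    """Keep only the depends-on changes whose branch matches the job branch.

    Hypothetical stand-in for the branch condition in build-test-packages.
    """
    return [c for c in depends_on_changes if c["branch"] == job_branch]


# The proposed fix is on stable/train; the job branch here is an assumption.
depends_on = [
    {"project": "openstack/tripleo-ansible", "change": "801615", "branch": "stable/train"},
]
print(changes_to_build(depends_on, "stable/ussuri"))
```

With a branch mismatch nothing is selected, so the test run proceeds without the proposed fix.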

I'm going to add promotion-blocker on this so we can get a sanity check on the proposal tripleo-ansible/+/801615

tags: added: promotion-blocker
Changed in tripleo:
milestone: none → xena-3
importance: High → Critical
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-ansible (stable/train)

Reviewed: https://review.opendev.org/c/openstack/tripleo-ansible/+/801615
Committed: https://opendev.org/openstack/tripleo-ansible/commit/27fe9d9ce471dd25a56d8a8b9e4b6480e763b07f
Submitter: "Zuul (22348)"
Branch: stable/train

commit 27fe9d9ce471dd25a56d8a8b9e4b6480e763b07f
Author: Marios Andreou <email address hidden>
Date: Wed Jul 21 14:22:28 2021 +0300

    Add support for v2 registries.conf

    https://github.com/containers/image/blob/master/docs/containers-registries.conf.5.md#version-2

    This allows for mirrors to be configured for specific hosts however it
    is incompatible with the default v1 configuration so we have to nuke
    the existing configuration. Additionally it uses TOML which there is
    currently no ansible module to manage.

    Adds to_json to address issue seen with py2 jobs at [1]. The depends-on
    is unrelated but needed to pass centos7 jobs in ci.

    [1] https://review.opendev.org/c/openstack/tripleo-ansible/+/801615/2#message-821960bb5908358d27225a12fdc5c84ab7bfe0d0
    Depends-On: https://review.opendev.org/c/openstack/tripleo-quickstart/+/802431
    Related-Bug: 1936825
    Change-Id: Ic35155f04bf05913b9e9b8eaa22fe6c02515396c
    (cherry picked from commit bf80fe922b2d915675f01794e73893c018688916)

tags: added: in-stable-train
Marios Andreou (marios-b) wrote :

The patch merged Jul 28 at 6:48 PM [1], but the latest periodic run was before that, at 2021-07-28 13:25:38 [2] (assuming those two systems are using the same timezone ;) not sure).

Anyway, trying a test with https://review.rdoproject.org/r/c/testproject/+/34748, let's see. Holding the move to fix-released until we see whether the issue is resolved or we need something more here.

[1] https://review.opendev.org/c/openstack/tripleo-ansible/+/801615/4#message-376d3c458fcb6832bd4e96dd1b482bf2d68d6f67
[2] https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-undercloud-upgrade-ussuri
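The timestamp comparison above can be checked with a short Python sketch. It assumes both systems report UTC and that the merge happened in 2021; neither is confirmed in this report, and the helper name run_includes_fix is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: both timestamps are UTC and the year is 2021.
merged = datetime(2021, 7, 28, 18, 48, tzinfo=timezone.utc)
last_run = datetime(2021, 7, 28, 13, 25, 38, tzinfo=timezone.utc)


def run_includes_fix(run_started, fix_merged):
    """A run can only contain the fix if it started at or after the merge."""
    return run_started >= fix_merged


gap = merged - last_run
print(run_includes_fix(last_run, merged), gap)
```

Under these assumptions the run started a little over five hours before the merge, so the conclusion would only change if the two systems' clocks were offset from each other by more than that gap.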

Marios Andreou (marios-b) wrote :
Changed in tripleo:
status: Triaged → Fix Released