cephadm install failing in the gate while pulling octopus container

Bug #1923529 reported by wes hayutin
This bug affects 2 people
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: John Fulton

Bug Description

2021-04-12 22:45:18,241 INFO Mon IP 192.168.24.1 is in CIDR network 192.168.24.0/24
2021-04-12 22:45:18,243 INFO Pulling container image 198.72.124.50:5001/tripleomaster/daemon:v5.0.7-stable-5.0-octopus-centos-8-x86_64...
2021-04-12 22:45:18,243 DEBUG Running command: /bin/podman pull 198.72.124.50:5001/tripleomaster/daemon:v5.0.7-stable-5.0-octopus-centos-8-x86_64
2021-04-12 22:45:18,467 DEBUG /bin/podman: stderr Trying to pull 198.72.124.50:5001/tripleomaster/daemon:v5.0.7-stable-5.0-octopus-centos-8-x86_64...
2021-04-12 22:45:18,491 DEBUG /bin/podman: stderr manifest unknown: manifest unknown
2021-04-12 22:45:18,491 DEBUG /bin/podman: stderr Error: Error initializing source docker://198.72.124.50:5001/tripleomaster/daemon:v5.0.7-stable-5.0-octopus-centos-8-x86_64: Error reading manifest v5.0.7-stable-5.0-octopus-centos-8-x86_64 in 198.72.124.50:5001/tripleomaster/daemon: manifest unknown: manifest unknown
2021-04-12 22:45:18,499 INFO Non-zero exit code 125 from /bin/podman pull 198.72.124.50:5001/tripleomaster/daemon:v5.0.7-stable-5.0-octopus-centos-8-x86_64
2021-04-12 22:45:18,500 INFO /bin/podman: stderr Trying to pull 198.72.124.50:5001/tripleomaster/daemon:v5.0.7-stable-5.0-octopus-centos-8-x86_64...
2021-04-12 22:45:18,500 INFO /bin/podman: stderr manifest unknown: manifest unknown
2021-04-12 22:45:18,500 INFO /bin/podman: stderr Error: Error initializing source docker://198.72.124.50:5001/tripleomaster/daemon:v5.0.7-stable-5.0-octopus-centos-8-x86_64: Error reading manifest v5.0.7-stable-5.0-octopus-centos-8-x86_64 in 198.72.124.50:5001/tripleomaster/daemon: manifest unknown: manifest unknown
2021-04-12 22:45:18,519 DEBUG Releasing lock 140237772404216 on /run/cephadm/4b5c8c0a-ff60-454b-a1b4-9747aa737d19.lock
2021-04-12 22:45:18,519 DEBUG Lock 140237772404216 released on /run/cephadm/4b5c8c0a-ff60-454b-a1b4-9747aa737d19.lock

https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_cb1/783993/7/gate/tripleo-ci-centos-8-scenario001-standalone/cb17a01/logs/undercloud/var/log/ceph/cephadm.log

https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_5ef/782891/6/gate/tripleo-ci-centos-8-scenario001-standalone/5efe01c/logs/undercloud/var/log/ceph/cephadm.log

wes hayutin (weshayutin) wrote :

We need a more reliable way to pull this container, or at least a retry.
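A retry along these lines could be sketched as follows. This is a minimal illustration, not what cephadm or the CI roles actually do: the function name `pull_with_retry` and its parameters are hypothetical, and it simply shells out to `podman pull`. It fails fast on `manifest unknown`, because a tag that is absent from the registry (as in the log above) will not appear on a second attempt; only other failures are retried.

```python
import subprocess
import time


def pull_with_retry(image, attempts=3, delay=5.0,
                    run=subprocess.run, sleep=time.sleep):
    """Pull `image` with podman, retrying failures that may be transient.

    `run` and `sleep` are injectable so the logic can be exercised
    without podman installed (hypothetical helper, not a cephadm API).
    """
    last_stderr = ""
    for attempt in range(1, attempts + 1):
        result = run(["podman", "pull", image],
                     capture_output=True, text=True)
        if result.returncode == 0:
            return True
        last_stderr = result.stderr or ""
        # A missing tag is not transient: retrying cannot make it appear.
        if "manifest unknown" in last_stderr:
            break
        if attempt < attempts:
            # Linear backoff: delay, 2*delay, ...
            sleep(delay * attempt)
    raise RuntimeError(f"podman pull {image} failed: {last_stderr.strip()}")
```

In this bug the failure was `manifest unknown`, so a retry alone would not have helped; the sketch only covers the transient case.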

Francesco Pantano (fmount) wrote :

The container tag is missing (it was removed) on quay.ceph.io, and this is the root cause of the CI issue:

````
> s podman pull quay.ceph.io/ceph-ci/daemon:5.0.7-stable-5.0-octopus-centos-8-x86_64
Trying to pull quay.ceph.io/ceph-ci/daemon:5.0.7-stable-5.0-octopus-centos-8-x86_64...
  manifest unknown: manifest unknown
Error: Error initializing source docker://quay.ceph.io/ceph-ci/daemon:5.0.7-stable-5.0-octopus-centos-8-x86_64:
Error reading manifest 5.0.7-stable-5.0-octopus-centos-8-x86_64 in quay.ceph.io/ceph-ci/daemon: manifest unknown: manifest unknown
````

wes hayutin (weshayutin) wrote :

 podman pull quay.ceph.io/ceph-ci/daemon:v5.0.7-stable-5.0-octopus-centos-8-x86_64
Trying to pull quay.ceph.io/ceph-ci/daemon:v5.0.7-stable-5.0-octopus-centos-8-x86_64...
Getting image source signatures
Copying blob c44dcd19725b done
Copying blob 2ec690a153f1 [=>------------------------------------] 1.3MiB / 32.3MiB
Copying blob 56160a35ca55 done
Copying blob 7a0437f04f83 [>-------------------------------------] 1.3MiB / 71.7MiB
Copying blob 3a4ee2d5dcbb [--------------------------------------] 1.3MiB / 253.8MiB
Copying blob 5973f16615d3 done
Copying blob 6beab61f5ca4 done
Copying blob da505a809d2c done
Copying blob 77bb13114c60 done
Copying blob 34763bfe47ea done

wes hayutin (weshayutin) wrote :

https://6199643189f3ab619c51-d48122bd19222adcfc0d4cf0ac66a905.ssl.cf2.rackcdn.com/786000/1/check/tripleo-ci-centos-8-content-provider/2c45953/job-output.txt

2021-04-13 02:49:20.512378 | primary | TASK [container-build : Store non-tripleo containers] **************************
2021-04-13 02:49:20.512471 | primary | Tuesday 13 April 2021 02:49:20 +0000 (0:00:00.058) 0:46:32.270 *********
2021-04-13 02:49:20.569849 | primary | ok: [undercloud]
2021-04-13 02:49:20.582702 | primary |
2021-04-13 02:49:20.582742 | primary | TASK [container-build : Print non-tripleo containers] **************************
2021-04-13 02:49:20.582885 | primary | Tuesday 13 April 2021 02:49:20 +0000 (0:00:00.070) 0:46:32.340 *********
2021-04-13 02:49:20.638394 | primary | ok: [undercloud] => {}
2021-04-13 02:49:20.638437 | primary |
2021-04-13 02:49:20.638453 | primary | MSG:
2021-04-13 02:49:20.638467 | primary |
2021-04-13 02:49:20.638482 | primary | ['quay.ceph.io/ceph-ci/daemon:v6.0.0-stable-6.0-pacific-centos-8-x86_64', 'quay.ceph.io/prometheus/prometheus:v2.7.2', 'quay.ceph.io/prometheus/alertmanager:v0.16.2', 'quay.ceph.io/prometheus/node-exporter:v0.17.0', 'quay.ceph.io/app-sre/grafana:6.7.4']
2021-04-13 02:49:20.683883 | primary |
2021-04-13 02:49:20.683924 | primary | TASK [container-build : Pull containers from 127.0.0.1:5001] *******************
2021-04-13 02:49:20.684080 | primary | Tuesday 13 April 2021 02:49:20 +0000 (0:00:00.101) 0:46:32.442 *********
2021-04-13 02:50:00.698878 | primary | changed: [undercloud] => (item=quay.ceph.io/ceph-ci/daemon:v6.0.0-stable-6.0-pacific-centos-8-x86_64)
2021-04-13 02:50:07.166212 | primary | changed: [undercloud] => (item=quay.ceph.io/prometheus/prometheus:v2.7.2)
2021-04-13 02:50:10.957351 | primary | changed: [undercloud] => (item=quay.ceph.io/prometheus/alertmanager:v0.16.2)
2021-04-13 02:50:13.841739 | primary | changed: [undercloud] => (item=quay.ceph.io/prometheus/node-exporter:v0.17.0)
2021-04-13 02:50:24.111233 | primary | changed: [undercloud] => (item=quay.ceph.io/app-sre/grafana:6.7.4)

John Fulton (jfulton-org) wrote :
Changed in tripleo:
status: Triaged → In Progress
assignee: nobody → John Fulton (jfulton-org)
John Fulton (jfulton-org) wrote :

Plan:

Put both containers in the content provider with this:
 https://review.opendev.org/c/openstack/tripleo-common/+/786053

After the transition is complete, revert the above with this:
 https://review.opendev.org/c/openstack/tripleo-common/+/786076

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-common/+/786053
Committed: https://opendev.org/openstack/tripleo-common/commit/b2860cacde0dd1f09bc96d4d278b1fbe1bba334d
Submitter: "Zuul (22348)"
Branch: master

commit b2860cacde0dd1f09bc96d4d278b1fbe1bba334d
Author: John Fulton <email address hidden>
Date: Tue Apr 13 13:49:53 2021 +0000

    Include both Ceph Octopus and Pacific in the content provider

    I9d38c1dbadb968f8c7721c69c57002f159cea619 changed the default
    Ceph container to Pacific but it took time for multiple patches
    to merge so there was a period of time a job was asking for a
    container which was different from what was available. This patch
    makes both available in the content provider (brings Octopus back)
    until the transition is complete and then it can be reverted. It
    establishes a pattern for how to make these transitions in the
    future.

    Closes-Bug: #1923529
    Change-Id: Ia00caf064921da362395afbfeb1b0259ff244c4a

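The mismatch described in the commit message (a job requesting a container that the content provider does not hold) can be illustrated with a small pre-flight check. This is only a sketch under stated assumptions: `strip_registry` and `missing_images` are hypothetical helpers, not part of tripleo-common, and images are compared by `name:tag` only, on the assumption that the content provider re-hosts images under its own registry and namespace (as the `198.72.124.50:5001/tripleomaster/daemon` pull in the description suggests).

```python
def strip_registry(image):
    """Drop the registry host and namespace, keeping only 'name:tag'."""
    return image.rsplit("/", 1)[-1]


def missing_images(requested, available):
    """Return the requested images whose 'name:tag' is not in `available`."""
    have = {strip_registry(i) for i in available}
    return [i for i in requested if strip_registry(i) not in have]


# Image the failing job asked for (taken from the cephadm log above).
requested = [
    "198.72.124.50:5001/tripleomaster/daemon:v5.0.7-stable-5.0-octopus-centos-8-x86_64",
]
# Ceph image the content provider had pulled (from the job output above).
available = [
    "quay.ceph.io/ceph-ci/daemon:v6.0.0-stable-6.0-pacific-centos-8-x86_64",
]

# The Octopus image is reported as missing while only Pacific is provided.
print(missing_images(requested, available))
```

With both the Octopus and Pacific tags in `available`, as the fix arranges, the list of missing images would be empty.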
Changed in tripleo:
status: In Progress → Fix Released
John Fulton (jfulton-org) wrote :
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 16.0.0

This issue was fixed in the openstack/tripleo-common 16.0.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/tripleo-common/+/790100

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/tripleo-common/+/790100
Committed: https://opendev.org/openstack/tripleo-common/commit/619e7da5c4031cdaf145b22155ae832cc8e6dca9
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 619e7da5c4031cdaf145b22155ae832cc8e6dca9
Author: John Fulton <email address hidden>
Date: Tue Apr 13 13:49:53 2021 +0000

    Include both Ceph Octopus and Pacific in the content provider

    I9d38c1dbadb968f8c7721c69c57002f159cea619 changed the default
    Ceph container to Pacific but it took time for multiple patches
    to merge so there was a period of time a job was asking for a
    container which was different from what was available. This patch
    makes both available in the content provider (brings Octopus back)
    until the transition is complete and then it can be reverted. It
    establishes a pattern for how to make these transitions in the
    future.

    Closes-Bug: #1923529
    Change-Id: Ia00caf064921da362395afbfeb1b0259ff244c4a

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 15.2.0

This issue was fixed in the openstack/tripleo-common 15.2.0 release.
