Building container fails because manifest is not found

Bug #1901753 reported by Sagi (Sergey) Shnaidman
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Bogdan Dobrelya

Bug Description

Sometimes, while containers are being built in content provider jobs, an error like the following occurs:

error locating item named "manifest" for image with ID "....": file does not exist

Command: sudo buildah bud --volume /etc/yum.repos.d:/etc/yum.repos.d:z --volume /etc/pki/rpm-gpg:/etc/pki/rpm-gpg:z --volume /etc/yum.repos.d:/etc/yum.repos.d:z --volume /etc/pki/rpm-gpg:/etc/pki/rpm-gpg:z --format docker --tls-verify=False --layers --logfile /home/zuul/container-builds/7d364ba7-6c15-4676-98c7-813f36f44452/base/ovn-base/ovn-base-build.log -t 127.0.0.1:5001/tripleomaster/openstack-ovn-base:a9a790d0723c9fe6641e453c6a1f0c91 /home/zuul/container-builds/7d364ba7-6c15-4676-98c7-813f36f44452/base/ovn-base
Exit code: 1
Stdout: ''
Stderr: 'error checking if cached image exists from a previous build: error getting history of "68c139527f8459ba7c981a11f1872a6bb5b621c2140a6f2fa7858214df8e429f": error creating new image from reference to image "68c139527f8459ba7c981a11f1872a6bb5b621c2140a6f2fa7858214df8e429f": error locating item named "manifest" for image with ID "68c139527f8459ba7c981a11f1872a6bb5b621c2140a6f2fa7858214df8e429f": file does not exist\n'

The following errors were detected during container build(s):

Exception information: Unexpected error while running command.
Command: sudo buildah bud --volume /etc/yum.repos.d:/etc/yum.repos.d:z --volume /etc/pki/rpm-gpg:/etc/pki/rpm-gpg:z --volume /etc/yum.repos.d:/etc/yum.repos.d:z --volume /etc/pki/rpm-gpg:/etc/pki/rpm-gpg:z --format docker --tls-verify=False --layers --logfile /home/zuul/container-builds/7d364ba7-6c15-4676-98c7-813f36f44452/base/ovn-base/ovn-base-build.log -t 127.0.0.1:5001/tripleomaster/openstack-ovn-base:a9a790d0723c9fe6641e453c6a1f0c91 /home/zuul/container-builds/7d364ba7-6c15-4676-98c7-813f36f44452/base/ovn-base
Exit code: 1
Stdout: ''
Stderr: 'error checking if cached image exists from a previous build: error getting history of "68c139527f8459ba7c981a11f1872a6bb5b621c2140a6f2fa7858214df8e429f": error creating new image from reference to image "68c139527f8459ba7c981a11f1872a6bb5b621c2140a6f2fa7858214df8e429f": error locating item named "manifest" for image with ID "68c139527f8459ba7c981a11f1872a6bb5b621c2140a6f2fa7858214df8e429f": file does not exist\n'
Traceback (most recent call last):

https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_ae0/759793/2/check/tripleo-ci-centos-8-content-provider/ae05bd4/logs/undercloud/home/zuul/container_image_build.log
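
The failure signature in the stderr quoted above is regular enough to detect mechanically, which may help separate this flake from genuine build errors when scanning logs. A minimal Python sketch follows. The function name `find_missing_manifest_ids` is hypothetical, and the pattern is taken only from the message quoted in this report, so other buildah versions may word it differently.

```python
import re
from typing import List

# Pattern copied from the stderr quoted in this report. Assumption: the
# wording is stable; other buildah versions may phrase the error differently.
_MISSING_MANIFEST = re.compile(
    r'error locating item named "manifest" for image with ID '
    r'"([0-9a-f]+)": file does not exist'
)


def find_missing_manifest_ids(stderr: str) -> List[str]:
    """Return the image IDs named in 'manifest ... file does not exist' errors."""
    return _MISSING_MANIFEST.findall(stderr)
```

An empty list means the stderr text does not contain this particular error, so the failure should be treated as something else.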

Sagi (Sergey) Shnaidman (sshnaidm) wrote :
wes hayutin (weshayutin) wrote :
tags: added: promotion-blocker
Bogdan Dobrelya (bogdando) wrote :

Another thing that happens quite often is the push command timing out during the build.

Bogdan Dobrelya (bogdando) wrote :

See https://review.opendev.org/#/c/760116/ for the timeouts on push

tags: added: alert
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-common (master)

Change abandoned by Sagi Shnaidman (<email address hidden>) on branch: master
Review: https://review.opendev.org/760075

Changed in tripleo:
milestone: victoria-3 → wallaby-1
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-quickstart-extras (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/761634

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-quickstart-extras (master)

Reviewed: https://review.opendev.org/761634
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart-extras/commit/?id=8fd8a3c5b739e2e669fe2ddd5c33db926e9630dd
Submitter: Zuul
Branch: master

commit 8fd8a3c5b739e2e669fe2ddd5c33db926e9630dd
Author: Sagi Shnaidman <email address hidden>
Date: Thu Nov 5 17:55:54 2020 +0200

    Add retry for containers building

    If one of containers build failed, retry building one more time.
    See https://github.com/containers/buildah/issues/2521
    Related-Bug: #1901753

    Depends-On: https://review.opendev.org/#/c/761537/
    Change-Id: I9c5014ef050c9709cd768aa245187bce65462b7b
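
The retry described in this commit message can be sketched in Python as below. This is an illustration under stated assumptions, not the code merged in the review: `run_with_retry` and its `runner` parameter are hypothetical names, and the actual change in tripleo-quickstart-extras is not shown in this report. The default of two attempts mirrors "retry building one more time".

```python
import subprocess
from typing import Callable, Sequence


def run_with_retry(
    cmd: Sequence[str],
    attempts: int = 2,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> subprocess.CompletedProcess:
    """Run cmd up to `attempts` times, stopping at the first zero exit code.

    Returns the result of the last attempt, so the caller can still
    inspect returncode and stderr when every attempt failed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    result = None
    for _ in range(attempts):
        # `runner` is injectable only so the sketch can be exercised
        # without a real build tool installed (hypothetical parameter).
        result = runner(list(cmd), capture_output=True, text=True)
        if result.returncode == 0:
            break
    return result
```

A blind retry like this only hides the race between parallel builds; it does not remove the cause, which is why the later comments in this report look at the `--layers` option instead.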

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/761996

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (master)

Reviewed: https://review.opendev.org/761996
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=76856521c56b66df02b73b2cd4233276dafcf1e4
Submitter: Zuul
Branch: master

commit 76856521c56b66df02b73b2cd4233276dafcf1e4
Author: Alex Schultz <email address hidden>
Date: Mon Nov 9 13:38:02 2020 -0700

    Drop --layers from buildah

    Since we parallelize buildah, we are hitting issues with layers
    disapearing. This is likely due the use of --layers which enables
    caching intermediate layers during build which means they might go away
    when one processs is done before the other uses it.

    Change-Id: I005043cc1c1438bcaa7c34c3368749e9b2377473
    Related-Bug: #1901753
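
To make the effect of this change concrete, here is a hedged Python sketch of assembling the `buildah bud` argument list with `--layers` off by default. The function `buildah_bud_args` and its `use_layers` toggle are hypothetical: the commit message says the flag was dropped, not made optional, and the real tripleo-common code is not reproduced here. All flags used are the ones visible in the command quoted in the bug description.

```python
from typing import List, Sequence


def buildah_bud_args(
    context_dir: str,
    tag: str,
    logfile: str,
    volumes: Sequence[str] = (),
    use_layers: bool = False,
) -> List[str]:
    """Build the argument list for a `buildah bud` invocation.

    `use_layers` defaults to False so that parallel builds do not share
    cached intermediate layers (hypothetical toggle, see lead-in).
    """
    args = ["sudo", "buildah", "bud"]
    for volume in volumes:
        args += ["--volume", volume]
    args += ["--format", "docker", "--tls-verify=False"]
    if use_layers:
        args.append("--layers")
    args += ["--logfile", logfile, "-t", tag, context_dir]
    return args
```

Keeping the toggle would let a serial build opt back in to layer caching, but that is a design assumption of this sketch rather than something the thread confirms.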

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (stable/victoria)

Related fix proposed to branch: stable/victoria
Review: https://review.opendev.org/762345

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (stable/victoria)

Reviewed: https://review.opendev.org/762345
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=2a7d949b03d0a5da0d8b92e66274db7bc211ad37
Submitter: Zuul
Branch: stable/victoria

commit 2a7d949b03d0a5da0d8b92e66274db7bc211ad37
Author: Alex Schultz <email address hidden>
Date: Mon Nov 9 13:38:02 2020 -0700

    Drop --layers from buildah

    Since we parallelize buildah, we are hitting issues with layers
    disapearing. This is likely due the use of --layers which enables
    caching intermediate layers during build which means they might go away
    when one processs is done before the other uses it.

    Change-Id: I005043cc1c1438bcaa7c34c3368749e9b2377473
    Related-Bug: #1901753
    (cherry picked from commit 76856521c56b66df02b73b2cd4233276dafcf1e4)

tags: added: in-stable-victoria
Bogdan Dobrelya (bogdando) wrote :
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (master)

Fix proposed to branch: master
Review: https://review.opendev.org/762561

Changed in tripleo:
assignee: nobody → Bogdan Dobrelya (bogdando)
status: Triaged → In Progress
Bogdan Dobrelya (bogdando) wrote :
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/762690

Bogdan Dobrelya (bogdando) wrote :

So, as my testing in https://github.com/containers/buildah/issues/2782 shows, the --layers option is only useful when building *exactly the same* containers in different target directories, which sounds a bit pointless to me...

When building consecutive containers that share a common base, --layers buys nothing. I am unsure whether that is a bug, but there is nothing left to fix here since we dropped --layers from our builds.

Changed in tripleo:
status: In Progress → Fix Released
assignee: Bogdan Dobrelya (bogdando) → Alex Schultz (alex-schultz)
Bogdan Dobrelya (bogdando) wrote :

And better testing now shows that --layers actually works and speeds things up a lot. Maybe just not with the buildah version that we currently use (will double-check that).

Changed in tripleo:
status: Fix Released → In Progress
assignee: Alex Schultz (alex-schultz) → Bogdan Dobrelya (bogdando)
wes hayutin (weshayutin)
Changed in tripleo:
status: In Progress → Fix Released