periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master fails with race condition in podman

Bug #1892701 reported by Arx Cruz
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Sagi (Sergey) Shnaidman
Milestone: -

Bug Description

https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master/3de1342/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz

2020-08-24 02:23:02 | 2020-08-24 02:23:02.276870 | fa163ee0-2cfa-fa19-59c7-000000007a18 | FATAL | Gather podman infos | overcloud-controller-2 | error={"changed": false, "msg": "Unable to gather info for ['cceab9790805', 'd864779732cf', 'f387c04ecd1b', '7b63f30b5ee2', 'dbf9e9e9b8fa', 'cbf8bedca671', '512bd43c90f7', 'e1f5005f822c', '23942e8a2729', '1740e0ba8615', '26cd1ae18ee0', '9adcb25a49d5', '5ac05d9bdd1e', '89d5438a3dad', 'd3338da173db', '6a40e382603e', 'ab89db20a6cf', '1894b6061aca', '1b638fc4db63', '4981da500d00', '9d92da77ad0e', '66d5342da31e', '27707cfb17fb', 'c122842bd227', 'bfb02cb77fe6', 'f6479dfa0bec', '8c6cfa5aad28', '22b52a9f86d7', '7240be991976', '837c00eb050d', 'e3f77a7b6e67', '88bcd8941f6a', 'c25f5024a797', 'e1f982845e13', 'ab7b79b15d80', '7926a1b8c722', 'f2684d49852a', 'c6c7dc35895e', '0739f7e994d0', '5b51a952f1f7', 'a8ee7992d096', 'ecd0d0fa29c5', 'f94eb236b4a9', '222bf0c98d0f', 'b1d9d1b27975', 'd25bfbb3b760', '1cb30724a017', '142f6c66e496', 'c771cd9990e9', '55242ca2e8c3', '2fd08e0ee2c9', '68a01c103cde', 'cbb27f8c6f90', '1e9486f2b591', '5c2a2ab260cd', '96941b230fc1', '4595ab02289f', '358d30333960', 'e9c263a654b6', '29364a1dcb93', 'a71e4270c290', '95b95152ff81', '1d56163cff44', 'adf72b12dadd', '03fb34884987', 'f957ddb44559']: Error: error looking up container \"cceab9790805\": no container with name or ID cceab9790805 found: no such container\n"}

According to sshnaidm, it's a race condition:

 12:40:23 <arxcruz|ruck> EmilienM: hi, you were investigating few days ago an error in podman with error: no container with name or ID
12:40:42 <arxcruz|ruck> EmilienM: got one today https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master/3de1342/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
12:40:53 <arxcruz|ruck> EmilienM: do you have an open bug, or shall I open one ?
12:41:11 --> udesale (~udesale@219.91.250.237) has joined #tripleo
12:41:42 --> suuuper (~<email address hidden>) has joined #tripleo
12:42:23 <sshnaidm> arxcruz|ruck, EmilienM seems like race condition with ovn-dbs-bundle-podman-0 container, it's removed 1 second before gathering info about it

Changed in tripleo:
assignee: nobody → Sagi (Sergey) Shnaidman (sshnaidm)
status: Triaged → In Progress
wes hayutin (weshayutin)
tags: removed: quickstart
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (master)

Reviewed: https://review.opendev.org/747685
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=72f0e09019ee13ad306287a8f60faba30d4fb60e
Submitter: Zuul
Branch: master

commit 72f0e09019ee13ad306287a8f60faba30d4fb60e
Author: Sagi Shnaidman <email address hidden>
Date: Mon Aug 24 14:04:50 2020 +0300

    Update containers info module from collection

    When container disappears between "podman ps -a" call and
    inspection call "podman inspect cont1 cont2, ..", the module fails.
    To avoid this run inspection of each container one by one if total
    inspection call failed.
    This is update of module from collection.
    Closes-Bug: #1892701

    Change-Id: I0c085c6c136e5d5b162feb8a1f72d906ab08502e
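
The fallback described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the merged change: the function names `run_inspect` and `inspect_containers` and the injectable `runner` parameter are hypothetical, and the only assumption about podman is that `podman inspect` prints a JSON array on success and exits non-zero when any named container is missing.

```python
import json
import subprocess


def run_inspect(names, executable="podman"):
    """Run `podman inspect` on the given names; return (rc, stdout, stderr)."""
    proc = subprocess.run(
        [executable, "inspect", *names],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


def inspect_containers(names, runner=run_inspect):
    """Inspect all containers in one call, falling back to one call each.

    Returns (found, missing): the parsed inspect data for containers that
    still exist, and the names that could not be inspected.
    `runner` is a hypothetical hook so the logic can be exercised
    without a podman binary.
    """
    if not names:
        return [], []

    # Fast path: a single inspect call covering every container.
    rc, out, _err = runner(names)
    if rc == 0:
        return json.loads(out), []

    # The batch call failed, most likely because a container was removed
    # after the listing step. Inspect each one separately and skip the
    # ones that are gone instead of failing the whole task.
    found, missing = [], []
    for name in names:
        rc, out, _err = runner([name])
        if rc == 0:
            found.extend(json.loads(out))
        else:
            missing.append(name)
    return found, missing
```

With this shape, a container that vanishes between listing and inspection ends up in `missing` rather than aborting the info gathering for all the others.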

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/749006

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/749008

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (stable/ussuri)

Reviewed: https://review.opendev.org/749006
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=958030b1d92a1ab958ba46a00706b0320eca672e
Submitter: Zuul
Branch: stable/ussuri

commit 958030b1d92a1ab958ba46a00706b0320eca672e
Author: Sagi Shnaidman <email address hidden>
Date: Mon Aug 24 14:04:50 2020 +0300

    Update containers info module from collection

    When container disappears between "podman ps -a" call and
    inspection call "podman inspect cont1 cont2, ..", the module fails.
    To avoid this run inspection of each container one by one if total
    inspection call failed.
    This is update of module from collection.
    Closes-Bug: #1892701

    Change-Id: I0c085c6c136e5d5b162feb8a1f72d906ab08502e
    (cherry picked from commit 72f0e09019ee13ad306287a8f60faba30d4fb60e)

tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (stable/train)

Reviewed: https://review.opendev.org/749008
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=53eac4aa241f7e6557d0624620854c3a7bd1f3b4
Submitter: Zuul
Branch: stable/train

commit 53eac4aa241f7e6557d0624620854c3a7bd1f3b4
Author: Sagi Shnaidman <email address hidden>
Date: Mon Aug 24 14:04:50 2020 +0300

    Update containers info module from collection

    When container disappears between "podman ps -a" call and
    inspection call "podman inspect cont1 cont2, ..", the module fails.
    To avoid this run inspection of each container one by one if total
    inspection call failed.
    This is update of module from collection.
    Closes-Bug: #1892701

    Change-Id: I0c085c6c136e5d5b162feb8a1f72d906ab08502e
    (cherry picked from commit 72f0e09019ee13ad306287a8f60faba30d4fb60e)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 0.6.0

This issue was fixed in the openstack/tripleo-ansible 0.6.0 release.
