periodic scenario1 standalone master inconsistent fail "delete orphan containers"->"Gather podman infos"

Bug #1937236 reported by Marios Andreou
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
tripleo   Fix Released   Critical     Unassigned

Bug Description

At [1][2] the periodic-tripleo-ci-centos-8-scenario001-standalone-master fails during the inclusion of the delete orphan containers tasks at [3] with a trace like:

 2021-07-22 01:55:06.773815 | fa163ec2-d6f1-0221-acad-000000003b9d | FATAL | Gather podman infos | standalone | error={"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": false}

The issue is not consistently happening and we only have 2 examples thus far so it may be a race condition of some kind.

Since no_log was specified there (via [4]), we don't have any more information in the trace. I'll post a testproject with debug enabled, but the issue is not consistent, so it depends on how lucky we are ;).
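To spot further occurrences across several standalone_deploy.log files, something like the following rough sketch could flag FATAL results whose details were hidden by no_log. The line layout is only inferred from the single trace quoted above, and the names `parse_line` and `is_censored_failure` are made up for illustration; this is not part of tripleo-ansible or tripleo-ci.

```python
import json
import re

# Assumed layout, inferred from the one trace line quoted in this report:
# "<timestamp> | <task uuid> | <STATUS> | <task name> | <host> | error=<json>"
# The host and error fields are treated as optional.
LINE_RE = re.compile(
    r"^\s*(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)"
    r" \| (?P<uuid>[0-9a-f-]+)"
    r" \| (?P<status>[A-Z]+)"
    r" \| (?P<task>[^|]+?)"
    r"(?: \| (?P<host>[^|]+?))?"
    r"(?: \| error=(?P<error>\{.*\}))?\s*$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    if fields["error"] is not None:
        # The error payload in the quoted trace is plain JSON.
        fields["error"] = json.loads(fields["error"])
    return fields


def is_censored_failure(fields):
    """True when a FATAL result carries only the no_log placeholder."""
    error = fields.get("error") or {}
    return fields["status"] == "FATAL" and "censored" in error


if __name__ == "__main__":
    sample = (
        " 2021-07-22 01:55:06.773815 | fa163ec2-d6f1-0221-acad-000000003b9d"
        " | FATAL | Gather podman infos | standalone | error="
        '{"censored": "the output has been hidden due to the fact that '
        "'no_log: true' was specified for this result\", \"changed\": false}"
    )
    fields = parse_line(sample)
    print(fields["task"], is_censored_failure(fields))
```

Running this over each downloaded log, line by line, would only confirm that the failure hit the same task; the real error still needs a run with debug enabled.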

As seen at [5] we have 3 successful runs (2021-07-21 01:05:11, 2021-07-21 09:08:26, 2021-07-21 17:15:03) between the two failures.

[1] https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario001-standalone-master/8d3584a/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz
[2] https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario001-standalone-master/03381f4/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz
[3] https://opendev.org/openstack/tripleo-ansible/src/commit/51fa2814b2d2625db5193b1861cebd2aaa036ae8/tripleo_ansible/roles/tripleo_container_manage/tasks/delete_orphan.yml
[4] https://opendev.org/openstack/tripleo-ansible/src/commit/51fa2814b2d2625db5193b1861cebd2aaa036ae8/tripleo_ansible/roles/tripleo_container_manage/tasks/delete_orphan.yml#L20
[5] https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-scenario001-standalone-master

Marios Andreou (marios-b) wrote :

Testproject with debug enabled posted at https://review.rdoproject.org/r/c/testproject/+/34631

Changed in tripleo:
importance: High → Critical
Marios Andreou (marios-b) wrote :

14:44 < ykarel> sshnaidm|afk, noticed u not promoted https://review.rdoproject.org/r/c/rdoinfo/+/34033 to xena-testing
14:44 < ykarel> noticed https://bugs.launchpad.net/tripleo/+bug/1937236
14:45 < ykarel> i see 1.6.1 have the workaround for it
https://github.com/containers/ansible-podman-collections/commit/b7e904ae748568409b5636cfbbe77faa60b12741
14:45 < ykarel> marios|ruck, ^

Marios Andreou (marios-b) wrote :

this issue is sneaky/very likely a race condition of some kind

I found another example among the green runs from the 24th:

 https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario001-standalone-master/5460196/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz
        * 2021-07-24 10:13:52.977814 | fa163ee0-a8c1-32e0-6f56-000000002ce9 | TASK | Delete orphan containers from /var/lib/tripleo-config/container-startup-config/step_3

Just pinged sshnaidm; he is going to submit patches per comment #2 above.

Sagi (Sergey) Shnaidman (sshnaidm) wrote :
Changed in tripleo:
status: Triaged → Fix Released