master: Gather podman infos fails w/ no log

Bug #1926649 reported by wes hayutin
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
tripleo | In Progress | Critical | Unassigned |

Bug Description

2021-04-29 18:02:09.670909 | fa163e5c-0288-0d5a-102a-000000006469 | TASK | Gather podman infos
2021-04-29 18:02:10.210797 | fa163e5c-0288-0d5a-102a-000000006469 | FATAL | Gather podman infos | standalone | error={"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": false}
2021-04-29 18:02:10.212132 | fa163e5c-0288-0d5a-102a-000000006469 | TIMING | tripleo_container_manage : Gather podman infos | standalone | 0:35:28.766364 | 0.54s

PLAY RECAP *********************************************************************
localhost : ok=1 changed=0 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
standalone : ok=742 changed=349 unreachable=0 failed=1 skipped=234 rescued=0 ignored=0

https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_799/778083/2/gate/tripleo-ci-centos-8-scenario001-standalone/799eeb6/logs/undercloud/home/zuul/standalone_deploy.log

https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_799/778083/2/gate/tripleo-ci-centos-8-scenario001-standalone/799eeb6/logs/undercloud/var/log/extra/journal.txt

Revision history for this message
wes hayutin (weshayutin) wrote :
Changed in tripleo:
importance: Undecided → Critical
Revision history for this message
wes hayutin (weshayutin) wrote :

Is this the issue?

[2021-04-30 12:25:14,147][ceph_volume.devices.lvm.batch][INFO ] All data devices are unavailable
[2021-04-30 12:25:18,260][ceph_volume.main][INFO ] Running command: ceph-volume lvm list --format json
[2021-04-30 12:25:18,260][ceph_volume.main][ERROR ] ignoring inability to load ceph.conf
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 144, in main
    conf.ceph = configuration.load(conf.path)
  File "/usr/lib/python3.6/site-packages/ceph_volume/configuration.py", line 51, in load
    raise exceptions.ConfigurationError(abspath=abspath)
ceph_volume.exceptions.ConfigurationError: Unable to load expected Ceph config at: /etc/ceph/ceph.conf
[2021-04-30 12:25:18,262][ceph_volume.process][INFO ] Running command: /usr/sbin/lvs --noheadings --readonly --separ

https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_c5d/778083/2/gate/tripleo-ci-centos-8-scenario001-standalone/c5dd110/logs/undercloud/var/log/ceph/4b5c8c0a-ff60-454b-a1b4-9747aa737d19/ceph-volume.log

Revision history for this message
Francesco Pantano (fmount) wrote :

I don't think so: that log looks the same even in green jobs (e.g. [1]), and that kind of error is ignored (see the OSD running in [2]).

The same task (Gather podman infos) runs at every step, and I guess it is used to manage container ordering based on the info retrieved, so we should reproduce this error with more logging on that specific task.

[1] https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f9b/778915/17/check/tripleo-ci-centos-8-scenario001-standalone/f9b3777/logs/undercloud/var/log/ceph/4b5c8c0a-ff60-454b-a1b4-9747aa737d19/ceph-volume.log

[2] https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_c5d/778083/2/gate/tripleo-ci-centos-8-scenario001-standalone/c5dd110/logs/undercloud/var/log/ceph/4b5c8c0a-ff60-454b-a1b4-9747aa737d19/ceph-osd.0.log

Revision history for this message
Rabi Mishra (rabi) wrote :

We probably need to run with debug enabled and reproduce this to see what's going on. I've seen this intermittent failure for a long time.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-ansible (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-ansible/+/789548

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/789615

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-ansible (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-ansible/+/789548
Committed: https://opendev.org/openstack/tripleo-ansible/commit/1136ca69fee750792584ac6bb1975b7da2aae75f
Submitter: "Zuul (22348)"
Branch: master

commit 1136ca69fee750792584ac6bb1975b7da2aae75f
Author: Sagi Shnaidman <email address hidden>
Date: Tue May 4 13:47:56 2021 +0300

    Temporarily disable no_log for podmans info task

    In order to investigate the issue, let's disable no_log.

    Related-Bug: #1926649
    Change-Id: I30861aa289355efff0d39d373e3f20e95d1094cc

Revision history for this message
Sagi (Sergey) Shnaidman (sshnaidm) wrote :

This is an issue in podman; I opened a bug there: https://github.com/containers/podman/issues/10225
I will add a retry to work around it.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (master)
Changed in tripleo:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-ansible/+/789928
Committed: https://opendev.org/openstack/tripleo-ansible/commit/84f18e73af3dfda9e1c5f89cd0a88a772ff383cb
Submitter: "Zuul (22348)"
Branch: master

commit 84f18e73af3dfda9e1c5f89cd0a88a772ff383cb
Author: Sagi Shnaidman <email address hidden>
Date: Wed May 5 20:25:20 2021 +0300

    Add retries to containers listing

    To avoid failure in container listing add 4 retries with 1 sec
    delay between.
    See podman issue: https://github.com/containers/podman/issues/10225
    Closes-Bug: #1926649

    Mark tripleo-ansible-centos-8-molecule-tripleo-modules as non voting,
    for later investigation and fix:
    https://zuul.opendev.org/t/openstack/builds?job_name=
    tripleo-ansible-centos-8-molecule-tripleo-modules&project=openstack/tripleo-ansible
    Change-Id: Id0575ece56462f78439bc6ff99e9bb971eebb0ac
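
For readers unfamiliar with the workaround, the retry behaviour described in this commit message can be sketched in Python roughly as follows. This is only an illustration, not the tripleo-ansible change itself: `call_with_retries` and `list_containers` are hypothetical names, reading "4 retries" as up to four additional attempts after the first is an assumption, and so is the use of `podman ps --all --format json` as the listing command.

```python
import json
import subprocess
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    retries: int = 4,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on any exception.

    Assumption: "retries" counts the extra attempts made after the first
    failure, so func is called at most retries + 1 times.
    """
    if retries < 0:
        raise ValueError("retries must be >= 0")
    for attempt in range(retries + 1):
        try:
            return func()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == retries:
                raise
            # Wait before the next attempt.
            sleep(delay)
    raise AssertionError("unreachable")


def list_containers() -> List[dict]:
    """Hypothetical helper: list all containers through the podman CLI."""
    result = subprocess.run(
        ["podman", "ps", "--all", "--format", "json"],
        check=True,  # a non-zero exit raises CalledProcessError
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

A caller would wrap the listing as `call_with_retries(list_containers)`. Injecting `sleep` keeps the helper testable without real delays, and a transient listing failure only fails the caller once every attempt has failed.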

Changed in tripleo:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/tripleo-ansible/+/790266

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/tripleo-ansible/+/790266
Committed: https://opendev.org/openstack/tripleo-ansible/commit/f2f9110938b79afb9b9abbf162406f7be48100c5
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit f2f9110938b79afb9b9abbf162406f7be48100c5
Author: Sagi Shnaidman <email address hidden>
Date: Wed May 5 20:25:20 2021 +0300

    Add retries to containers listing

    To avoid failure in container listing add 4 retries with 1 sec
    delay between.
    See podman issue: https://github.com/containers/podman/issues/10225
    Closes-Bug: #1926649

    Mark tripleo-ansible-centos-8-molecule-tripleo-modules as non voting,
    for later investigation and fix:
    https://zuul.opendev.org/t/openstack/builds?job_name=
    tripleo-ansible-centos-8-molecule-tripleo-modules&project=openstack/tripleo-ansible
    Change-Id: Id0575ece56462f78439bc6ff99e9bb971eebb0ac
    (cherry picked from commit 84f18e73af3dfda9e1c5f89cd0a88a772ff383cb)

tags: added: in-stable-wallaby
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 3.1.1

This issue was fixed in the openstack/tripleo-ansible 3.1.1 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 4.0.0

This issue was fixed in the openstack/tripleo-ansible 4.0.0 release.

wes hayutin (weshayutin)
Changed in tripleo:
status: Fix Released → In Progress
tags: added: promotion-blocker
wes hayutin (weshayutin)
tags: removed: promotion-blocker
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (master)

Change abandoned by "James Slagle <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/789615
Reason: Abandoning this patch per the TripleO Patch Abandonment guidelines
(https://specs.openstack.org/openstack/tripleo-specs/specs/policy/patch-abandonment.html).
If you wish to have this restored and cannot do so yourself, please reach out
via #tripleo on OFTC or the OpenStack Dev mailing list.

wes hayutin (weshayutin)
tags: added: promotion-blocker
wes hayutin (weshayutin)
tags: removed: promotion-blocker