Activity log for bug #1985981

Date Who What changed Old value New value Message
2022-08-12 12:29:19 Sandeep Yadav bug added bug
2022-08-15 05:00:45 chandan kumar summary
Old value: Sc010 kvm internal job failing with Error: container-init binary not found on the host: stat /usr/libexec/podman/catatonit: no such file or directory"
New value: standalone ceph job failing with Error: container-init binary not found on the host: stat /usr/libexec/podman/catatonit: no such file or directory"
2022-08-15 05:01:39 chandan kumar description
Old value:

We are running Sc010 kvm in both vexx cloud and in the internal cloud.

The job which runs in the internal cloud fails with the below error:-

~~~
2022-08-12 07:38:27,832 p=89450 u=root n=ansible | 2022-08-12 07:38:27.831192 | fa163e0d-40f2-7933-9109-000000000070 | FATAL | Run cephadm bootstrap
.
.
Non-zero exit code 125 from /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint ceph --init -e CONTAINER_IMAGE=quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph -e NODE_NAME=standalone.localdomain -e CEPH_USE_RANDOM_NONCE=1 quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph --version
ceph: stderr Error: container-init binary not found on the host: stat /usr/libexec/podman/catatonit: no such file or directory
Traceback (most recent call last):
  File "/usr/sbin/cephadm", line 9106, in <module>
    main()
  File "/usr/sbin/cephadm", line 9094, in main
    r = ctx.func(ctx)
  File "/usr/sbin/cephadm", line 1969, in _default_image
    return func(ctx)
  File "/usr/sbin/cephadm", line 4707, in command_bootstrap
    image_ver = CephContainer(ctx, ctx.image, 'ceph', ['--version']).run().strip()
  File "/usr/sbin/cephadm", line 3739, in run
    out, _, _ = call_throws(self.ctx, self.run_cmd(),
  File "/usr/sbin/cephadm", line 1636, in call_throws
    raise RuntimeError(f'Failed command: {" ".join(command)}: {s}')
RuntimeError: Failed command: /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint ceph --init -e CONTAINER_IMAGE=quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph -e NODE_NAME=standalone.localdomain -e CEPH_USE_RANDOM_NONCE=1 quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph --version: Error: container-init binary not found on the host: stat /usr/libexec/podman/catatonit: no such file or directory", "stderr_lines": ["Verifying podman|docker is present...", "Verifying lvm2 is present...", "Verifying time synchronization is in place...", "Unit chronyd.service is enabled and running", "Repeating the final host check...", "podman (/bin/podman) version 4.1.1 is present", "systemctl is present", "lvcreate is present", "Unit chronyd.service is enabled and running", "Host looks OK", "Cluster fsid: e1f5356e-8579-59d7-a01c-bd09ff028582", "Verifying IP 192.168.42.1 port 3300 ...", "Verifying IP 192.168.42.1 port 6789 ...", "Internal network (--cluster-network) has not been provided, OSD replication will default to the public_network", "Adjusting default settings to suit single-host cluster...", "Pulling container image quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph...", "Non-zero exit code 125 from /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint ceph --init -e CONTAINER_IMAGE=quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph -e NODE_NAME=standalone.localdomain -e CEPH_USE_RANDOM_NONCE=1 quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph --version", "ceph: stderr Error: container-init binary not found on the host: stat /usr/libexec/podman/catatonit: no such file or directory", "Traceback (most recent call last):", " File "/usr/sbin/cephadm", line 9106, in <module>", " main()", " File "/usr/sbin/cephadm", line 9094, in main", " r = ctx.func(ctx)", " File "/usr/sbin/cephadm", line 1969, in _default_image", " return func(ctx)", " File "/usr/sbin/cephadm", line 4707, in command_bootstrap", " image_ver = CephContainer(ctx, ctx.image, 'ceph', ['--version']).run().strip()", " File "/usr/sbin/cephadm", line 3739, in run", " out, _, _ = call_throws(self.ctx, self.run_cmd(),", " File "/usr/sbin/cephadm", line 1636, in call_throws", " raise RuntimeError(f'Failed command: {" ".join(command)}: {s}')", "RuntimeError: Failed command: /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint ceph --init -e CONTAINER_IMAGE=quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph -e NODE_NAME=standalone.localdomain -e CEPH_USE_RANDOM_NONCE=1 quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph --version: Error: container-init binary not found on the host: stat /usr/libexec/podman/catatonit: no such file or directory"], "stdout": "", "stdout_lines": []}
~~~

Same sc010 kvm job is passing in vexx Cloud.
https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-scenario010-kvm-standalone-master&skip=0

As per blog[1] This can happen due to the missing catatonit package which is a weak dependency of podman.

[1] https://unix.stackexchange.com/questions/619212/podman-run-with-init-gives-me-error-container-init-binary-not-found-on-the-h

From logs, I can confirm podman-catatonit.x86_64 missing in the internal job but present in the job running in vexx cloud.

Another difference is in the podman version and the source repo of the podman package:-

In vexx job:-
https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-scenario010-kvm-standalone-master/97aa4c5/logs/undercloud/var/log/extra/package-list-installed.txt.gz
~~~
podman.x86_64 2:4.1.1-3.el9 @appstream
podman-catatonit.x86_64 2:4.1.1-3.el9 @appstream
~~~

Internal job:-
~~~
podman.x86_64 2:4.1.1-6.el9 @quickstart-centos-appstreams
~~~

New value:

We are running Sc010 kvm in both vexx cloud and in the internal cloud.

The job which runs in the internal cloud fails with the below error:-

~~~
2022-08-12 07:38:27,832 p=89450 u=root n=ansible | 2022-08-12 07:38:27.831192 | fa163e0d-40f2-7933-9109-000000000070 | FATAL | Run cephadm bootstrap
.
.
Non-zero exit code 125 from /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint ceph --init -e CONTAINER_IMAGE=quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph -e NODE_NAME=standalone.localdomain -e CEPH_USE_RANDOM_NONCE=1 quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph --version
ceph: stderr Error: container-init binary not found on the host: stat /usr/libexec/podman/catatonit: no such file or directory
Traceback (most recent call last):
  File "/usr/sbin/cephadm", line 9106, in <module>
    main()
  File "/usr/sbin/cephadm", line 9094, in main
    r = ctx.func(ctx)
  File "/usr/sbin/cephadm", line 1969, in _default_image
    return func(ctx)
  File "/usr/sbin/cephadm", line 4707, in command_bootstrap
    image_ver = CephContainer(ctx, ctx.image, 'ceph', ['--version']).run().strip()
  File "/usr/sbin/cephadm", line 3739, in run
    out, _, _ = call_throws(self.ctx, self.run_cmd(),
  File "/usr/sbin/cephadm", line 1636, in call_throws
    raise RuntimeError(f'Failed command: {" ".join(command)}: {s}')
RuntimeError: Failed command: /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint ceph --init -e CONTAINER_IMAGE=quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph -e NODE_NAME=standalone.localdomain -e CEPH_USE_RANDOM_NONCE=1 quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph --version: Error: container-init binary not found on the host: stat /usr/libexec/podman/catatonit: no such file or directory", "stderr_lines": ["Verifying podman|docker is present...", "Verifying lvm2 is present...", "Verifying time synchronization is in place...", "Unit chronyd.service is enabled and running", "Repeating the final host check...", "podman (/bin/podman) version 4.1.1 is present", "systemctl is present", "lvcreate is present", "Unit chronyd.service is enabled and running", "Host looks OK", "Cluster fsid: e1f5356e-8579-59d7-a01c-bd09ff028582", "Verifying IP 192.168.42.1 port 3300 ...", "Verifying IP 192.168.42.1 port 6789 ...", "Internal network (--cluster-network) has not been provided, OSD replication will default to the public_network", "Adjusting default settings to suit single-host cluster...", "Pulling container image quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph...", "Non-zero exit code 125 from /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint ceph --init -e CONTAINER_IMAGE=quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph -e NODE_NAME=standalone.localdomain -e CEPH_USE_RANDOM_NONCE=1 quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph --version", "ceph: stderr Error: container-init binary not found on the host: stat /usr/libexec/podman/catatonit: no such file or directory", "Traceback (most recent call last):", " File "/usr/sbin/cephadm", line 9106, in <module>", " main()", " File "/usr/sbin/cephadm", line 9094, in main", " r = ctx.func(ctx)", " File "/usr/sbin/cephadm", line 1969, in _default_image", " return func(ctx)", " File "/usr/sbin/cephadm", line 4707, in command_bootstrap", " image_ver = CephContainer(ctx, ctx.image, 'ceph', ['--version']).run().strip()", " File "/usr/sbin/cephadm", line 3739, in run", " out, _, _ = call_throws(self.ctx, self.run_cmd(),", " File "/usr/sbin/cephadm", line 1636, in call_throws", " raise RuntimeError(f'Failed command: {" ".join(command)}: {s}')", "RuntimeError: Failed command: /bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint ceph --init -e CONTAINER_IMAGE=quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph -e NODE_NAME=standalone.localdomain -e CEPH_USE_RANDOM_NONCE=1 quay.rdoproject.org/tripleomastercentos9/daemon:current-ceph --version: Error: container-init binary not found on the host: stat /usr/libexec/podman/catatonit: no such file or directory"], "stdout": "", "stdout_lines": []}
~~~

Same sc010 kvm job is passing in vexx Cloud.
https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-scenario010-kvm-standalone-master&skip=0

As per blog[1] This can happen due to the missing catatonit package which is a weak dependency of podman.

[1] https://unix.stackexchange.com/questions/619212/podman-run-with-init-gives-me-error-container-init-binary-not-found-on-the-h

From logs, I can confirm podman-catatonit.x86_64 missing in the internal job but present in the job running in vexx cloud.

Another difference is in the podman version and the source repo of the podman package:-

In vexx job:-
https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-scenario010-kvm-standalone-master/97aa4c5/logs/undercloud/var/log/extra/package-list-installed.txt.gz
~~~
podman.x86_64 2:4.1.1-3.el9 @appstream
podman-catatonit.x86_64 2:4.1.1-3.el9 @appstream
~~~

Internal job:-
~~~
podman.x86_64 2:4.1.1-6.el9 @quickstart-centos-appstreams
~~~

Now it is seen in most of the standalone jobs where ceph is deployed.
2022-08-15 05:01:51 chandan kumar tags
Old value: ci
New value: ci promotion-blocker
2022-08-15 05:39:42 chandan kumar summary
Old value: standalone ceph job failing with Error: container-init binary not found on the host: stat /usr/libexec/podman/catatonit: no such file or directory"
New value: standalone job deploying ceph failing with Error: container-init binary not found on the host: stat /usr/libexec/podman/catatonit: no such file or directory"
2022-08-16 09:07:58 Sandeep Yadav tripleo: status
Old value: Triaged
New value: Fix Released
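
The diagnosis recorded in the description (podman run --init fails because /usr/libexec/podman/catatonit is absent, and podman-catatonit is missing from the installed package list) could be checked on a host with a minimal sketch like the one below. The function names are hypothetical and not part of any TripleO or cephadm tooling; the path and package name are taken from the error message and package lists above, and the package check assumes an rpm-based host.

```python
import os
import shutil
import subprocess

# Path copied from the error message in the bug description.
CATATONIT_PATH = "/usr/libexec/podman/catatonit"


def init_binary_present(path=CATATONIT_PATH):
    """Return True if the container-init binary exists and is executable."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


def rpm_package_installed(name):
    """Return True/False based on `rpm -q`, or None if rpm is unavailable."""
    if shutil.which("rpm") is None:
        return None
    # `rpm -q NAME` exits with 0 when the package is installed.
    result = subprocess.run(
        ["rpm", "-q", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    print("init binary present:", init_binary_present())
    print("podman-catatonit installed:", rpm_package_installed("podman-catatonit"))
```

If the first check reports False, `podman run --init` would be expected to fail in the way shown in the description.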