The Master/Train Scenario 1 standalone deployment on RHEL8 failed with the following error:
http://logs.rdoproject.org/openstack-periodic-latest-released/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-scenario001-standalone-train/384d9a9/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz
2019-12-12 22:56:45 | "+ command -v python3",
2019-12-12 22:56:45 | "+ python3 /container-config-scripts/nova_wait_for_compute_service.py",
2019-12-12 22:56:45 | "The following containers failed validations and were not started: collectd"
Looking at the paunch log: http://logs.rdoproject.org/openstack-periodic-latest-released/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-scenario001-standalone-train/384d9a9/logs/undercloud/var/log/paunch.log.txt.gz
2019-12-12 22:56:43.366 102944 DEBUG paunch [ ] Completed $ podman run --name ceilometer_gnocchi_upgrade --label config_id=tripleo_step5 --label container_name=ceilometer_gnocchi_upgrade --label managed_by=tripleo-Standalone --label config_data={"command": ["/usr/bin/bootstrap_host_exec", "ceilometer_agent_central", "su ceilometer -s /bin/bash -c 'for n in {1..10}; do /usr/bin/ceilometer-upgrade && exit 0 || sleep 30; done; exit 1'"], "detach": false, "healthcheck": {"test": "/openstack/healthcheck"}, "image": "192.168.24.1:8787/tripleotrain/rhel-binary-ceilometer-central:a8589c8a36e9984c5744c00a528d12bfe2c33e59_39b0634f-updated-20191212171530", "net": "host", "privileged": false, "start_order": 99, "user": "root", "volumes": ["/etc/hosts:/etc/hosts:ro", "/etc/localtime:/etc/localtime:ro", "/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro", "/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro", "/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro", "/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro", "/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro", "/dev/log:/dev/log", "/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro", "/etc/puppet:/etc/puppet:ro", "/var/lib/config-data/ceilometer/etc/ceilometer/:/etc/ceilometer/:ro", "/var/log/containers/ceilometer:/var/log/ceilometer:z"]} --conmon-pidfile=/var/run/ceilometer_gnocchi_upgrade.pid --log-driver k8s-file --log-opt path=/var/log/containers/stdouts/ceilometer_gnocchi_upgrade.log --net=host --privileged=false --user=root --volume=/etc/hosts:/etc/hosts:ro --volume=/etc/localtime:/etc/localtime:ro --volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro --volume=/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro --volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro --volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro 
--volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro --volume=/dev/log:/dev/log --volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro --volume=/etc/puppet:/etc/puppet:ro --volume=/var/lib/config-data/ceilometer/etc/ceilometer/:/etc/ceilometer/:ro --volume=/var/log/containers/ceilometer:/var/log/ceilometer:z --cpuset-cpus=0,1,2,3,4,5,6,7 192.168.24.1:8787/tripleotrain/rhel-binary-ceilometer-central:a8589c8a36e9984c5744c00a528d12bfe2c33e59_39b0634f-updated-20191212171530 /usr/bin/bootstrap_host_exec ceilometer_agent_central su ceilometer -s /bin/bash -c 'for n in {1..10}; do /usr/bin/ceilometer-upgrade && exit 0 || sleep 30; done; exit 1'
2019-12-12 22:56:43.366 102944 INFO paunch [ ] stdout:
2019-12-12 22:56:43.366 102944 INFO paunch [ ] stderr:
2019-12-12 22:56:43.366 102944 ERROR paunch [ ] The following containers failed validations and were not started: collectd
We are seeing the same issue on master:
http://logs.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-scenario001-standalone-master/c7ec105/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz
and
http://logs.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-scenario001-standalone-master/c7ec105/logs/undercloud/var/log/paunch.log.txt.gz
https://review.opendev.org/#/c/697666/ and https://review.opendev.org/#/c/698570/1 were added to paunch to fix this bug: https://bugs.launchpad.net/tripleo/+bug/1855444
2019-12-12 22:56:05.161 102944 DEBUG paunch [ ] Running container: collectd
2019-12-12 22:56:05.228 102944 DEBUG paunch [ ] $ podman ps -a --filter label=container_name=collectd --filter label=config_id=tripleo_step5 --format {{.Names}}
2019-12-12 22:56:05.328 102944 DEBUG paunch [ ] b''
2019-12-12 22:56:05.328 102944 DEBUG paunch [ ] b''
2019-12-12 22:56:05.328 102944 WARNING paunch [ ] Did not find container with "['podman', 'ps', '-a', '--filter', 'label=container_name=collectd', '--filter', 'label=config_id=tripleo_step5', '--format', '{{.Names}}']" - retrying without config_id
2019-12-12 22:56:05.328 102944 DEBUG paunch [ ] $ podman ps -a --filter label=container_name=collectd --format {{.Names}}
2019-12-12 22:56:05.432 102944 DEBUG paunch [ ] b''
2019-12-12 22:56:05.432 102944 DEBUG paunch [ ] b''
2019-12-12 22:56:05.432 102944 WARNING paunch [ ] Did not find container with "['podman', 'ps', '-a', '--filter', 'label=container_name=collectd', '--format', '{{.Names}}']"
2019-12-12 22:56:05.433 102944 DEBUG paunch [ ] Start container collectd as collectd.
2019-12-12 22:56:05.434 102944 DEBUG paunch [ ] Path seperator found in volume (/var/log/journal), but did not exist on the file system
2019-12-12 22:56:05.434 102944 ERROR paunch [ ] /var/log/journal is not a valid volume source
2019-12-12 22:56:05.434 102944 DEBUG paunch [ ] Validations failed. Skipping container: collectd
So apparently it wants to bind-mount /var/log/journal, but that path doesn't exist on the host filesystem?!
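For illustration only, here is a rough Python sketch of the kind of check the log messages above suggest. This is not paunch's actual code: the function name invalid_volume_sources and its exists parameter are hypothetical, and the rule that only sources containing a path separator are checked is an assumption inferred from the "Path seperator found in volume" message.

```python
import os


def invalid_volume_sources(volumes, exists=os.path.exists):
    """Return host-side bind-mount sources that are missing on the host.

    Hypothetical helper, not part of paunch. Each entry in `volumes` is a
    "source:target[:options]" string as passed to `podman run --volume`.
    """
    missing = []
    for volume in volumes:
        source = volume.split(":", 1)[0]
        # Assumption: a source without "/" is a named volume, so only
        # path-like sources are checked against the host filesystem.
        if "/" in source and not exists(source):
            missing.append(source)
    return missing


if __name__ == "__main__":
    collectd_volumes = ["/var/log/journal:/var/log/journal:ro"]
    for source in invalid_volume_sources(collectd_volumes):
        print(f"{source} is not a valid volume source")
```

On a host without /var/log/journal, a check of this shape would report the path as missing, which is consistent with collectd being skipped at step 5.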
Sounds related to:
2742319ba7 (Martin Magr 2019-09-13 14:46:33 +0200 634) - /var/log/journal:/var/log/journal:ro
https://review.opendev.org/682039