ERROR paunch [ ] Error running ['podman', 'run', '--name', 'rabbitmq_init_bundle', '--label', 'config_id=tripleo_step2', '--label', 'container_name=rabbitmq_init_bundle' failing periodic tripleo-ci-centos-8 scenario004-standalone-train and ovb-1ctlr_1comp-featureset002-train

Bug #1879292 reported by Bhagyashri Shewale
This bug affects 1 person
Affects: tripleo
Status: Expired
Importance: Critical
Assigned to: Unassigned
Milestone: none

Bug Description

Error log:

2020-05-17 06:01:49.562 64497 WARNING paunch [ ] Did not find container with "['podman', 'ps', '-a', '--filter', 'label=container_name=rabbitmq_init_bundle', '--format', '{{.Names}}']"
2020-05-17 06:34:31.420 64497 ERROR paunch [ ] Error running ['podman', 'run', '--name', 'rabbitmq_init_bundle', '--label', 'config_id=tripleo_step2', '--label', 'container_name=rabbitmq_init_bundle', '--label', 'managed_by=tripleo-Standalone', '--label', 'config_data={"command": ["/container_puppet_apply.sh", "2", "file,file_line,concat,augeas,pacemaker::resource::bundle,pacemaker::property,pacemaker::resource::ocf,pacemaker::constraint::order,pacemaker::constraint::colocation,rabbitmq_policy,rabbitmq_user,rabbitmq_ready", "include ::tripleo::profile::base::pacemaker;include ::tripleo::profile::pacemaker::rabbitmq_bundle", ""], "detach": false, "environment": {"KOLLA_BOOTSTRAP": true, "KOLLA_CONFIG_STRATEGY": "COPY_ALWAYS", "RABBITMQ_CLUSTER_COOKIE": "KMGRMyJyETWffCWHOihF", "TRIPLEO_DEPLOY_IDENTIFIER": "1589693907"}, "image": "192.168.24.1:8787/tripleotraincentos8/centos-binary-rabbitmq:79862dd7652d8856e954f5db056bc0d34bafc64f_28315d20-updated-20200517053321", "ipc": "host", "net": "host", "start_order": 0, "user": "root", "volumes": ["/etc/hosts:/etc/hosts:ro", "/etc/localtime:/etc/localtime:ro", "/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro", "/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro", "/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro", "/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro", "/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro", "/dev/log:/dev/log", "/var/lib/container-config-scripts/container_puppet_apply.sh:/container_puppet_apply.sh:ro", "/etc/puppet:/tmp/puppet-etc:ro", "/usr/share/openstack-puppet/modules:/usr/share/openstack-puppet/modules:ro", "/bin/true:/bin/epmd"]}', '--conmon-pidfile=/var/run/rabbitmq_init_bundle.pid', '--log-driver', 'k8s-file', '--log-opt', 'path=/var/log/containers/stdouts/rabbitmq_init_bundle.log', '--env=KOLLA_BOOTSTRAP=True', '--env=KOLLA_CONFIG_STRATEGY=COPY_ALWAYS', '--env=RABBITMQ_CLUSTER_COOKIE=KMGRMyJyETWffCWHOihF', 
'--env=TRIPLEO_DEPLOY_IDENTIFIER=1589693907', '--net=host', '--ipc=host', '--user=root', '--volume=/etc/hosts:/etc/hosts:ro', '--volume=/etc/localtime:/etc/localtime:ro', '--volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume=/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro', '--volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '--volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '--volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume=/dev/log:/dev/log', '--volume=/var/lib/container-config-scripts/container_puppet_apply.sh:/container_puppet_apply.sh:ro', '--volume=/etc/puppet:/tmp/puppet-etc:ro', '--volume=/usr/share/openstack-puppet/modules:/usr/share/openstack-puppet/modules:ro', '--volume=/bin/true:/bin/epmd', '--cpuset-cpus=0,1,2,3,4,5,6,7', '192.168.24.1:8787/tripleotraincentos8/centos-binary-rabbitmq:79862dd7652d8856e954f5db056bc0d34bafc64f_28315d20-updated-20200517053321', '/container_puppet_apply.sh', '2', 'file,file_line,concat,augeas,pacemaker::resource::bundle,pacemaker::property,pacemaker::resource::ocf,pacemaker::constraint::order,pacemaker::constraint::colocation,rabbitmq_policy,rabbitmq_user,rabbitmq_ready', 'include ::tripleo::profile::base::pacemaker;include ::tripleo::profile::pacemaker::rabbitmq_bundle', '']. [6]

2020-05-17 06:34:31.422 64497 ERROR paunch [ ] stdout: Info: Loading facts

Affected jobs:

1. periodic-tripleo-ci-centos-8-scenario004-standalone-train
2. periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset002-train

Reference links:

https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario004-standalone-train/2a37116/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz

https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario004-standalone-train/2a37116/logs/undercloud/var/log/extra/errors.txt.txt.gz

https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario004-standalone-train/2a37116/logs/undercloud/var/log/paunch.log.txt.gz

https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset002-train/4590af3/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz

Tags: alert ci
summary: ERROR paunch [ ] Error running ['podman', 'run', '--name',
'rabbitmq_init_bundle', '--label', 'config_id=tripleo_step2', '--label',
'container_name=rabbitmq_init_bundle' failing periodic tripleo-ci-
- centos-8-scenario004-standalone-train
+ centos-8 scenario004-standalone-train and ovb-1ctlr_1comp-
+ featureset002-train
description: updated
Bogdan Dobrelya (bogdando) wrote :

https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario004-standalone-train/2a37116/logs/undercloud/var/log/paunch.log.txt.gz

+ puppet apply --verbose --detailed-exitcodes --summarize --color=false --modulepath /etc/puppet/modules:/opt/stack/puppet-modules:/usr/share/openstack-puppet/modules --tags file,file_line,concat,augeas,pacemaker::resource::bundle,pacemaker::property,pacemaker::resource::ocf,pacemaker::constraint::order,pacemaker::constraint::colocation,rabbitmq_policy,rabbitmq_user,rabbitmq_ready -e 'noop_resource('\''package'\''); include ::tripleo::profile::base::pacemaker;include ::tripleo::profile::pacemaker::rabbitmq_bundle'
Error: Facter: error while resolving custom fact "rabbitmq_nodename": undefined method `[]' for nil:NilClass
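
For context on the trailing [6] in the paunch error in the bug description: the puppet apply command is run with --detailed-exitcodes, where, per the Puppet documentation, 2 means resources changed, 4 means resources failed, and 6 means both. The Python sketch below just encodes that mapping. The helper names describe_puppet_exit and has_failures are hypothetical, and reading [6] as the container's exit code is an assumption, not something the log states outright.

```python
# Meanings of `puppet apply --detailed-exitcodes` return codes,
# as documented for the Puppet command line.
PUPPET_DETAILED_EXIT_CODES = {
    0: "succeeded with no changes or failures",
    1: "run failed",
    2: "succeeded, some resources changed",
    4: "succeeded, some resources failed",
    6: "succeeded, with both changes and failures",
}


def describe_puppet_exit(code: int) -> str:
    """Return a readable meaning for a detailed exit code (hypothetical helper)."""
    return PUPPET_DETAILED_EXIT_CODES.get(code, f"unknown exit code {code}")


def has_failures(code: int) -> bool:
    """True when the run failed outright or reported failed resources."""
    return code in (1, 4, 6)


print(describe_puppet_exit(6))
```

Under that reading, exit code 6 fits a run in which some resources were applied while at least one failed.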

Michele Baldessari (michele) wrote :

Facter errors are benign. The issue is the following:
2020-05-17T06:34:31.158273146+00:00 stderr F Error: 'rabbitmqctl status | grep -F "{rabbit,"' returned 1 instead of one of [0]

I.e. it seems this never returns 0. It tries:
2020-05-17T06:02:19.763745454+00:00 stdout F Notice: /Stage[main]/Tripleo::Profile::Pacemaker::Rabbitmq_bundle/Pacemaker::Resource::Ocf[rabbitmq]/Pcmk_resource[rabbitmq]/ensure: created
2020-05-17T06:34:31.158273146+00:00 stderr F Error: 'rabbitmqctl status | grep -F "{rabbit,"' returned 1 instead of one of [0]
2020-05-17T06:34:31.165014450+00:00 stderr F Error: /Stage[main]/Tripleo::Profile::Pacemaker::Rabbitmq_bundle/Exec[rabbitmq-ready]/returns: change from 'notrun' to ['0'] failed: 'rabbitmqctl status | grep -F "{rabbit,"' returned 1 instead of one of [0]
2020-05-17T06:34:31.167860042+00:00 stdout F Notice: /Stage[main]/Tripleo::Profile::Pacemaker::Rabbitmq_bundle/Rabbitmq_policy[ha-all@/]: Dependency Exec[rabbitmq-ready] has failures: true

So it retried for about 32 minutes but never returned 0.
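
The 32-minute figure can be checked from the two container log timestamps quoted above (resource created, then the final Exec failure). A small Python sketch, assuming the timestamps are RFC 3339 with nanosecond precision; parse_log_ts is a hypothetical helper that truncates the fraction to microseconds, because strptime's %f accepts at most six digits:

```python
import re
from datetime import datetime


def parse_log_ts(ts: str) -> datetime:
    """Parse a timestamp like 2020-05-17T06:02:19.763745454+00:00 (hypothetical helper)."""
    match = re.fullmatch(r"(.+?)\.(\d+)([+-]\d{2}:\d{2})", ts)
    if match is None:
        raise ValueError(f"unexpected timestamp format: {ts!r}")
    head, fraction, offset = match.groups()
    # Keep only microseconds; the extra nanosecond digits are dropped.
    return datetime.strptime(f"{head}.{fraction[:6]}{offset}", "%Y-%m-%dT%H:%M:%S.%f%z")


# Timestamps copied from the log lines quoted in this comment.
created = parse_log_ts("2020-05-17T06:02:19.763745454+00:00")
failed = parse_log_ts("2020-05-17T06:34:31.158273146+00:00")

elapsed = failed - created
minutes, seconds = divmod(int(elapsed.total_seconds()), 60)
print(f"elapsed: {minutes} min {seconds} s")
```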
rabbitmq seems to be up:
2020-05-17T06:02:34.208968474+00:00 stderr F (log_finished) info: finished - rsc:rabbitmq action:start call_id:13 pid:208 exit-code:0 exec-time:9838ms queue-time:0ms

https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario004-standalone-<email address hidden>
2020-05-17 06:02:31.538 [info] <0.818.0> Starting worker pool 'management_worker_pool' with 3 processes in it
2020-05-17 06:02:31.551 [notice] <0.105.0> Changed loghwm of /<email address hidden> to 50
2020-05-17 06:02:31.789 [info] <0.8.0> Server startup complete; 3 plugins started.
 * rabbitmq_management
 * rabbitmq_management_agent
 * rabbitmq_web_dispatch

So now we need to figure out what the output of 'rabbitmqctl status' actually is.
Either somebody gives me a live environment, or we enable puppet debug (I checked and it is not currently set to true: https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario004-standalone-train/2a37116/logs/undercloud/var/log/extra/podman/containers/rabbitmq_init_bundle/podman_info.log.txt.gz).
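
One way to see what the check is matching against is to capture the command output before searching it. The Python sketch below is an illustration only, not what puppet-tripleo does: it assumes rabbitmqctl is on PATH inside the rabbitmq container, and the names status_has_marker and check_rabbitmq_ready are hypothetical. The marker string is the one from the failing Exec[rabbitmq-ready] command.

```python
import subprocess

MARKER = "{rabbit,"  # the fixed string that the failing `grep -F` looks for


def status_has_marker(status_output: str, marker: str = MARKER) -> bool:
    """Mimic `grep -F marker`: succeed if any line contains the fixed string."""
    return any(marker in line for line in status_output.splitlines())


def check_rabbitmq_ready() -> bool:
    """Run `rabbitmqctl status`, print its raw output, then apply the marker check.

    As in the shell pipeline, only the marker search decides the result;
    rabbitmqctl's own return code is printed but not used.
    """
    result = subprocess.run(
        ["rabbitmqctl", "status"],
        capture_output=True,
        text=True,
        check=False,
    )
    print(f"rabbitmqctl exit code: {result.returncode}")
    print(result.stdout)
    print(result.stderr)
    return status_has_marker(result.stdout)


# Usage, inside the container (assumption: rabbitmqctl is installed there):
#     ready = check_rabbitmq_ready()
```

If the printed output turns out not to be in the Erlang-term format the marker expects (for example because the installed rabbitmqctl formats its status differently), the check would never match even with a healthy broker. That is a hypothesis to verify on a live environment, not a finding from these logs.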

Changed in tripleo:
importance: Medium → Critical
wes hayutin (weshayutin) wrote :
Changed in tripleo:
milestone: none → ussuri-rc3
Rafael Folco (rafaelfolco) wrote :
wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-rc3 → victoria-1
Changed in tripleo:
milestone: victoria-1 → victoria-3
Changed in tripleo:
milestone: victoria-3 → wallaby-1
Changed in tripleo:
milestone: wallaby-1 → wallaby-2
Changed in tripleo:
milestone: wallaby-2 → wallaby-3
Changed in tripleo:
milestone: wallaby-3 → wallaby-rc1
Changed in tripleo:
milestone: wallaby-rc1 → xena-1
Changed in tripleo:
milestone: xena-1 → xena-2
Marios Andreou (marios-b) wrote :

This is an automated action. Bug status has been set to 'Incomplete' and target milestone has been removed due to inactivity. If you disagree please re-set these values and reach out to us on freenode #tripleo

Changed in tripleo:
milestone: xena-2 → none
status: Triaged → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for tripleo because there has been no activity for 60 days.]

Changed in tripleo:
status: Incomplete → Expired