Containers multinode jobs fail on stable pike because of pacemaker

Bug #1771551 reported by Sagi (Sergey) Shnaidman
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Sagi (Sergey) Shnaidman
Tags: ci
Michele Baldessari (michele) wrote :

From http://logs.openstack.org/98/564698/2/check/tripleo-ci-centos-7-containers-multinode/22c050e/logs/undercloud/home/zuul/failed_deployment_list.log.txt.gz
"cibadmin: Connection to local file '/var/lib/pacemaker/cib/puppet-cib-backup20180516-8-16s5w2v' failed: Update does not conform to the configured schema",

So the rabbitmq container has (from http://logs.openstack.org/98/564698/2/check/tripleo-ci-centos-7-containers-multinode/22c050e/logs/undercloud/home/zuul/overcloud_prep_containers.log.txt.gz):
2018-05-16 08:09:25 | - imagename: docker.io/tripleopike/centos-binary-rabbitmq:d52ad67500aacdb4c2a1321363bfe87de4e6b518_88c9954e

So inside the container we have:
$ sudo docker run -it docker.io/tripleopike/centos-binary-rabbitmq:d52ad67500aacdb4c2a1321363bfe87de4e6b518_88c9954e /bin/bash -c "rpm -q pacemaker"
pacemaker-1.1.16-12.el7_4.8.x86_64

And on the host (http://logs.openstack.org/98/564698/2/check/tripleo-ci-centos-7-containers-multinode/22c050e/logs/subnode-2/var/log/host_info.txt.gz) we have:
pacemaker-1.1.18-11.el7.x86_64

So when pike was promoted on 2018-05-15, it did not get a new pacemaker package?
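
For reference, the manual comparison above (container package vs. host package) could be sketched in Python roughly as follows. This is only an illustration, not part of the CI tooling: the function names parse_nvra and describe_mismatch are hypothetical, the input is assumed to be the single line printed by `rpm -q pacemaker`, and the check is a plain equality test on version and release, not rpm's full version ordering.

```python
import re

# Assumed input format: name-version-release.arch, as printed by `rpm -q <name>`.
NVRA = re.compile(
    r"^(?P<name>.+)-(?P<version>[^-]+)-(?P<release>[^-]+)\.(?P<arch>[^.]+)$"
)


def parse_nvra(line):
    """Split an `rpm -q` output line into name, version, release and arch."""
    match = NVRA.match(line.strip())
    if match is None:
        raise ValueError(f"not an RPM name-version-release.arch string: {line!r}")
    return match.groupdict()


def describe_mismatch(host_pkg, container_pkg):
    """Return a warning string if host and container builds differ, else None."""
    host = parse_nvra(host_pkg)
    cont = parse_nvra(container_pkg)
    if host["name"] != cont["name"]:
        raise ValueError("cannot compare two different packages")
    host_vr = f"{host['version']}-{host['release']}"
    cont_vr = f"{cont['version']}-{cont['release']}"
    if host_vr == cont_vr:
        return None
    return f"{host['name']} mismatch: host has {host_vr}, container has {cont_vr}"


if __name__ == "__main__":
    # The two package strings quoted in this comment.
    print(describe_mismatch(
        "pacemaker-1.1.18-11.el7.x86_64",
        "pacemaker-1.1.16-12.el7_4.8.x86_64",
    ))
```

A real check would need to collect the two strings itself (for example from host_info.txt and from `docker run ... rpm -q pacemaker` as shown above) and would probably want to cover more packages than pacemaker.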

tags: added: alert
Matt Young (halcyondude) wrote :

# This is another instance of

https://bugs.launchpad.net/tripleo/+bug/1770692
potential for pacemaker version mismatch (BM vs.container) is a risk for OC deploy failures in gates

# Note RFE to identify this class of issue:

https://bugs.launchpad.net/tripleo/+bug/1771605
RFE: create canary and/or gating jobs for CI reproducer scripts (libvirt + ovb)

# this should be resolved by the next pike promotion

Matt Young (halcyondude) wrote :

paste fail in comment #2. Trying again.

---

# This is another instance of

https://bugs.launchpad.net/tripleo/+bug/1770692
potential for pacemaker version mismatch (BM vs.container) is a risk for OC deploy failures in gates

# Note RFE to identify this class of issue:

https://bugs.launchpad.net/tripleo/+bug/1771602
RFE: detect and warn when package versions in bare metal vs. container don't match

# this should be resolved by the next pike promotion

wes hayutin (weshayutin) wrote :

Noticed jobs in zuul running tempest, so this is probably fixed.

wes hayutin (weshayutin) wrote :

2018-05-16 17:15:01.610659 | primary | TASK [validate-tempest : Exit with tempest result code if configured] **********
2018-05-16 17:15:01.634144 | primary | Wednesday 16 May 2018 17:15:01 +0000 (0:00:04.164) 0:09:23.618 *********
2018-05-16 17:15:01.679972 | primary | skipping: [undercloud]
2018-05-16 17:15:01.706956 | primary |
2018-05-16 17:15:01.707266 | primary | PLAY RECAP *********************************************************************
2018-05-16 17:15:01.707529 | primary | subnode-2 : ok=3 changed=2 unreachable=0 failed=0
2018-05-16 17:15:01.707777 | primary | undercloud : ok=36 changed=22 unreachable=0 failed=0

Changed in tripleo:
status: Triaged → Fix Committed
tags: removed: alert
wes hayutin (weshayutin)
Changed in tripleo:
status: Fix Committed → Fix Released