Master fs001 centos 7 jobs are timing out on overcloud deploy - log hidden

Bug #1857356 reported by Ronelle Landy
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: yatin

Bug Description

fs001 centos 7 master jobs in both periodic and check are timing out during overcloud deploy, most commonly in the following tasks:

 TASK [Run puppet host configuration for step 1] ********************************

or

 Write kolla config json

The log outputs are suppressed in these steps. The errors started on 12/21. Full logs are below:

http://logs.rdoproject.org/openstack-regular/opendev.org/openstack/tripleo-ci/master/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/215472a/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz

http://logs.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/b95cc90/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz

Note the timeout lines:

2019-12-23 05:25:00 | Ansible timed out at 5712 seconds.
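
Since the task output is suppressed, the timeout line is the main marker to look for in overcloud_deploy.log. A minimal sketch for pulling those lines out of a downloaded log follows; the regular expression is modeled only on the single line quoted above, and the function name `find_timeouts` is hypothetical, not part of any tripleo tooling.

```python
import re

# Pattern modeled on the one timeout line quoted in this report;
# other log formats may need a different expression (assumption).
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \| "
    r"Ansible timed out at (?P<secs>\d+) seconds\."
)


def find_timeouts(lines):
    """Return (timestamp, seconds) tuples for each Ansible timeout line."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.match(line.strip())
        if match:
            hits.append((match.group("ts"), int(match.group("secs"))))
    return hits


# Both sample lines are copied from this bug report.
sample = [
    " TASK [Run puppet host configuration for step 1] ********************************",
    "2019-12-23 05:25:00 | Ansible timed out at 5712 seconds.",
]
print(find_timeouts(sample))
```

For a real run, the gzipped log could be opened with `gzip.open(path, "rt")` and the file object passed in place of `sample`.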

Ronelle Landy (rlandy)
Changed in tripleo:
milestone: none → ussuri-1
importance: Undecided → Critical
status: New → Triaged
tags: added: promotion-blocker
tags: added: ci
Alex Schultz (alex-schultz) wrote :

pcs might be hanging

http://logs.rdoproject.org/openstack-regular/opendev.org/openstack/tripleo-ci/master/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/215472a/logs/overcloud-controller-0/var/log/pcsd/pcsd.log.txt.gz

I, [2019-12-23T13:20:34.664665 #32010] INFO -- : No response from: overcloud-controller-1 request: auth, error: operation_timedout
I, [2019-12-23T13:20:34.665776 #32010] INFO -- : No response from: overcloud-controller-2 request: auth, error: operation_timedout
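
To check whether the same pattern shows up on other controllers, the pcsd log can be scanned for these "No response from" entries. This is a rough sketch that assumes the line layout seen in the two entries above; the helper name `unreachable_nodes` is made up for illustration.

```python
import re

# Layout assumed from the two pcsd.log lines quoted above.
NO_RESPONSE_RE = re.compile(
    r"No response from: (?P<node>\S+) "
    r"request: (?P<request>[^,\s]+), error: (?P<error>\S+)"
)


def unreachable_nodes(lines):
    """Map each unresponsive node to the set of errors logged for it."""
    result = {}
    for line in lines:
        match = NO_RESPONSE_RE.search(line)
        if match:
            result.setdefault(match.group("node"), set()).add(match.group("error"))
    return result


# Sample lines copied from the pcsd.log excerpt in this comment.
sample = [
    "I, [2019-12-23T13:20:34.664665 #32010] INFO -- : No response from: "
    "overcloud-controller-1 request: auth, error: operation_timedout",
    "I, [2019-12-23T13:20:34.665776 #32010] INFO -- : No response from: "
    "overcloud-controller-2 request: auth, error: operation_timedout",
]
print(unreachable_nodes(sample))
```

Auth requests timing out against every peer controller, rather than being refused, would be consistent with traffic being dropped between the nodes, though the sketch itself only groups the log entries and does not diagnose the cause.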

Ronelle Landy (rlandy) wrote :

Possibly related to https://review.opendev.org/#/c/699318/ (and dependent patches).
See https://bugs.launchpad.net/tripleo/+bug/1856626, which merged on Dec 21.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.opendev.org/700439

Changed in tripleo:
assignee: nobody → Alex Schultz (alex-schultz)
status: Triaged → In Progress
Changed in tripleo:
assignee: Alex Schultz (alex-schultz) → yatin (yatinkarel)
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/700439
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=daef223cc31d0d67a3b8f6ad45a928e2fa063346
Submitter: Zuul
Branch: master

commit daef223cc31d0d67a3b8f6ad45a928e2fa063346
Author: Alex Schultz <email address hidden>
Date: Mon Dec 23 10:22:24 2019 -0700

    Fix pacemaker firewall rules

    We switched to ansible for firewall rule management but the pacemaker
    file wasn't properly converted.

    Depends-On: https://review.opendev.org/#/c/700472/
    Change-Id: Id7f2942436565015211dcfa19fe95adb454c667e
    Closes-Bug: #1857356

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 12.1.0

This issue was fixed in the openstack/tripleo-heat-templates 12.1.0 release.
