Ironic PXE healthcheck is broken in master

Bug #1856191 reported by Damien Ciabrini
This bug affects 2 people
Affects: tripleo
Status: Fix Released
Importance: Undecided
Assigned to: Emilien Macchi

Bug Description

Seen in various job failures, e.g.:
https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_702/651207/4/check/tripleo-ci-centos-7-undercloud-containers/70251d9/logs/undercloud/var/log/extra/failed_services.txt.gz

Dec 11 21:57:32 undercloud.localdomain systemd[1]: Starting tripleo_ironic_pxe_tftp healthcheck...
Dec 11 21:57:34 undercloud.localdomain podman[175256]: 2019-12-11 21:57:34.031867406 +0000 UTC m=+1.908984798 container exec c472357ee7db823ab5a82d98af421ca658e4751bd4c5c310c49d98b5f8acdadc (image=192.168.24.1:8787/tripleomaster/centos-binary-ironic-pxe:38c4e3104abdeb4699cfbe7a78fa2f37d7a863b4_93bde36c-updated-20191211203849, name=ironic_pxe_tftp)
Dec 11 21:57:34 undercloud.localdomain podman[175256]: curl: (6) Could not resolve host: nil; Unknown error
Dec 11 21:57:34 undercloud.localdomain podman[175256]: 000 :0 0.029 seconds
Dec 11 21:57:34 undercloud.localdomain podman[175256]: Error: non zero exit code: 1: OCI runtime error

The healthcheck fails because of a call to hiera:

    bind_host=$(hiera ironic::pxe::tftp_bind_host)

Running hiera without "-c" returns nil, so curl ends up trying to resolve a host literally named "nil" (see the log above).
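As a minimal sketch of the failure mode described above, the lookup can be wrapped so that an unresolved key stops the healthcheck early instead of handing "nil" to curl. The `-c /etc/puppet/hiera.yaml` option and the key name come from this report; the helper names (`lookup_hiera`, `is_resolved`, `resolve_or_fail`) are hypothetical and are not taken from tripleo-common.

```shell
#!/bin/sh
# Sketch only: helper names below are hypothetical, not from tripleo-common.

# Config path taken from the fix described in this bug.
HIERA_CONFIG=/etc/puppet/hiera.yaml

# Look up a key, passing the config file explicitly with -c.
lookup_hiera() {
    hiera -c "$HIERA_CONFIG" "$1"
}

# Succeed only if the value is non-empty and not the literal string "nil".
is_resolved() {
    [ -n "$1" ] && [ "$1" != "nil" ]
}

# Print the resolved value, or fail with a message on stderr.
resolve_or_fail() {
    value=$(lookup_hiera "$1") || return 1
    if ! is_resolved "$value"; then
        echo "hiera could not resolve $1" >&2
        return 1
    fi
    printf '%s\n' "$value"
}

# Possible use inside a healthcheck script:
#   bind_host=$(resolve_or_fail ironic::pxe::tftp_bind_host) || exit 1
```

The guard is an extra safety net assumed for illustration; the merged fix itself only adds the `-c` option to the hiera call.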

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (master)

Fix proposed to branch: master
Review: https://review.opendev.org/698719

Changed in tripleo:
assignee: emili.pv@gmail.com (emili) → Emilien Macchi (emilienm)
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (master)

Reviewed: https://review.opendev.org/698719
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=22257657771bde1d10f36e6b3385fe415fc8f458
Submitter: Zuul
Branch: master

commit 22257657771bde1d10f36e6b3385fe415fc8f458
Author: Emilien Macchi <email address hidden>
Date: Thu Dec 12 08:39:40 2019 -0500

    Fix ironic-pxe container healthcheck

    When calling Hiera, we need to use the following option or the parameter
    isn't resolved:
    -c /etc/puppet/hiera.yaml

    It'll fix the ironic pxe healthcheck.
    Co-Authored-By: Damien Ciabrini <email address hidden>
    Closes-Bug: #1856191

    Change-Id: I79f5cc3010eed9a721eec61441a749ff984097f3

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/698851

Marios Andreou (marios-b) wrote :

Filed a train gate blocker in https://bugs.launchpad.net/tripleo/+bug/1856288, but I may end up marking it as a duplicate. It at least seems to be the same issue, but I don't understand how: if hiera was the problem, why does it only happen some of the time, at least on train?

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/train)

Reviewed: https://review.opendev.org/698851
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=8488e44df0cf7675d01a66bbb761cd9fc3390f4b
Submitter: Zuul
Branch: stable/train

commit 8488e44df0cf7675d01a66bbb761cd9fc3390f4b
Author: Emilien Macchi <email address hidden>
Date: Thu Dec 12 08:39:40 2019 -0500

    Fix ironic-pxe container healthcheck

    When calling Hiera, we need to use the following option or the parameter
    isn't resolved:
    -c /etc/puppet/hiera.yaml

    It'll fix the ironic pxe healthcheck.
    Co-Authored-By: Damien Ciabrini <email address hidden>
    Closes-Bug: #1856191

    Change-Id: I79f5cc3010eed9a721eec61441a749ff984097f3
    (cherry picked from commit 22257657771bde1d10f36e6b3385fe415fc8f458)

tags: added: in-stable-train
mathieu bultel (mat-bultel) wrote :

I think it's a timing issue, that's why we don't see it each time.
There is probably another issue behind that.

I'm going to debug that asap.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 11.3.2

This issue was fixed in the openstack/tripleo-common 11.3.2 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 12.1.0

This issue was fixed in the openstack/tripleo-common 12.1.0 release.
