relabel failed /var/lib/config-data: no such file or directory

Bug #1800737 reported by wes hayutin
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Emilien Macchi

Bug Description

2018-10-30 21:13:37.120 17849 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] "Storing signatures",
2018-10-30 21:13:37.120 17849 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] "panic: runtime error: invalid memory address or nil pointer dereference",
2018-10-30 21:13:37.120 17849 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] "[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xc46e67]",
2018-10-30 21:13:37.120 17849 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] "goroutine 1 [running]:",
2018-10-30 21:13:37.120 17849 WARNING tripleoclient.v1.tripleo_deploy.Deploy [ ] "github.com/containers/libpod/vendor/github.com/containers/image/storage.(*storageImageDestination).Commit(0xc4203bc000, 0x13eee40, 0xc420034140, 0x0, 0x0)",

http://logs.openstack.org/40/613640/1/gate/tripleo-ci-centos-7-containers-multinode/e212bb8/logs/undercloud/home/zuul/install-undercloud.log.txt.gz#_2018-10-30_21_13_37_120

http://logs.openstack.org/58/609858/6/gate/tripleo-ci-centos-7-containers-multinode/27bedd3/logs/undercloud/home/zuul/install-undercloud.log.txt.gz

Emilien filed the podman issue against libpod on GitHub:
https://github.com/containers/libpod/issues/1730

Tags: tech-debt
wes hayutin (weshayutin) wrote :
Changed in tripleo:
milestone: stein-1 → stein-2
Emilien Macchi (emilienm) wrote :

The actual bug in your links is:
    "2018-10-30 20:07:57,926 ERROR: 9471 -- relabel failed \"/var/lib/config-data\": no such file or directory",

I thought it was related to podman pull, but it is not. The podman bug is a red herring: it will be fixed separately, but it doesn't block us because we have retries. Take a look at the logs: the podman pull eventually succeeds after retrying.

The actual problem is this config-data directory thing.
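
To illustrate why a transient podman pull crash does not block the deployment when retries are in place, here is a minimal sketch of a retry loop. It is not the TripleO implementation: the helper name `pull_with_retries` and its `attempts`, `delay`, and `run` parameters are hypothetical and chosen only for this example.

```python
import subprocess
import time


def pull_with_retries(image, attempts=3, delay=1.0, run=subprocess.call):
    """Hypothetical helper: retry `podman pull` until it exits with 0.

    Returns the number of the attempt that succeeded. `run` is injectable
    so the loop can be exercised without podman installed.
    """
    for attempt in range(1, attempts + 1):
        # A non-zero exit code (for example after a panic) triggers a retry.
        if run(["podman", "pull", image]) == 0:
            return attempt
        if attempt < attempts:
            time.sleep(delay)
    raise RuntimeError(
        "podman pull %s failed after %d attempts" % (image, attempts)
    )
```

With a loop like this, a pull that crashes once or twice and then succeeds only shows up as warnings in the log, which matches what the linked logs show.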

summary: - podman, "panic: runtime error: invalid memory address or nil pointer
- dereference",
+ relabel failed /var/lib/config-data: no such file or directory
Cédric Jeanneret (cjeanner) wrote :

So this looks like a race condition:
step1 does create the config-data directory here:
https://github.com/openstack/tripleo-heat-templates/blob/master/common/deploy-steps-tasks.yaml#L246

It should probably be in the host_prep_tasks, but I don't feel like adding this directory in every service template. Is there a way to push a host_prep_tasks in the "common" directory? After all, we actually NEED this directory on every node.

Cédric Jeanneret (cjeanner) wrote :

Update: this is NOT a race condition. The directory is created well before the failure, and other containers are launched before that point and work fine.

Still digging...

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-quickstart (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/614537

Emilien Macchi (emilienm) wrote :
Changed in tripleo:
assignee: nobody → Emilien Macchi (emilienm)
tags: added: tech-debt
removed: alert promotion-blocker
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/614639

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-quickstart (master)

Reviewed: https://review.openstack.org/614537
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart/commit/?id=de5ca4162e2da9643efedd840d09c98ef1597b34
Submitter: Zuul
Branch: master

commit de5ca4162e2da9643efedd840d09c98ef1597b34
Author: Emilien Macchi <email address hidden>
Date: Wed Oct 31 11:01:00 2018 -0400

    (squash) disabling podman everywhere in gate

    Until we fix the situation, we want to remove podman from our gate.

    Revert "fs010: switch undercloud to podman"

    This reverts commit 39d1da5267724244d4d1f0bfabae187e4bfcfa92.

    Revert "fs050: upgrade the undercloud to Podman containers"

    This reverts commit ab6cbcb0cea3c296b6ee5b873c9b9c08ba8285b2.

    Revert "Switch fs027 to deploy with podman"

    This reverts commit f77771843fe60d370135ce345b680be0209082e4.

    Change-Id: I3715a0432ead1eb1d18deb5893858e051a0b5539
    Related-Bug: #1800737

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-quickstart (master)

Fix proposed to branch: master
Review: https://review.openstack.org/614664

Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/614825

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/614825
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=1b8618159e54d7d43861c12f7a48d4fbeea847fd
Submitter: Zuul
Branch: master

commit 1b8618159e54d7d43861c12f7a48d4fbeea847fd
Author: Emilien Macchi <email address hidden>
Date: Thu Nov 1 14:25:57 2018 -0400

    docker-puppet: remove -z from /var/lib/config-data mount

    Context: https://github.com/containers/libpod/issues/1739
    The relabeling of /var/lib/config-data fails with Podman and since we
    run the docker-puppet containers with label=disable, we shouldn't need
    to relabel it anyway.

    Change-Id: I5e0715a9b7a052126fb01c8d2c3da36a38bec2bf
    Related-Bug: #1800737
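
For illustration only, the idea in this commit message can be sketched as follows. This is not the docker-puppet code: `build_run_args` and its parameters are hypothetical, and the sketch assumes the "-z" in the commit title refers to the `z` volume option that requests SELinux relabeling. When the container runs with `--security-opt label=disable`, the sketch leaves the relabel option off the mount.

```python
def build_run_args(image, volumes, label_disabled=True):
    """Hypothetical helper that builds a `podman run` argument list.

    `volumes` is a list of (host_path, container_path, options) tuples,
    where options is a list such as ["ro", "z"].
    """
    args = ["podman", "run", "--rm"]
    if label_disabled:
        # SELinux label separation is turned off for this container.
        args += ["--security-opt", "label=disable"]
    for host_path, container_path, options in volumes:
        # Skip the relabel options (z/Z) when labeling is disabled anyway.
        kept = [o for o in options if not (label_disabled and o in ("z", "Z"))]
        spec = "%s:%s" % (host_path, container_path)
        if kept:
            spec += ":" + ",".join(kept)
        args += ["--volume", spec]
    args.append(image)
    return args
```

Calling it with `[("/var/lib/config-data", "/var/lib/config-data", ["z"])]` yields a plain `/var/lib/config-data:/var/lib/config-data` mount, so no relabel is requested for that directory.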

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (master)

Change abandoned by Emilien Macchi (<email address hidden>) on branch: master
Review: https://review.openstack.org/614639
Reason: it was a bad idea.

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-quickstart (master)

Reviewed: https://review.openstack.org/614664
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart/commit/?id=9989134abcd9bae00d9863203025c9963ba901b5
Submitter: Zuul
Branch: master

commit 9989134abcd9bae00d9863203025c9963ba901b5
Author: Emilien Macchi <email address hidden>
Date: Thu Nov 1 01:21:35 2018 +0000

    Revert "(squash) disabling podman everywhere in gate"

    This reverts commit de5ca4162e2da9643efedd840d09c98ef1597b34.

    Closes-Bug: #1800737
    Depends-On: I5e0715a9b7a052126fb01c8d2c3da36a38bec2bf
    Change-Id: I23a50cff9ee8b052fe2352ee4b4c70427bfd4811

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-quickstart 2.1.1

This issue was fixed in the openstack/tripleo-quickstart 2.1.1 release.
