libvirt pod on CrashLoopBackOff state due to unreachable hugetlb sysfs.

Bug #1824567 reported by Maria Guadalupe Perez Ibara on 2019-04-12
This bug affects 2 people
Affects: StarlingX
Status: Fix Released
Importance: Critical
Assigned to: Al Bailey

Bug Description

Brief Description
-----------------
Sanity is failing: the libvirt pod is in CrashLoopBackOff state because the hugetlb sysfs is unreachable.

Severity
--------
Critical

Steps to Reproduce
------------------
kubectl get pods --all-namespaces | egrep -v "Running|Completed"
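
The egrep filter above matches anywhere on the line, so it also prints the header row. A slightly stricter variant, offered only as a sketch, compares the STATUS column directly. It assumes the default column layout of `kubectl get pods --all-namespaces` (NAMESPACE, NAME, READY, STATUS, ...); the function name `not_ready_pods` is hypothetical and not part of the original report.

```shell
# Print namespace, pod name and status for every pod whose STATUS column
# (4th field, assuming the default kubectl column layout) is neither
# Running nor Completed. The header row (first line) is skipped.
not_ready_pods() {
    awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1, $2, $4 }'
}

# Usage on a live system (requires kubectl access to the cluster):
#   kubectl get pods --all-namespaces | not_ready_pods
```

With the failure described in this report, the libvirt pod would be expected to appear in the output with status CrashLoopBackOff.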

Expected Behavior
------------------
All Pods: Running or Completed

Actual Behavior
----------------
libvirt-libvirt-default-9xhvr : CrashLoopBackOff

Reproducibility
---------------
100%

System Configuration
--------------------
simplex, duplex, Standard - Local Storage, Standard - Dedicated Storage

Branch/Pull Time/Commit
-----------------------
iso : 20190412T013000Z

Timestamp/Logs
--------------
Evidence.log

Test Activity
-------------
Sanity

Reviewed: https://review.openstack.org/652149
Committed: https://git.openstack.org/cgit/openstack/stx-config/commit/?id=bdf1d603b0b75a217d5d98d153ab0f45a3f51ffe
Submitter: Zuul
Branch: master

commit bdf1d603b0b75a217d5d98d153ab0f45a3f51ffe
Author: Al Bailey <email address hidden>
Date: Fri Apr 12 15:49:29 2019 -0500

    Fix an issue with libvirt pods not starting

    The cgroup folder has been changed by commit:
    https://review.openstack.org/#/c/648511/

    As a result the new hugepage folder was being created at:
    /sys/fs/cgroup/hugetlb/k8s-infra
    However the helm-chart default location was still looking at:
    /sys/fs/cgroup/hugetlb/kubepods

    The k8s-infra label for the cgroup folder has now been added
    to the armada manifest, and libvirt pods are able to launch.

    Closes-Bug: 1824567
    Change-Id: I3f420dc4643b37f56cec3b38449ca9b0d3b8fe4f
    Signed-off-by: Al Bailey <email address hidden>
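
The path mismatch described in the commit message can be checked on a host with a small sketch like the one below. It only tests which of the two directory names from the commit message (`k8s-infra` or `kubepods`) exists under the hugetlb cgroup root. The function name `hugetlb_cgroup_name` and its optional root argument are hypothetical, added here for illustration.

```shell
# Report which of the two cgroup names mentioned in the commit message
# exists under a hugetlb cgroup root (default: /sys/fs/cgroup/hugetlb).
# Prints the name and returns 0 if found; returns 1 otherwise.
hugetlb_cgroup_name() {
    root="${1:-/sys/fs/cgroup/hugetlb}"
    for name in k8s-infra kubepods; do
        if [ -d "$root/$name" ]; then
            echo "$name"
            return 0
        fi
    done
    echo "neither k8s-infra nor kubepods found under $root" >&2
    return 1
}

# Usage on an affected host:
#   hugetlb_cgroup_name
```

If this prints `k8s-infra` while the libvirt chart is still configured for `kubepods`, the pod is looking at a path that does not exist, which matches the failure reported here.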

Changed in starlingx:
status: New → Fix Released
Ghada Khalil (gkhalil) wrote :

Marking as release gating; critical as it blocks sanity.

Changed in starlingx:
assignee: nobody → Al Bailey (albailey1974)
importance: Undecided → Critical
tags: added: stx.2.0 stx.containers
Ghada Khalil (gkhalil) wrote :

As per Al Bailey, this issue was introduced by two recent commits:
https://review.openstack.org/#/c/648511/
https://review.openstack.org/#/c/645806/

The first commit contained a bug that became a fatal error once the second commit, which upversioned helm, went in. Because both commits were merged at the same time, the issue was not seen in individual designer testing.

Gerry Kopec (gerry-kopec) wrote :

Workaround:
system helm-override-update libvirt openstack --set conf.kubernetes.cgroup="k8s-infra"
system application-apply stx-openstack
