intermittent issue, Kubernetes bind mounts not created on boot

Bug #1989022 reported by Chris Friesen
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Kyle MacLeod

Bug Description

There is an intermittent issue where, on some boots, the two Kubernetes bind mounts for stage1 and stage2 are not created.

The following two bind mounts from /etc/fstab are not active after booting:

/usr/local/kubernetes/1.23.1/stage1 /usr/local/kubernetes/current/stage1 none rw,bind 0 0
/usr/local/kubernetes/1.23.1/stage2 /usr/local/kubernetes/current/stage2 none rw,bind 0 0
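
One way to confirm the symptom is to compare the bind entries in /etc/fstab against the kernel's mount table. The following is a minimal sketch, not part of the original report: the function names are hypothetical, the parsing is simplified (it ignores octal escapes such as \040 in paths), and the sample mount table line is made up for illustration.

```python
from typing import List, Set, Tuple


def bind_entries(fstab_text: str) -> List[Tuple[str, str]]:
    """Return (source, target) pairs for fstab entries whose options include 'bind'."""
    entries = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        source, target, _fstype, options = fields[:4]
        if "bind" in options.split(","):
            entries.append((source, target))
    return entries


def mounted_targets(mounts_text: str) -> Set[str]:
    """Return the mount points listed in /proc/self/mounts-style text."""
    targets = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            targets.add(fields[1])  # second field is the mount point
    return targets


def missing_bind_mounts(fstab_text: str, mounts_text: str) -> List[str]:
    """List fstab bind-mount targets that do not appear in the mount table."""
    active = mounted_targets(mounts_text)
    return [target for _source, target in bind_entries(fstab_text)
            if target not in active]


# Illustrative input: the fstab lines come from this report; the mount
# table line is a hypothetical example in which only stage1 is mounted.
SAMPLE_FSTAB = (
    "/usr/local/kubernetes/1.23.1/stage1 /usr/local/kubernetes/current/stage1 none rw,bind 0 0\n"
    "/usr/local/kubernetes/1.23.1/stage2 /usr/local/kubernetes/current/stage2 none rw,bind 0 0\n"
)
SAMPLE_MOUNTS = "/dev/sda3 /usr/local/kubernetes/current/stage1 ext4 rw,relatime 0 0\n"

for target in missing_bind_mounts(SAMPLE_FSTAB, SAMPLE_MOUNTS):
    print("not mounted:", target)
```

On a live system, the two sample strings would be replaced with the contents of /etc/fstab and /proc/self/mounts.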

Severity: Critical: System/Feature is not usable after the defect

Steps to Reproduce:

Bootstrap a system controller. The issue also occurs on subclouds.

Expected Behavior:

The bind mounts are active at /usr/local/kubernetes/current/stage1 and /usr/local/kubernetes/current/stage2.

Actual Behavior:

The directories /usr/local/kubernetes/current/stage{1,2} are not mounted. This affects the symlinks /usr/bin/kube{adm,ctl,let}, etc.

Reproducibility:

This is intermittent. Sometimes the mounts are created, sometimes not.

System Configuration:

This is a virtualized AIO-DX system running Debian in the distributedcloud role. Seen on both VirtualBox and libvirt.

Load info:

Recent Debian builds of both STX and WRCP

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote: Fix merged to stx-puppet (master)

Reviewed: https://review.opendev.org/c/starlingx/stx-puppet/+/852886
Committed: https://opendev.org/starlingx/stx-puppet/commit/3ed94d5c805ee2fb7e1e462b8a2e8839caf09e47
Submitter: "Zuul (22348)"
Branch: master

commit 3ed94d5c805ee2fb7e1e462b8a2e8839caf09e47
Author: Kyle MacLeod <email address hidden>
Date: Thu Aug 11 12:21:02 2022 -0400

    Add ostree-remount dependency on kubernetes bind mounts

    During system boot systemd parses /etc/fstab and converts
    the two fstab /usr/local/kubernetes/current/stage{1,2} entries
    into these units:

    - usr-local-kubernetes-current-stage1.mount
    - usr-local-kubernetes-current-stage2.mount

    Currently there's no dependency of these units on the
    ostree-remount systemd unit, and so intermittently we see
    an installation where the bind mounts don't exist.

    This commit adds a dependency on the ostree-remount systemd
    unit, in order to ensure that /usr is mounted before the
    K8s bind mounts that want to be "on top" of it.

    Closes-Bug: 1989022

    Test Plan:
    Tested system boot with added options, ensured that corresponding
    systemd mount units had "After=ostree-remount" options.

    Signed-off-by: Kyle MacLeod <email address hidden>
    Signed-off-by: Chris Friesen <email address hidden>
    Change-Id: I5843ab61790d2b77f776d6788301b7a1ed492e52
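
The mapping from fstab mount points to unit names, and the test plan's check for the "After=" ordering, can be sketched as follows. This is an illustrative sketch rather than code from the commit: the helper names are hypothetical, the escaping is a simplified version of systemd's path escaping rules, and the full unit name ostree-remount.service is an assumption (the commit message only says "ostree-remount"). The second helper parses text in the format printed by `systemctl show -p After <unit>`.

```python
def mount_unit_name(mount_point: str) -> str:
    """Derive a systemd .mount unit name from a mount point path.

    Simplified path escaping: '/' becomes '-', and characters other than
    ASCII alphanumerics, ':', '_' and '.' become \\xXX escapes.
    """
    # Drop leading/trailing slashes and collapse repeated ones.
    path = "/".join(part for part in mount_point.split("/") if part)
    if not path:
        return "-.mount"  # the root file system
    escaped = []
    for index, ch in enumerate(path):
        if ch == "/":
            escaped.append("-")
        elif (ch.isascii() and ch.isalnum()) or ch in ":_" or (ch == "." and index > 0):
            escaped.append(ch)
        else:
            # Escape every other character (including '-') byte by byte.
            escaped.append("".join("\\x%02x" % byte for byte in ch.encode("utf-8")))
    return "".join(escaped) + ".mount"


def ordered_after(show_output: str, dependency: str) -> bool:
    """Return True if 'After=' in `systemctl show` output lists the dependency."""
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key == "After":
            return dependency in value.split()
    return False


for stage in ("stage1", "stage2"):
    print(mount_unit_name("/usr/local/kubernetes/current/" + stage))
```

For the two mount points in this report, the helper yields the unit names listed in the commit message. On a booted system, the output of `systemctl show -p After usr-local-kubernetes-current-stage1.mount` could be passed to `ordered_after` to confirm the dependency.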

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
tags: added: stx.8.0 stx.config
Changed in starlingx:
importance: Undecided → Medium
assignee: nobody → Kyle MacLeod (kmacleod)