Comment 24 for bug 1856078

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/721171
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=24a0284e3d182faac2b613ddb9f9f36c5ba3995a
Submitter: Zuul
Branch: master

commit 24a0284e3d182faac2b613ddb9f9f36c5ba3995a
Author: Robert Church <email address hidden>
Date: Sun Apr 19 06:22:50 2020 -0400

    Patch Tiller deployment to ensure self-recovery

    On node startup, there appears to be a race condition between when
    kubelet sees a pod and when kubelet sees a service. Due to this race,
    required environment variables are missing, which prevents tiller from
    functioning properly.

    See the comment at
    https://github.com/kubernetes/kubernetes/blob/v1.18.1/pkg/kubelet/kubelet_pods.go#L566

    This change patches the tiller deployment to make sure the four classes
    of environment variables are present prior to starting tiller. If any
    class of variables is not present in the environment, the container
    exits. Exiting causes the pod to be recreated with the correct
    environment populated for tiller to function.

    Since the upgrade to v1.18.1, this has been seen in simplex and duplex
    controller configurations.

    Review https://review.opendev.org/#/c/699307/ will cover patching during
    initial provisioning via ansible. This change will check that tiller is
    patched every time the conductor starts as part of the tiller upgrade
    logic. This will cover scenarios where tiller is manually removed from
    the cluster and reinstalled via helm.

    This change should be reverted once StarlingX moves to helm v3.

    Also removed dead code: get_k8s_secret()

    Change-Id: Icd199ec1b1e10840094c0eae59d53838f32ffd6f
    Closes-Bug: #1856078
    Signed-off-by: Robert Church <email address hidden>
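
The environment check described in the commit message can be sketched roughly as follows. This is a minimal illustration in Python, not the patch itself: the commit message does not name the four classes of variables, so the prefixes below are assumptions based on the service variables kubelet normally injects, and the helper names missing_env_classes() and startup_exit_code() are hypothetical.

```python
import os

# ASSUMPTION: the commit message does not list the four classes of
# variables. These prefixes are illustrative guesses only.
REQUIRED_PREFIXES = (
    "TILLER_DEPLOY",
    "KUBE_DNS",
    "KUBERNETES_PORT",
    "KUBERNETES_SERVICE",
)


def missing_env_classes(env, prefixes=REQUIRED_PREFIXES):
    """Return the prefixes for which no variable name in env matches."""
    return [
        prefix
        for prefix in prefixes
        if not any(name.startswith(prefix) for name in env)
    ]


def startup_exit_code(env=None, prefixes=REQUIRED_PREFIXES):
    """Return 0 if every class is present, 1 otherwise.

    A wrapper would exit with this code before launching tiller, so that
    a pod started with an incomplete environment is recreated.
    """
    if env is None:
        env = os.environ
    missing = missing_env_classes(env, prefixes)
    if missing:
        print("missing environment variable classes: " + ", ".join(missing))
        return 1
    return 0
```

A class counts as present as soon as one variable name starts with its prefix, so the check does not depend on the exact set of variables kubelet generates for each service.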