Kube-host-upgrade control-plane fails due to missing images when the image cache is cleared

Bug #2007616 reported by João Victor Portal
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: João Victor Portal
Milestone: (none)

Bug Description

Brief Description
-----------------
During a Kubernetes version upgrade, if the image cache is cleared on both controllers before running either "system kube-host-upgrade controller-0 control-plane" or "system kube-host-upgrade controller-1 control-plane", the upgrade transitions to a failed state because the needed images cannot be downloaded from the local registry.

Severity
--------
Major.

Steps to Reproduce
------------------
While following the guide at https://docs.starlingx.io/updates/kubernetes/manual-kubernetes-components-upgrade.html, execute "crictl rmi --prune" on both controllers before running either "system kube-host-upgrade controller-0 control-plane" or "system kube-host-upgrade controller-1 control-plane".

Expected Behavior
------------------
The control-plane phases are successfully completed.

Actual Behavior
----------------
The upgrade process transitions to a failed state.

Reproducibility
---------------
100% Reproducible.

System Configuration
--------------------
Any.

Branch/Pull Time/Commit
-----------------------
N/A.

Last Pass
---------
N/A.

Timestamp/Logs
--------------
N/A.

Test Activity
-------------
Feature Testing.

Workaround
----------
Manually pull the needed images with the command "crictl pull --creds <local_registry_user>:<local_registry_password> <image_name>".
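
When several images are missing, the workaround can be wrapped in a small loop. The following is only a sketch: the function name pull_images and the CRICTL override variable are hypothetical helpers introduced here for illustration, and the image names must be supplied by the operator.

```shell
#!/bin/sh
# Sketch: pull a list of images into the local container image cache.
# CRICTL is a hypothetical override; set it to "echo" for a dry run.
CRICTL="${CRICTL:-crictl}"

# pull_images CREDS IMAGE...
# CREDS has the form <local_registry_user>:<local_registry_password>.
# Returns non-zero if at least one pull fails.
pull_images() {
    creds="$1"
    shift
    failed=0
    for image in "$@"; do
        if ! $CRICTL pull --creds "$creds" "$image"; then
            echo "failed to pull $image" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

It would be called as pull_images "<local_registry_user>:<local_registry_password>" <image_name> [<image_name> ...] on each controller before retrying the control-plane step.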

Changed in starlingx:
assignee: nobody → João Victor Portal (jvictorp)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/stx-puppet/+/874168

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (master)

Reviewed: https://review.opendev.org/c/starlingx/stx-puppet/+/874168
Committed: https://opendev.org/starlingx/stx-puppet/commit/423278c4a277bb6c5f148aefd0853e739d4005d9
Submitter: "Zuul (22348)"
Branch: master

commit 423278c4a277bb6c5f148aefd0853e739d4005d9
Author: Joao Victor Portal <email address hidden>
Date: Thu Feb 16 19:20:35 2023 -0300

    Assert needed images in cache during K8s upgrade

    During the Kubernetes upgrade process, if the image cache is cleared on
    controllers before 'system kube-host-upgrade controller-0 control-plane'
    or 'system kube-host-upgrade controller-1 control-plane', the upgrade
    process transitions to an error state because the images cannot be
    downloaded from the local registry during the execution of 'kubeadm
    upgrade apply' or 'kubeadm upgrade node'. This change ensures that the
    needed images are in the local cache just before the execution of these
    kubeadm commands.

    Test Plan:

    PASS: In a Standard deploy with 2 controllers and 1 worker, successfully
    complete the upgrade process from Kubernetes v1.23.1 to v1.24.4
    executing "crictl rmi --prune" on both controllers before the commands
    "system kube-host-upgrade controller-0 control-plane" and
    "system kube-host-upgrade controller-1 control-plane".

    Closes-Bug: 2007616
    Signed-off-by: Joao Victor Portal <email address hidden>
    Change-Id: I6a15e49c74e2ae91b6d5ddebfd7cb9057740b1af
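
The idea described in the commit message (confirm the needed images are cached just before kubeadm runs) can be illustrated with a short check. This is not the merged stx-puppet change, only a sketch under stated assumptions: the function name missing_images is hypothetical, and both input files are assumed to hold one image reference per line.

```shell
#!/bin/sh
# Sketch: report which required images are absent from the local cache.

# missing_images REQUIRED_FILE CACHED_FILE
# Prints every line of REQUIRED_FILE that has no exact, whole-line
# match in CACHED_FILE. Prints nothing when every image is cached.
missing_images() {
    grep -F -x -v -f "$2" "$1" || true
}
```

The cached list could, for example, be produced from the "crictl images" table by joining its IMAGE and TAG columns. How the required list is obtained is left open here, since the source does not describe it. Any reference the function prints would then be pulled as in the workaround.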

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.9.0 stx.update