Fix for kubelet failure on deployment with insecure registries

Bug #2072535 reported by Rakshith M R
This bug affects 1 person
Affects    Status        Importance  Assigned to   Milestone
StarlingX  Fix Released  Medium      Rakshith M R

Bug Description

Brief Description
-----------------
Labs configured with insecure registries did not get the correct
containerd config format after the containerd v1.5 code deprecation
fix was introduced.

While deploying a system with insecure registries, i.e., when the
flag "secure" is set to False in the deployment yml, a critical
kubelet failure was observed.
The code changes should follow
https://github.com/containerd/containerd/blob/main/docs/hosts.md
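For illustration only, here is a minimal Python sketch of what a per-registry hosts.toml could look like in the format described by hosts.md. It is not the stx-puppet change itself. The function names, the example registry "registry.example.com:5000", the "/etc/containerd/certs.d" directory (the value commonly used for the CRI plugin's config_path), and the mapping of "secure: False" to skip_verify = true are all assumptions made for this sketch.

```python
from pathlib import PurePosixPath

# Assumed value of the CRI plugin's registry config_path; not taken from the bug.
CERTS_D = PurePosixPath("/etc/containerd/certs.d")


def render_hosts_toml(registry: str, secure: bool) -> str:
    """Return hosts.toml text for one registry given as host[:port].

    Assumption: an insecure registry is still reached over HTTPS, but with
    certificate verification disabled (skip_verify = true).
    """
    url = f"https://{registry}"
    lines = [
        f'server = "{url}"',
        "",
        f'[host."{url}"]',
        '  capabilities = ["pull", "resolve"]',
    ]
    if not secure:
        # Field documented in hosts.md: skip registry certificate verification.
        lines.append("  skip_verify = true")
    return "\n".join(lines) + "\n"


def hosts_toml_path(registry: str) -> str:
    """Return the path where containerd would look up this registry's hosts.toml."""
    return str(CERTS_D / registry / "hosts.toml")


if __name__ == "__main__":
    name = "registry.example.com:5000"  # hypothetical registry
    print(hosts_toml_path(name))
    print(render_hosts_toml(name, secure=False))
```

A registry served over plain HTTP would instead need an "http://" URL in the host entry; which of the two the deployment's "secure" flag implies is not stated in this report.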

Severity
--------
Major: System/Feature is usable but degraded

Steps to Reproduce
------------------
1. Install AIO-SX
   Build "2024-06-14_19-00-14"
   VDM updated on master branch (last commit: 51f8985ecb)
   Using dm_dir './vdm/files/deployment-manager/latest/'
2. After unlocking controller-0, check the alarms and try Kubernetes commands

Expected Behavior
-----------------
No Kubernetes alarms and system working.

Actual Behavior
---------------
Critical kubelet failure on AIO-SX.

Reproducibility
---------------
Reproducible

System Configuration
--------------------
AIO-SX.

Alarms
-------
[sysadmin@controller-0 ~(keystone_admin)]$ fm alarm-list
+----------+------------------------------------------------+---------------+----------+-------------+
| Alarm ID | Reason Text                                    | Entity ID     | Severity | Time Stamp  |
+----------+------------------------------------------------+---------------+----------+-------------+
| 250.001  | controller-0 Configuration is out-of-date.     | host=         | major    | 2024-06-17T |
|          | (applied: ede4a670-a468-4d4c-963e-2276e8476c49 | controller-0  |          | 04:58:47.   |
|          | target: 33b2eb3c-b15f-48a2-bb17-682e8d34ef98)  |               |          | 673196      |
|          |                                                |               |          |             |
| 200.004  | controller-0 experienced a service-affecting   | host=         | critical | 2024-06-17T |
|          | failure. Auto-recovery in progress. Manual     | controller-0  |          | 04:56:53.   |
|          | Lock and Unlock may be required if auto-       |               |          | 549692      |
|          | recovery is unsuccessful.                      |               |          |             |
|          |                                                |               |          |             |
| 200.006  | controller-0 critical 'kubelet' process has    | host=         | critical | 2024-06-17T |
|          | failed and could not be auto-recovered         | controller-0. |          | 04:56:53.   |
|          | gracefully. Auto-recovery progression by host  | process=      |          | 525962      |
|          | reboot is required and in progress. Manual     | kubelet       |          |             |
|          | Lock and Unlock may be required if auto-       |               |          |             |
|          | recovery is unsuccessful.                      |               |          |             |
|          |                                                |               |          |             |
+----------+------------------------------------------------+---------------+----------+-------------+

Test Activity
---------------
Developer Testing

Changed in starlingx:
assignee: nobody → Rakshith M R (rakshith-mr)
Changed in starlingx:
status: New → In Progress
description: updated
description: updated
summary: - Critical kubelet failure on AIO-SX installed with VDM
+ Fix for kubelet failure on deployment with insecure registries
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (master)

Reviewed: https://review.opendev.org/c/starlingx/stx-puppet/+/923627
Committed: https://opendev.org/starlingx/stx-puppet/commit/cb7b963bf364786d0630277fa94322a63509068d
Submitter: "Zuul (22348)"
Branch: master

commit cb7b963bf364786d0630277fa94322a63509068d
Author: rakshith mr <email address hidden>
Date: Mon Jul 8 06:10:40 2024 -0400

    Fix for kubelet failure on deployment with insecure registries

    Containerd v1.5 code deprecation fix extended for insecure
    registries.
    While deploying system with insecure registries, i.e., when the
    flag "secure" is set to False in deployment yml. It is observed
    that there was critical kubelet failure. This is to fix that
    issue and code changes made according to
    https://github.com/containerd/containerd/blob/main/docs/hosts.md

    Test Plan:
    PASS: Deploy AIO-SX and verify containerd config file format is
          as expected.
    PASS: Deploy AIO-SX with insecure registries and verify containerd
          config file format is as expected.
    PASS: k8s upgrade from 1.27.5 to 1.28.4 in AIO-SX
    PASS: Multi-version k8s upgrade from 1.26.1 to 1.28.4 in AIO-SX
    PASS: Kubernetes version upgrade from v1.28.4 to v1.29.2 using
          Cloud Orchestration Strategy.
    PASS: Installing on a subcloud in a DC environment.

    Closes-bug: 2072535

    Change-Id: If884a6710991dd243c023a93ea91e92122173a26
    Signed-off-by: rakshith mr <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
tags: added: stx.10.0 stx.containers