Comment 3 for bug 2043534

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (master)

Reviewed: https://review.opendev.org/c/starlingx/stx-puppet/+/899436
Committed: https://opendev.org/starlingx/stx-puppet/commit/105a7508253e1657c60a742eb5ec8945eaa78771
Submitter: "Zuul (22348)"
Branch: master

commit 105a7508253e1657c60a742eb5ec8945eaa78771
Author: Jorge Saffe <email address hidden>
Date: Tue Nov 21 18:55:32 2023 -0500

    Enhance configmap patching and recovery workflow

    Patching the kubeadm-config configmap through the Kubernetes
    CLI is unreliable when other processes update the configmap
    concurrently.

    The code has been updated to utilize the Kubernetes API for
    performing the "kubeadm-config" configmap patch operation once
    the configuration parameters are modified.

    This approach avoids issues associated with the etcd versioning
    mechanism, making script execution more reliable. However, if
    another process updates the configmap after the data is retrieved
    and before it is written back, that update is lost. Fully
    resolving these concurrency issues requires a comprehensive
    solution covering all scripts that access the "kubeadm-config"
    configmap.
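
The lost-update window described above can be illustrated with a small in-memory simulation. This is only a sketch: `VersionedStore`, `Conflict`, `demo_lost_update` and `update_with_retry` are hypothetical names, not code from this change. The version-checked retry loop shows one common way to close the window; it is not what this commit implements.

```python
class Conflict(Exception):
    """Raised when a write carries a stale version number."""


class VersionedStore:
    """Hypothetical in-memory stand-in for a single configmap."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = 1

    def read(self):
        # Return a copy so callers cannot mutate the stored data directly.
        return dict(self.data), self.version

    def write(self, data, expected_version=None):
        # With expected_version set, reject writes based on stale reads.
        if expected_version is not None and expected_version != self.version:
            raise Conflict(
                "expected version %s, found %s" % (expected_version, self.version)
            )
        self.data = dict(data)
        self.version += 1


def demo_lost_update():
    """Two writers read the same state; the unconditional write wins."""
    store = VersionedStore({"a": "1"})
    mine, _ = store.read()
    theirs, _ = store.read()
    theirs["b"] = "2"
    store.write(theirs)   # concurrent writer lands first
    mine["a"] = "9"
    store.write(mine)     # unconditional write discards key "b"
    return store.data


def update_with_retry(store, mutate, attempts=3):
    """Read, mutate and write back, retrying when the version has moved."""
    for _ in range(attempts):
        data, version = store.read()
        mutate(data)
        try:
            store.write(data, expected_version=version)
            return True
        except Conflict:
            continue
    return False
```

In `demo_lost_update` the second write is based on data read before the first write, so the first writer's key disappears. `update_with_retry` detects the stale read and repeats the cycle against fresh data.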

    By default, kube_operator is set up to use a floating IP
    address. The introduced modification allows kube_operator to be
    configured with the localhost IP address instead. This change is
    required so that each controller can operate independently
    against its own kube_apiserver.
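
As a rough illustration of the host selection described above, the sketch below builds an API server URL from either the floating address or localhost. The function name `api_server_url`, its parameters and the default port 6443 are assumptions made for this example; the actual kube_operator interface is not shown in this commit message.

```python
def api_server_url(use_localhost, floating_ip, port=6443):
    """Build a kube-apiserver URL (hypothetical helper, not kube_operator's API).

    use_localhost: when True, target the apiserver on the local controller.
    floating_ip:   address that follows the active controller.
    port:          assumed apiserver port (6443 is the usual default).
    """
    host = "127.0.0.1" if use_localhost else floating_ip
    # IPv6 literals must be wrapped in brackets inside a URL.
    if ":" in host:
        host = "[%s]" % host
    return "https://%s:%d" % (host, port)
```

With `use_localhost=True` each controller talks to its own apiserver, regardless of which controller currently holds the floating address.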

    In addition, changes have been implemented to ensure that only
    the active controller updates common etcd resources such as the
    kubeadm-config configmap.

    The operating logic has been modified as follows:
    * The current configmap is loaded.
    * Cluster configuration information is extracted from etcd.
    * Configuration information from service-parameters is extracted.
    * Modified parameters/values related to cluster_config are updated.
    * An on-disk copy of the updated cluster_config is made with the
      structure expected by kubeadm (avoiding the use of etcd).
    * All control-plane components are updated.
    * The backup copy of cluster_config is updated from current changes.
    * The configmap patch is performed only by the active controller.
      Disk backups of the kubeadm-config configmap are not performed.
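
The steps above can be sketched roughly as follows. This is a simplified illustration, not the script from this change: all function names are hypothetical, the on-disk copy is written as JSON to stay within the standard library (kubeadm itself expects YAML), and the control-plane update is reduced to a comment.

```python
import copy
import json
import os
import tempfile


def merge_service_parameters(cluster_config, service_params):
    """Return a copy of cluster_config with per-component extraArgs updated."""
    updated = copy.deepcopy(cluster_config)
    for component, args in service_params.items():
        section = updated.setdefault(component, {})
        section.setdefault("extraArgs", {}).update(args)
    return updated


def write_on_disk_copy(cluster_config, path):
    """Persist the updated configuration so later steps avoid reading etcd."""
    with open(path, "w") as handle:
        json.dump(cluster_config, handle, indent=2, sort_keys=True)


def apply_changes(cluster_config, service_params, path, is_active, patch_configmap):
    """Merge parameters, write the on-disk copy, patch only when active."""
    updated = merge_service_parameters(cluster_config, service_params)
    write_on_disk_copy(updated, path)
    # The control-plane components would be updated here from the file
    # at `path`; that step is omitted from this sketch.
    if is_active:
        # Only the active controller touches the shared configmap.
        patch_configmap(updated)
    return updated
```

Passing the patch operation in as a callable keeps the sketch independent of any particular Kubernetes client, and makes the "active controller only" rule easy to check in isolation.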

    The restore procedure has also been altered. During this process
    the kubeadm-config configmap is not restored; only the k8s
    control-plane components are restored.

    Test Plan:
      PASS Backup & Restore SX Env
      PASS Backup & Restore DX Env

      PASS Runtime k8s service-parameter changes:
       - Add or modify a valid kubernetes service-parameter.
       - Apply the changes.
       - Ensure the changes are successfully applied.
       - Confirm the changes on modified control-plane
         components of both controllers.
       - Review the state and changes of the kubeadm-config configmap.
       - Verify changes on cluster_config backup file.

      PASS Runtime restore of k8s control-plane components:
       - Add a new invalid kubernetes service-parameter or modify
         a valid parameter with an invalid value.
       - Apply changes.
       - Verify the updating process fails.
       - Ensure the restore process completes successfully.
       - Verify controller-0 retains the latest valid changes.
       - Verify controller-1 retains the latest valid changes.
       - Verify configmap retains the latest valid changes.
       - Ensure that the “250.001 - Configuration is out-of-date”
         alarm is active.

      PASS kubeadm-config configmap updated only from active controller.
       - Add or modify a valid kubernetes service-parameter.
       - Apply the changes.
       - Ensure the changes are successfully applied.
       - Confirm the changes on modified control-plane
         components of both controllers.
       - Review the state and changes of the kubeadm-config configmap.
       - Verify changes on cluster_config backup file.
       - Verify the kubeadm-config configmap is updated only by the
         active controller.

       - Swact active controller.
       - Add or modify a valid kubernetes service-parameter.
       - Apply the changes.
       - Ensure the changes are successfully applied.
       - Confirm the changes on modified control-plane
         components of both controllers.
       - Review the state and changes of the kubeadm-config configmap.
       - Verify changes on cluster_config backup file.
       - Verify the kubeadm-config configmap is updated only by the
         active controller.

    Closes-Bug: 2043534
    Depends-on: https://review.opendev.org/c/starlingx/config/+/900976

    Signed-off-by: Jorge Saffe <email address hidden>
    Change-Id: I4e232e26a6c9c6cb49a2d1c5c3ab893491787035