250.001 alarm is not cleared when service-parameter 'audit-policy-file' is added

Bug #2043534 reported by Jorge Saffe
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Jorge Saffe

Bug Description

Brief Description
--------------------
250.001 alarm is not cleared when service-parameter 'audit-policy-file' is added

Severity
---------
Standard

Steps to Reproduce
-------------------
Create service-parameter audit-policy-file

system service-parameter-add kubernetes kube_apiserver audit-policy-file="/etc/kubernetes/default-audit-policy.yaml"

system service-parameter-apply kubernetes

Expected Behavior
-----------------
250.001 alarms cleared

Actual Behavior
---------------
The configuration does not finish applying before the timeout expires, so the 250.001 alarm remains active.

Reproducibility
----------------
Intermittent
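
The expected and actual behavior above amount to a wait-with-timeout check on the 250.001 alarm. A minimal sketch of such a check follows, for anyone scripting the reproduction. The function name `wait_for_alarm_clear` and the `list_alarm_ids` callback are hypothetical and not part of any StarlingX tool; the callback is assumed to return the IDs of the currently active alarms, however the caller obtains them.

```python
import time


def wait_for_alarm_clear(list_alarm_ids, alarm_id="250.001", timeout=300.0,
                         interval=5.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until alarm_id is no longer active or the timeout expires.

    list_alarm_ids is a caller-supplied (hypothetical) callable returning an
    iterable of active alarm IDs. Returns True if the alarm cleared in time,
    False if the timeout was reached first.
    """
    deadline = clock() + timeout
    while True:
        if alarm_id not in list_alarm_ids():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Because the bug is intermittent, running a check like this in a loop over several add/apply cycles is more likely to expose it than a single attempt.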

OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/config/+/900976

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/900976
Committed: https://opendev.org/starlingx/config/commit/61c5dbd8209c2f17d98d17776c590fc4580a758f
Submitter: "Zuul (22348)"
Branch: master

commit 61c5dbd8209c2f17d98d17776c590fc4580a758f
Author: Jorge Saffe <email address hidden>
Date: Tue Nov 14 23:39:36 2023 -0500

    IP customization for kube_operator

    By default, kube_operator is configured to utilize the
    floating IP address.

    The introduced modification adds the capability to configure
    kube_operator with a specific IP address.

    This is necessary in order to be able to operate independently
    with the kube_apiserver of both controllers.

    Test Plan:
      PASS
        - Initialize kube_operators on both controllers with their
          respective IP addresses.
        - Run a kube_operator method on both controllers
          (e.g. kube_get_node_status).
        - Verify that a successful response is received on both
          controllers.
        - Stop kube-apiserver on controller-0.
        - Run a kube_operator method on both controllers.
        - Verify the occurrence of the connection error (Errno 111) on
          controller-0.
        - Verify response successfully received on controller-1.
        - Restart kube-apiserver on controller-0.
        - Run a kube_operator method on controller-0.
        - Verify that a successful response is received on controller-0.

      PASS
        - Add or Modify k8s service-parameter.
        - Apply changes.
        - Verify parameter/value has been successfully applied.
        - Perform host-swact.
        - Add or Modify k8s service-parameter.
        - Apply changes.
        - Verify parameter/value has been successfully applied.

    Partial-Bug: 2043534
    Signed-off-by: Jorge Saffe <email address hidden>
    Change-Id: I08063eea5eb79cfc77d58445379907601effe28c
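
The commit above describes letting kube_operator target a specific address instead of the floating IP. The following is only an illustrative sketch of that selection logic, not the code from the actual change: the function `apiserver_url`, its parameters, and the port value 6443 are assumptions introduced here for illustration.

```python
import ipaddress

# Assumption: the API server listens on the conventional secure port.
DEFAULT_APISERVER_PORT = 6443


def apiserver_url(floating_host, host=None, port=DEFAULT_APISERVER_PORT):
    """Build the API server URL, preferring an explicit host if given.

    With host=None the floating address is used (the default behavior the
    commit describes); passing host selects one controller's own API server.
    """
    target = host if host is not None else floating_host
    try:
        # IPv6 literals must be wrapped in brackets inside a URL.
        if ipaddress.ip_address(target).version == 6:
            target = f"[{target}]"
    except ValueError:
        pass  # Not an IP literal, so treat it as a hostname.
    return f"https://{target}:{port}"
```

Pointing each controller at its own API server is what makes the first test plan possible: stopping kube-apiserver on controller-0 then fails only the operator bound to controller-0.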

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-puppet (master)

Reviewed: https://review.opendev.org/c/starlingx/stx-puppet/+/899436
Committed: https://opendev.org/starlingx/stx-puppet/commit/105a7508253e1657c60a742eb5ec8945eaa78771
Submitter: "Zuul (22348)"
Branch: master

commit 105a7508253e1657c60a742eb5ec8945eaa78771
Author: Jorge Saffe <email address hidden>
Date: Tue Nov 21 18:55:32 2023 -0500

    Enhance configmap patching and recovery workflow

    The existing method for patching the kubeadm-config
    configmap via the Kubernetes CLI is experiencing issues
    due to concurrent processes updating it
    simultaneously.

    The code has been updated to utilize the Kubernetes API for
    performing the "kubeadm-config" configmap patch operation once
    the configuration parameters are modified.

    This approach avoids issues associated with the etcd versioning
    mechanism, thereby enhancing the reliability of the script
    execution. However, if a simultaneous update occurs after
    retrieving the data and before updating it, this mechanism will
    result in the loss of that subsequent update. To resolve these
    concurrency issues, it is necessary to implement a comprehensive
    solution that encompasses all scripts with access to the
    "kubeadm-config" configmap.
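
The lost-update risk described in the paragraph above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the stx-puppet implementation: `VersionedStore` and `ConflictError` are hypothetical names, and the integer version only loosely stands in for the version check that etcd-backed objects use.

```python
class ConflictError(Exception):
    """Raised when a write is based on a stale read."""


class VersionedStore:
    """Toy stand-in for a configmap that carries a version counter."""

    def __init__(self, data):
        self._data = dict(data)
        self._version = 1

    def read(self):
        # Return a copy so callers modify their own snapshot.
        return dict(self._data), self._version

    def overwrite(self, data):
        # Unconditional write: never fails, but may discard a newer update.
        self._data = dict(data)
        self._version += 1

    def compare_and_swap(self, data, expected_version):
        # Versioned write: rejected if someone else wrote since our read.
        if expected_version != self._version:
            raise ConflictError("store changed since it was read")
        self.overwrite(data)
```

If two writers read the same snapshot and both call `overwrite`, the second write silently drops the first writer's change. With `compare_and_swap` the second writer instead gets a `ConflictError` and has to re-read and retry, which is the kind of coordination the commit says would be needed across all scripts.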

    By default, kube_operator is set up to use a floating IP
    address. The introduced modification allows for configuring
    kube_operator with the localhost IP address. This change is
    required to enable independent operation with the
    kube_apiserver of both controllers.

    In addition, changes have been implemented to ensure that only
    the active controller updates common etcd resources such as the
    kubeadm-config configmap.

    The operating logic has been modified as follows:
    * The current configmap is loaded.
    * Cluster configuration information is extracted from etcd.
    * Configuration information from service-parameters is extracted.
    * Modified parameters/values related to cluster_config are updated.
    * An on-disk copy of the updated cluster_config is made with the
      structure expected by kubeadm (avoiding the use of etcd).
    * All control-plane components are updated.
    * The backup copy of cluster_config is updated from current changes.
    * The configmap patch is performed only by the active controller.
      Disk backups of the kubeadm-config configmap are not performed.
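
The merge and active-controller steps in the list above can be sketched roughly as follows. This is an illustrative assumption-laden outline rather than the actual script: the function names, the `patch_configmap` callback, and the `apiServer.extraArgs` map layout of the cluster configuration are assumptions made for this example.

```python
import copy


def merge_apiserver_args(cluster_config, service_params):
    """Return (updated_config, changed) without mutating the input.

    Assumes kube-apiserver arguments live in a map at
    cluster_config["apiServer"]["extraArgs"].
    """
    updated = copy.deepcopy(cluster_config)
    extra_args = updated.setdefault("apiServer", {}).setdefault("extraArgs", {})
    changed = False
    for name, value in service_params.items():
        if extra_args.get(name) != value:
            extra_args[name] = value
            changed = True
    return updated, changed


def apply_runtime_config(cluster_config, service_params,
                         is_active_controller, patch_configmap):
    """Merge parameters; only the active controller patches the configmap."""
    updated, changed = merge_apiserver_args(cluster_config, service_params)
    if changed and is_active_controller:
        # patch_configmap is a hypothetical callback that writes the
        # shared kubeadm-config configmap.
        patch_configmap(updated)
    return updated
```

In this sketch both controllers compute the same updated configuration for their local control-plane components, while the write to the shared resource happens from one place only.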

    The restore procedure has also been altered. During this process
    the kubeadm-config configmap will not be restored; only the
    k8s control-plane components will be restored.

    Test Plan:
      PASS Backup & Restore SX Env
      PASS Backup & Restore DX Env

      PASS Runtime k8s service-parameter changes:
       - Add or Modify a valid kubernetes service-parameter.
       - Apply the changes.
       - Ensure the changes are successfully applied.
       - Confirm the changes on modified control-plane
         components of both controllers.
       - Review the state and changes of the kubeadm-config configmap.
       - Verify changes on cluster_config backup file.

      PASS Runtime restore of k8s control-plane components:
       -...


Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
tags: added: stx.9.0 stx.config stx.containers
Changed in starlingx:
importance: Undecided → Medium
assignee: nobody → Jorge Saffe (jsaffe)