User-overrides are not applied to initially disabled charts

Bug #2029303 reported by Joshua Reed
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
StarlingX
Fix Released
Low
Joshua Reed

Bug Description

Brief Description
-----------------

During app-dell-storage development we noticed that user-overrides were not being applied to charts that were initially disabled.

Upon further analysis, we found that the problem was not with the user-overrides usages,
but the kube_app (sysinv-conductor) using an outdated cache to generate combined system-overrides.

Severity
-----------------

Minor

Steps to Reproduce
-----------------

system application-upload dell-storage-1.0-*.tgz
system helm-chart-attribute-modify --enabled true dell-storage csm-replication dell-storage
system helm-override-update dell-storage csm-replication dell-storage --set config.clusterId=ClusterA
system application-apply dell-storage

Expected Behavior
-----------------

User-override applied (clusterId = ClusterA).

kubectl get -n dell-storage configmap/dell-replication-controller-config -o yaml

apiVersion: v1
data:
  config.yaml: |
    clusterId: ClusterA
    targets: []
    CSI_LOG_LEVEL: info
kind: ConfigMap
...

Actual Behavior
-----------------

kubectl get -n dell-storage configmap/dell-replication-controller-config -o yaml

apiVersion: v1
data:
  config.yaml: "clusterId: \ntargets: []\nCSI_LOG_LEVEL: info\n"
kind: ConfigMap
...

Reproducibility
-----------------

Reproducible

System Configuration
-----------------

Any

Last Pass

N/A

Timestamp/Logs

N/A

Alarms

N/A

Test Activity

Developer Testing

Workaround

system application-remove dell-storage
system application-apply dell-storage

Note: This app is under development, so its images is not in private registries. For testing is needed to use public registries. You can use the command below to delete them:

system service-parameter-list| grep '\-registry'| awk '{ print $2} '|xargs -L 1 system service-parameter-delete

Joshua Reed (jreed7)
Changed in starlingx:
assignee: nobody → Joshua Reed (jreed7)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/config/+/890570

Changed in starlingx:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on config (master)

Change abandoned by "Joshua Reed <email address hidden>" on branch: master
Review: https://review.opendev.org/c/starlingx/config/+/890250
Reason: Superseded by: https://review.opendev.org/c/starlingx/config/+/890570

Revision history for this message
Joshua Reed (jreed7) wrote :

Please note, chant is not abandoned. I simply had to redo the code review, and the change is here:

https://review.opendev.org/c/starlingx/config/+/890570

Change-Id: Ia737322100159d342a4ac4f2471bb3f0c18adafb

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/890570
Committed: https://opendev.org/starlingx/config/commit/612e70f4e20498a2f6ea528a6aa1072840287477
Submitter: "Zuul (22348)"
Branch: master

commit 612e70f4e20498a2f6ea528a6aa1072840287477
Author: Joshua Reed <email address hidden>
Date: Fri Aug 4 11:15:27 2023 -0700

    Better control over chart list generation.

    The previous implementation of the _get_list_of_charts
    method would not take into account whether or not a
    particular application chart was enabled or disabled.
    Instead the logic would rely on the resources present in
    the top-level application kustomization.yaml manifest to
    determine what FluxCD charts are enabled. As this manifest
    is changed dynamically based on the the application plugin
    use of the FluxCDKustomizeOperator, the resulting top-level
    application kustomization.yaml could change from one
    overrides generation call to the next.

    This fix ensures that:

    1. When all charts are needed then an option can be specified
       (i.e. when determining all the container images needed for
       the application)
    3. All possible charts, as filtered by the metadata/user
       driven and DB stored enabled status, are consistently
       returned regardless of the current state of the top-level
       application kustomization.yaml.

    Test Plan:
    PASS: build-pkgs -a && build-image
    PASS: AIO-SX full install with clean bootstrap
    PASS: Enable the cms-replication chart on the dell-storage app
    PASS: Use system helm-override-update to pass
          --set config.clusterID=ClusterA
    PASS: system application-apply dell-storage
    PASS: Check the YAML structure of the
          configmap/dell-replication-controller-config
          for ClusterA and properly formatted.

    Closes-Bug: 2029303

    Change-Id: Ia737322100159d342a4ac4f2471bb3f0c18adafb
    Signed-off-by: Joshua Reed <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Low
tags: added: stx.9.0 stx.apps
Revision history for this message
Joshua Reed (jreed7) wrote :
Changed in starlingx:
status: Fix Released → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/config/+/915377

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)
Download full text (3.2 KiB)

Reviewed: https://review.opendev.org/c/starlingx/config/+/915377
Committed: https://opendev.org/starlingx/config/commit/967eedadb73b93a4098975663b3b221d83b4d5ce
Submitter: "Zuul (22348)"
Branch: master

commit 967eedadb73b93a4098975663b3b221d83b4d5ce
Author: Joshua Reed <email address hidden>
Date: Tue Apr 9 14:11:59 2024 -0600

    Apply Helm Overrides to initially disabled charts.

    The previous implementation of the _get_list_of_charts
    method would not take into account whether or not a
    particular application chart was enabled or disabled.

    This change now only includes charts that are enabled, or
    if the function caller asks for all of them with the
    include_disabled override set. The override is set as a
    part of the perform_app_upload routine to ensure overrides
    are generated and applied to all charts, including those
    which are initially disabled.

    This change also seeks to handle issues where the
    kustomize-orig.yaml file is not created by the time the
    perform_app_upload routine runs _get_list_of_charts by
    including an extra check.

    Finally, the override generation in the perform_app_apply
    function is moved to happen first in the sequence of events
    such that the app object is populated with overrides prior
    to any other operations occuring. This must be done to
    ensure the correct chart list is used.

    This fix ensures that:

    1. When all charts are needed then an option can be specified
       (i.e. when determining all the container images needed for
       the application) This is done with include_disabled flag.
    2. All possible charts, as filtered by the metadata/user
       driven and DB stored enabled status, are consistently
       returned regardless of the current state of the top-level
       application kustomization.yaml.
    3. A final check for kustomization-orig.yaml is performed and
       the file is created, if missing, before
       _get_list_of_charts executes with include_disabled=True

    Test Plan:
    PASS: build-pkgs -a && build-image
    PASS: AIO-SX full install with clean bootstrap
    PASS: Enable the cms-replication chart on the dell-storage app
    PASS: Use system helm-override-update to pass
          --set config.clusterID=ClusterA
    PASS: system application-apply dell-storage
    PASS: Check the YAML structure of the
          configmap/dell-replication-controller-config
          for ClusterA and properly formatted.
    PASS: Additional check to ensure that stx-openstack application
          successfully uploads and applies.
    PASS: Check that a helm override are generated even for an
          application that doesn't have a kustomize operator. This
          was done for the metrics-server app. A helm override was
          created and the subsequent metrics-server.yaml file in
          /opt/platform/helm contained the override after the
          system applciation-apply command was run.

    Relates to previous attempt at a fix:
    https://review.opendev.org/c/starlingx/config/+/890570

    Closes-Bug: 2029303

    Change-Id: I4c501b982e4061e5067ca0e8e43f37a9ee...

Read more...

Changed in starlingx:
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.