App Framework: Override generation error with FluxCD app

Bug #1970804 reported by Bob Church
This bug affects 1 person
Affects    Status        Importance  Assigned to  Milestone
StarlingX  Fix Released  Medium      Bob Church

Bug Description

Brief Description
-----------------
For a FluxCD application, a traceback occurs when overrides are re-evaluated after MTC actions, which prevents override generation:

sysinv 2022-04-28 01:15:32.075 135253 INFO sysinv.conductor.manager [-] Evaluating app reapply of cert-manager
sysinv 2022-04-28 01:15:32.130 135253 ERROR sysinv.helm.manifest_base [-] Manifest file /opt/platform/armada/22.02/cert-manager/1.0-34/cert-manager-fluxcd-manifests does not exist: NotFound: Resource could not be found.
sysinv 2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager [-] Failed to regenerate the overrides for app cert-manager. coercing to Unicode: need string or buffer, NoneType found: TypeError: coercing to Unicode: need string or buffer, NoneType found
2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager Traceback (most recent call last):
2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager File "/usr/lib64/python2.7/site-packages/sysinv/conductor/manager.py", line 13701, in evaluate_app_reapply
2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager armada_format=True, armada_chart_info=app.charts, combined=True)
2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager File "/usr/lib64/python2.7/site-packages/sysinv/helm/helm.py", line 58, in _wrapper
2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager return func(self, *args, **kwargs)
2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 328, in inner
2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager return f(*args, **kwargs)
2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager File "/usr/lib64/python2.7/site-packages/sysinv/helm/helm.py", line 873, in generate_helm_application_overrides
2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager manifest_op.save_overrides()
2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager File "/usr/lib64/python2.7/site-packages/sysinv/helm/manifest_base.py", line 216, in save_overrides
2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager if os.path.exists(self.manifest_path):
2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager File "/usr/lib64/python2.7/genericpath.py", line 18, in exists
2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager os.stat(path)
2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager TypeError: coercing to Unicode: need string or buffer, NoneType found
2022-04-28 01:15:32.301 135253 ERROR sysinv.conductor.manager
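
The traceback above ends in os.path.exists() being handed a manifest path of None. As a minimal sketch of the failure mode only (not the merged fix), a defensive check might look like the following. The helper name manifest_exists is hypothetical and does not appear in sysinv.

```python
import os


def manifest_exists(manifest_path):
    """Return True only when a manifest path is set and exists on disk.

    Hypothetical helper for illustration; it is not part of sysinv.
    """
    # A None path means the manifest location was never resolved, so
    # report "missing" instead of passing None on to os.path.exists().
    if manifest_path is None:
        return False
    return os.path.exists(manifest_path)


if __name__ == "__main__":
    print(manifest_exists(None))
    print(manifest_exists(os.getcwd()))
```

A guard like this would only hide the symptom. Per the fix below, the underlying problem is that the caller did not tell generate_helm_application_overrides() that the app was a FluxCD app.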

Severity
--------
Major: System/Feature is usable but degraded

Steps to Reproduce
------------------
- Install an AIO-DX system.
- Lock and unlock controller-1.
- Observe only 1 replica for pods that should run on both controllers.
- Observe the error in /var/log/sysinv.log.

Expected Behavior
------------------
FluxCD apps should behave the same as Armada apps: generate the overrides and evaluate whether a reapply is needed.

Actual Behavior
----------------
The traceback prevents the overrides from being generated.

Reproducibility
---------------
100% reproducible

System Configuration
--------------------
Duplex controller setup

Branch/Pull Time/Commit
-----------------------
Master branch

Last Pass
---------
Never

Timestamp/Logs
--------------
See description

Test Activity
-------------
Feature Testing

Workaround
----------
Remove and apply the app after MTC actions are complete

Bob Church (rchurch)
Changed in starlingx:
assignee: nobody → Bob Church (rchurch)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/config/+/839836

Changed in starlingx:
status: New → In Progress
Ghada Khalil (gkhalil) wrote :

screening: stx.7.0 / medium - issue related to new feature: https://storyboard.openstack.org/#!/story/2009138

Changed in starlingx:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/839836
Committed: https://opendev.org/starlingx/config/commit/af8954bbdc1553442abc1ea190035ec9eaabf79b
Submitter: "Zuul (22348)"
Branch: master

commit af8954bbdc1553442abc1ea190035ec9eaabf79b
Author: Robert Church <email address hidden>
Date: Thu Apr 28 17:29:07 2022 -0400

    FluxCD: Fixes to support app auto re-applies

    When the system is evaluating application overrides it needs to pass to
    generate_helm_application_overrides() if the application is a FluxCD or
    an Armada app.

    Commit 8ab0a835e introduced changes to support the FluxCD application
    framework and updated generate_helm_application_overrides(). This
    change was missed.

    In addition, when the application overrides are determined to have
    changed and an auto-reapply is needed a flag file is generated to signal
    that the app should be re-applied at the appropriate time.

    This will also change the location of the generated flag files to the
    Helm overrides base directory from the Armada base directory as this
    directory will only exist after the first Armada application is
    uploaded. Without this update, FluxCD apps are unable to auto re-apply
    without an Armada app being present.

    Test Plan:
     - Include change in a running AIO-DX CentOS system
     - Apply a FluxCD app
     - Lock/unlock controller-1 requiring override changes for replicas
     - Confirm no errors in sysinv log
     - Confirm the correct number of replicas for 1 and 2 unlocked
       controllers
     - Confirm auto apply
     - Perform the above with an Armada app present

    Change-Id: Iee797a654a171e5b24ce3b04e503ffc8dc6f870b
    Closes-Bug: #1970804
    Signed-off-by: Robert Church <email address hidden>
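
The flag-file mechanism described in the commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the sysinv implementation: the function names, the ".reapply_" file-name prefix, and the directory argument are all hypothetical. The one point taken from the commit message is that the flag file is written to a base directory that exists regardless of whether an Armada app has been uploaded.

```python
import os
import tempfile


def reapply_flag_path(overrides_base_dir, app_name):
    # Hypothetical naming scheme: one flag file per application.
    return os.path.join(overrides_base_dir, ".reapply_" + app_name)


def mark_for_reapply(overrides_base_dir, app_name):
    """Create an empty flag file signalling that the app needs a re-apply."""
    flag = reapply_flag_path(overrides_base_dir, app_name)
    # This fails if overrides_base_dir does not exist, which is why the
    # base directory must not depend on another app having been uploaded.
    with open(flag, "w"):
        pass
    return flag


def needs_reapply(overrides_base_dir, app_name):
    """Return True if a re-apply flag file is present for the app."""
    return os.path.isfile(reapply_flag_path(overrides_base_dir, app_name))


if __name__ == "__main__":
    base = tempfile.mkdtemp()  # stand-in for the Helm overrides base directory
    mark_for_reapply(base, "cert-manager")
    print(needs_reapply(base, "cert-manager"))
```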

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
tags: added: stx.7.0