config-out-of-date on all nodes after removing stx-openstack

Bug #1884408 reported by Yang Liu
This bug affects 1 person
Affects     Status         Importance   Assigned to   Milestone
StarlingX   Fix Released   High         Bob Church

Bug Description

Brief Description
-----------------
After removing the stx-openstack application, config out-of-date alarms appeared on all nodes. The following exception is seen in the sysinv log:

/var/log/sysinv.log:sysinv 2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp [-] Exception during message handling: AttributeError: 'NoneType' object has no attribute 'HELM_NS_HELM_TOOLKIT'
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp Traceback (most recent call last):
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/openstack/common/rpc/amqp.py", line 437, in _process_data
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp **args)
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/openstack/common/rpc/dispatcher.py", line 172, in dispatch
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp result = getattr(proxyobj, method)(ctxt, **kwargs)
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/conductor/manager.py", line 11284, in perform_app_remove
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp self._update_vim_config(context)
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/conductor/manager.py", line 7037, in _update_vim_config
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp config_dict)
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/conductor/manager.py", line 8809, in _config_apply_runtime_manifest
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp self.evaluate_app_reapply(context, app_name)
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/conductor/manager.py", line 11190, in evaluate_app_reapply
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp armada_format=True, armada_chart_info=app.charts, combined=True)
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/helm/helm.py", line 49, in _wrapper
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp return func(self, *args, **kwargs)
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/helm/helm.py", line 643, in generate_helm_application_overrides
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp cnamespace)
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/helm/helm.py", line 406, in _get_helm_application_overrides
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp cnamespace)})
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/helm/helm.py", line 311, in _get_helm_chart_overrides
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp cnamespace))
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp File "/opt/platform/helm/20.06/stx-openstack/1.0-1-centos-stable-versioned/plugins/k8sapp_openstack/helm/helm_toolkit.py", line 26, in get_overrides
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp common.HELM_NS_HELM_TOOLKIT: {}
/var/log/sysinv.log:2020-06-21 02:33:25.448 2063516 ERROR sysinv.openstack.common.rpc.amqp AttributeError: 'NoneType' object has no attribute 'HELM_NS_HELM_TOOLKIT'
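
The last frame shows the name `common` resolving to `None` inside the plugin's `get_overrides`. One plausible reading, not confirmed anywhere in this report, is that the plugin module's globals were cleared when the plugin was deactivated (Python 2 sets a module's globals to None when the module object is deallocated) while an object created from that module was still being called. The sketch below only simulates that situation in isolation; `fake_common`, `fake_plugin`, and `FakeHelmToolkit` are hypothetical stand-ins, not sysinv code, and only the attribute name is taken from the traceback.

```python
import types

# Hypothetical stand-in for the module imported as `common` by the plugin.
fake_common = types.ModuleType("fake_common")
fake_common.HELM_NS_HELM_TOOLKIT = "helm-toolkit"

# Hypothetical stand-in for the plugin module (helm_toolkit.py in the traceback).
plugin_source = '''
class FakeHelmToolkit(object):
    def get_overrides(self):
        # The module-level name `common` is looked up at call time.
        return {common.HELM_NS_HELM_TOOLKIT: {}}
'''
fake_plugin = types.ModuleType("fake_plugin")
fake_plugin.common = fake_common
exec(plugin_source, fake_plugin.__dict__)

# While the plugin module is intact, the call works.
operator = fake_plugin.FakeHelmToolkit()
overrides_while_active = operator.get_overrides()

# Simulate module teardown: the global is cleared to None while
# `operator` is still referenced by a caller.
fake_plugin.__dict__["common"] = None

try:
    operator.get_overrides()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```

If this reading is right, any code path that still calls into a removed application's plugin would fail the same way, which matches `evaluate_app_reapply` being reached from `perform_app_remove` in the traceback.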

Severity
--------
Major

Steps to Reproduce
------------------
- Install and configure system
- Apply and configure stx-openstack
- Remove stx-openstack

Expected Behavior
-----------------
- system remains healthy

Actual Behavior
---------------
- config out-of-date alarms appeared on all hosts, and swact could not be performed

The workaround is to lock and unlock all hosts.

Reproducibility
---------------
Reproducible

System Configuration
--------------------
Two node system
Lab-name: ip-5-6

Branch/Pull Time/Commit
-----------------------
20200619 load

Last Pass
---------
20200616 load

Timestamp/Logs
--------------
2020-06-21T02:31:37.000 controller-1 -sh: info HISTORY: PID=3716729 UID=42425 system application-remove stx-openstack

Test Activity
-------------
Regression Testing

Yang Liu (yliu12)
summary: - config-out-of-date on all nodes after removing stx-open
+ config-out-of-date on all nodes after removing stx-openstack
Ghada Khalil (gkhalil) wrote :

This appears to be introduced by recent code changes related to pod security policy:
https://review.opendev.org/#/c/736002/
Merged on 2020-06-17

Ghada Khalil (gkhalil)
description: updated
Ghada Khalil (gkhalil)
tags: added: stx.containers
Ghada Khalil (gkhalil) wrote :

stx.4.0 / high priority - given this appears to be recently introduced (at least that's the current theory).

Changed in starlingx:
assignee: nobody → Jerry Sun (jerry-sun-u)
importance: Undecided → High
status: New → Triaged
tags: added: stx.4.0 stx.distro.openstack
Ghada Khalil (gkhalil)
tags: added: stx.apps
Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: Jerry Sun (jerry-sun-u) → Bob Church (rchurch)
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/737401
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=c6e6c914d53b7b5a60fef12d87b27ff3286fa52d
Submitter: Zuul
Branch: master

commit c6e6c914d53b7b5a60fef12d87b27ff3286fa52d
Author: Robert Church <email address hidden>
Date: Mon Jun 22 20:00:22 2020 -0400

    Provide plugin access beyond application removal

    Application plugins are currently deactivated once the application is
    removed and is no longer active in the k8s cluster. For more complex
    applications, allow the plugin deactivation to be deferred so that the
    plugins are still available for post removal platform configuration
    actions.

    After a platform configuration action only evaluate app re-applies for
    'applied' applications.

    Makes sure an app is active and applied before re-evaluating the
    overrides when determining if a re-apply is required.

    Remove log pollution from the HelmOperator. The logs aren't overly
    useful. Move them from info to debug.

    Add a synchronized lock around the key plugin setup/access methods in
    the HelmOperator.

    Change-Id: I2ba2f3f1da83aae73100507bd6795e68438940d3
    Closes-Bug: #1884408
    Signed-off-by: Robert Church <email address hidden>
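
As a rough illustration of two points in the commit message above (this is not the merged code), the sketch below skips re-apply evaluation unless an application is both active and in the 'applied' state, and guards plugin setup/access with a lock. `App`, `PluginRegistry`, `should_evaluate_reapply`, `apps_to_reevaluate`, and the sample app names and the 'uploaded' status are assumptions made for the example; only the 'applied' state and the idea of a synchronized lock come from the commit message.

```python
import threading
from collections import namedtuple

# Hypothetical record; the real application object carries more fields.
App = namedtuple("App", ["name", "status", "active"])

APP_APPLIED = "applied"  # state named in the commit message


class PluginRegistry(object):
    """Toy stand-in for plugin bookkeeping; every access takes the same lock."""

    def __init__(self):
        self._lock = threading.RLock()
        self._plugins = {}

    def activate(self, app_name, plugin):
        with self._lock:
            self._plugins[app_name] = plugin

    def deactivate(self, app_name):
        with self._lock:
            self._plugins.pop(app_name, None)

    def get(self, app_name):
        with self._lock:
            return self._plugins.get(app_name)


def should_evaluate_reapply(app):
    """Only an active, applied app has overrides worth re-evaluating."""
    return app.active and app.status == APP_APPLIED


def apps_to_reevaluate(apps, registry):
    """Return the names of apps whose overrides would be regenerated."""
    return [a.name for a in apps
            if should_evaluate_reapply(a) and registry.get(a.name) is not None]


# Usage: a removed app is skipped, an applied one is selected.
registry = PluginRegistry()
registry.activate("platform-integ-apps", object())
apps = [
    App("stx-openstack", "uploaded", False),
    App("platform-integ-apps", APP_APPLIED, True),
]
selected = apps_to_reevaluate(apps, registry)
```

With a guard like this, a platform configuration action that follows an application removal would not attempt to generate overrides for the removed application.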

Changed in starlingx:
status: In Progress → Fix Released