Brief Description
After a fresh install of WRCP with Kubernetes version 1.20.9, the metrics-server app 1.0-4 fails to upload.
Severity
Critical
Steps to Reproduce
Install WRCP_Dev_Build in any configuration with k8s version 1.20.9.
Log in to the system and run: source /etc/platform/openrc
Change to the working directory: cd /usr/local/share/applications/helm
Upload the metrics-server application: system application-upload metrics-server-1.0-4.tgz
Expected Behavior
The metrics-server application should upload successfully.
Actual Behavior
The metrics-server application fails to upload with the error:
"Upload of application metrics-server (1.0-4) failed: Failed to validate application manifest."
Reproducibility
Reproducible
System Configuration
Any
Branch/Pull Time/Commit
-
Last Pass
Works well with k8s 1.19.
Timestamp/Logs
sysinv 2021-10-19 01:42:44.386 117875 INFO sysinv.api.controllers.v1.kube_app [-] Tar file of application metrics-server verified.
sysinv 2021-10-19 01:42:44.401 116262 INFO sysinv.conductor.kube_app [-] Application metrics-server (1.0-4) upload started.
sysinv 2021-10-19 01:42:44.466 116262 INFO sysinv.conductor.kube_app [-] PluginHelper: metrics-server does not contains any platform plugins.
sysinv 2021-10-19 01:42:45.645 116262 INFO sysinv.conductor.kube_app [-] Copy /opt/platform/armada/21.12/metrics-server to armada-api-556879b56f-2pfsc:/tmp/manifests .
sysinv 2021-10-19 01:42:45.869 116262 INFO sysinv.conductor.kube_app [-] Copy /opt/platform/helm/21.12/metrics-server to armada-api-556879b56f-2pfsc:/tmp/overrides .
sysinv 2021-10-19 01:42:45.941 116262 ERROR sysinv.conductor.kube_app [-] Failed to copy /opt/platform/helm/21.12/metrics-server to armada-api-556879b56f-2pfsc:/tmp/overrides, error: Unexpected error while running command.
Command: kubectl --kubeconfig /etc/kubernetes/admin.conf cp -n armada /opt/platform/helm/21.12/metrics-server armada-api-556879b56f-2pfsc:/tmp/overrides --container armada-api
Exit code: 1
Stdout: ''
Stderr: "error: /opt/platform/helm/21.12/metrics-server doesn't exist in local filesystem\n"
sysinv 2021-10-19 01:42:45.947 116262 WARNING sysinv.common.kubernetes [-] Failed to delete custom object, Namespace kube-system: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"locks.armada.process \"locks.armada.process.lock\" not found","reason":"NotFound","details":{"name":"locks.armada.process.lock","group":"armada.process","kind":"locks"},"code":404} : ApiException: (404)
sysinv 2021-10-19 01:42:45.948 116262 ERROR sysinv.conductor.kube_app [-] Armada request validate for manifest /manifests/metrics-server/1.0-4/metrics-server-metrics-server_manifest.yaml failed: could not access armada pod : RuntimeError: could not access armada pod
sysinv 2021-10-19 01:42:45.949 116262 ERROR sysinv.conductor.kube_app [-] Upload of application metrics-server (1.0-4) failed: Failed to validate application manifest.: KubeAppUploadFailure: Upload of application metrics-server (1.0-4) failed: Failed to validate application manifest.
2021-10-19 01:42:45.949 116262 ERROR sysinv.conductor.kube_app Traceback (most recent call last):
2021-10-19 01:42:45.949 116262 ERROR sysinv.conductor.kube_app File "/usr/lib64/python2.7/site-packages/sysinv/conductor/kube_app.py", line 1812, in perform_app_upload
2021-10-19 01:42:45.949 116262 ERROR sysinv.conductor.kube_app reason="Failed to validate application manifest.")
2021-10-19 01:42:45.949 116262 ERROR sysinv.conductor.kube_app KubeAppUploadFailure: Upload of application metrics-server (1.0-4) failed: Failed to validate application manifest.
2021-10-19 01:42:45.949 116262 ERROR sysinv.conductor.kube_app
sysinv 2021-10-19 01:42:45.999 116262 ERROR sysinv.conductor.kube_app [-] Application upload aborted!.: KubeAppUploadFailure: Upload of application metrics-server (1.0-4) failed: Failed to validate application manifest.
sysinv 2021-10-19 01:42:46.000 116262 ERROR sysinv.openstack.common.rpc.amqp [-] Exception during message handling: KubeAppUploadFailure: Upload of application metrics-server (1.0-4) failed: Failed to validate application manifest.
2021-10-19 01:42:46.000 116262 ERROR sysinv.openstack.common.rpc.amqp Traceback (most recent call last):
2021-10-19 01:42:46.000 116262 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/openstack/common/rpc/amqp.py", line 437, in _process_data
2021-10-19 01:42:46.000 116262 ERROR sysinv.openstack.common.rpc.amqp **args)
2021-10-19 01:42:46.000 116262 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/openstack/common/rpc/dispatcher.py", line 172, in dispatch
2021-10-19 01:42:46.000 116262 ERROR sysinv.openstack.common.rpc.amqp result = getattr(proxyobj, method)(ctxt, **kwargs)
2021-10-19 01:42:46.000 116262 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/conductor/manager.py", line 13100, in perform_app_upload
2021-10-19 01:42:46.000 116262 ERROR sysinv.openstack.common.rpc.amqp self._app.perform_app_upload(rpc_app, tarfile, lifecycle_hook_info_app_upload, images)
2021-10-19 01:42:46.000 116262 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sysinv/conductor/kube_app.py", line 1812, in perform_app_upload
2021-10-19 01:42:46.000 116262 ERROR sysinv.openstack.common.rpc.amqp reason="Failed to validate application manifest.")
2021-10-19 01:42:46.000 116262 ERROR sysinv.openstack.common.rpc.amqp KubeAppUploadFailure: Upload of application metrics-server (1.0-4) failed: Failed to validate application manifest.
2021-10-19 01:42:46.000 116262 ERROR sysinv.openstack.common.rpc.amqp
Alarms
-
Test Activity
Developer Testing
Workaround
I suspect the problem dates back to http://bitbucket.wrs.com/projects/CGCS/repos/opendev.org.starlingx.config/commits/f53c96f7dfdc3787fa176f90e94dc48dca7f1db5 (Add support for Helm v3 and containerized armada).
As a workaround, I tried adding the following code in copy_manifests_and_overrides_to_armada(), and it allowed me to upload and apply the application:
    if not os.path.exists(src_dir):
        LOG.info("%s doesn't exist, skipping" % src_dir)
        continue
I'm not sure if this is the correct fix or if we should ensure that the "/opt/platform/helm/21.12/<application>" directory gets created earlier.
According to https://airshipit.readthedocs.io/projects/armada/en/latest/commands/validate.html the "validate" command only takes the manifest file, so we don't need the overrides to be available yet when validating it.
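To make the workaround easier to discuss, here is a minimal standalone sketch of the skip-if-missing guard. It is not the actual sysinv code: copy_existing_dirs, dir_pairs and copy_fn are hypothetical names introduced for illustration, and copy_fn stands in for whatever performs the real transfer (the log above shows kubectl cp being used).

```python
import logging
import os

LOG = logging.getLogger(__name__)


def copy_existing_dirs(dir_pairs, copy_fn):
    """Copy each (src_dir, dest_dir) pair, skipping sources that are missing.

    Hypothetical helper for illustration only. copy_fn is an assumed
    callable taking (src_dir, dest_dir); in the real code path the copy
    is done with `kubectl cp` into the armada pod.
    Returns the list of source directories that were actually copied.
    """
    copied = []
    for src_dir, dest_dir in dir_pairs:
        # Same guard as the workaround: a missing overrides directory
        # is logged and skipped instead of failing the whole upload.
        if not os.path.exists(src_dir):
            LOG.info("%s doesn't exist, skipping", src_dir)
            continue
        copy_fn(src_dir, dest_dir)
        copied.append(src_dir)
    return copied
```

With this guard, a missing overrides directory no longer aborts the manifest copy, which matches the observation that validation only needs the manifest. The alternative mentioned above, creating the overrides directory before the copy step, would avoid the skip entirely; which of the two is preferable is still an open question.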