System Application stuck in "remove-failed" state

Bug #1987115 reported by Leonardo Fagundes Luz Serrano
Affects    Status        Importance  Assigned to                    Milestone
StarlingX  Fix Released  Medium      Leonardo Fagundes Luz Serrano

Bug Description

Brief Description
-----------------
If an 'application-apply' command ends in the 'apply-failed' state because the docker images could not be downloaded, a subsequent 'application-remove' command fails and leaves the app in the 'remove-failed' state instead of 'uploaded'.

Severity
--------
<Critical: System/Feature is not usable due to the defect>

Steps to Reproduce
------------------
(Without a route to the docker registries)

system application-upload /usr/local/share/applications/helm/ptp-notification-1.0-60.tgz
system application-apply ptp-notification

(The apply fails because the external docker registry is unreachable; sysinv.log reports that the image download failed)

system application-remove ptp-notification

(The remove fails and the app ends in the 'remove-failed' state)
(Repeating the remove or using --force doesn't fix the issue)

Expected Behavior
------------------
system application-remove should delete all pods and leave the application in the 'uploaded' state

Actual Behavior
----------------
Application is stuck in remove-failed and cannot be applied or deleted

Reproducibility
---------------
Reproducible

System Configuration
--------------------
AIO-SX

Branch/Pull Time/Commit
-----------------------
BUILD_ID="2022-06-25_02-46-51"

Last Pass
---------

Timestamp/Logs
--------------

=== APPLY FAILED ===
sysinv 2022-08-11 21:30:02.110 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application snmp (1.0-1) started {'mode': 'manual', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'apply', 'extra': {}}.
sysinv 2022-08-11 21:30:02.129 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application snmp (1.0-1) started {'mode': 'manual', 'lifecycle_type': 'operation', 'relative_timing': 'pre', 'operation': 'apply', 'extra': {}}.
sysinv 2022-08-11 21:30:02.131 76553 INFO sysinv.conductor.kube_app [-] Register the initial abort status of app snmp
sysinv 2022-08-11 21:30:02.875 76553 INFO sysinv.conductor.kube_app [-] Application snmp (1.0-1) apply started.
sysinv 2022-08-11 21:30:02.898 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application snmp (1.0-1) started {'mode': 'manual', 'lifecycle_type': 'resource', 'relative_timing': 'pre', 'operation': 'apply', 'extra': {}}.
sysinv 2022-08-11 21:30:02.928 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application snmp (1.0-1) started {'mode': 'manual', 'lifecycle_type': 'rbd', 'relative_timing': 'pre', 'operation': 'apply', 'extra': {}}.
sysinv 2022-08-11 21:30:02.987 76553 INFO sysinv.conductor.kube_app [-] Generating application overrides...
sysinv 2022-08-11 21:30:03.102 76553 INFO sysinv.helm.kustomize_base [-] helmrelease_resource_map: {'ptp-notification': {'name': 'ptp-notification', 'namespace': 'notification', 'resource': 'ptp-notification'}, 'ptp-notification-psp-rolebinding': {'name': 'ptp-notification-psp-rolebinding', 'namespace': 'notification', 'resource': 'ptp-notification-psp-rolebinding'}, 'snmp': {'name': 'snmp', 'namespace': 'kube-system', 'resource': 'snmp'}}
sysinv 2022-08-11 21:30:03.137 76553 INFO sysinv.helm.kustomize_base [-] /opt/platform/fluxcd/22.12/snmp/1.0-1/snmp-fluxcd-manifests/helmrelease_cleanup.yaml is not needed. All charts are enabled.
sysinv 2022-08-11 21:30:03.138 76553 INFO sysinv.conductor.kube_app [-] Application overrides generated.
sysinv 2022-08-11 21:30:03.168 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-snmp:stx.6.0-v1.0.1 download started from local registry
sysinv 2022-08-11 21:30:03.179 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-subagent:stx.6.0-v1.0.3 download started from local registry
sysinv 2022-08-11 21:30:03.198 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-trap-subagent:stx.7.0-v1.0.3 download started from local registry
sysinv 2022-08-11 21:30:03.259 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-snmp:stx.6.0-v1.0.1 is not available in local registry, download started from public/private registry
sysinv 2022-08-11 21:30:03.944 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-subagent:stx.6.0-v1.0.3 is not available in local registry, download started from public/private registry
sysinv 2022-08-11 21:30:04.028 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-trap-subagent:stx.7.0-v1.0.3 is not available in local registry, download started from public/private registry
sysinv 2022-08-11 21:30:18.118 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application cert-manager (1.0-1) started {'mode': 'auto', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'update', 'extra': {'from_app': True}}.
sysinv 2022-08-11 21:30:19.994 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application platform-integ-apps (1.0-1) started {'mode': 'auto', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'update', 'extra': {'from_app': True}}.
sysinv 2022-08-11 21:30:23.800 76553 ERROR sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-snmp:stx.6.0-v1.0.1 download failed from public/privateregistry: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host"): docker.errors.APIError: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host")
sysinv 2022-08-11 21:30:23.801 76553 INFO sysinv.conductor.kube_app [-] Failed to download image: registry.local:9001/docker.io/starlingx/stx-snmp:stx.6.0-v1.0.1
sysinv 2022-08-11 21:30:23.802 76553 INFO sysinv.conductor.kube_app [-] Retry docker images download for application snmp after 65 seconds
sysinv 2022-08-11 21:30:23.803 76553 ERROR sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-subagent:stx.6.0-v1.0.3 download failed from public/privateregistry: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host"): docker.errors.APIError: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host")
sysinv 2022-08-11 21:30:23.805 76553 ERROR sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-trap-subagent:stx.7.0-v1.0.3 download failed from public/privateregistry: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host"): docker.errors.APIError: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host")
sysinv 2022-08-11 21:31:18.201 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application cert-manager (1.0-1) started {'mode': 'auto', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'update', 'extra': {'from_app': True}}.
sysinv 2022-08-11 21:31:19.680 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application platform-integ-apps (1.0-1) started {'mode': 'auto', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'update', 'extra': {'from_app': True}}.
sysinv 2022-08-11 21:31:29.195 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-snmp:stx.6.0-v1.0.1 download started from local registry
sysinv 2022-08-11 21:31:29.207 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-subagent:stx.6.0-v1.0.3 download started from local registry
sysinv 2022-08-11 21:31:29.292 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-trap-subagent:stx.7.0-v1.0.3 download started from local registry
sysinv 2022-08-11 21:31:29.356 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-snmp:stx.6.0-v1.0.1 is not available in local registry, download started from public/private registry
sysinv 2022-08-11 21:31:29.443 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-subagent:stx.6.0-v1.0.3 is not available in local registry, download started from public/private registry
sysinv 2022-08-11 21:31:29.497 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-trap-subagent:stx.7.0-v1.0.3 is not available in local registry, download started from public/private registry
sysinv 2022-08-11 21:31:49.486 76553 ERROR sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-subagent:stx.6.0-v1.0.3 download failed from public/privateregistry: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host"): docker.errors.APIError: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host")
sysinv 2022-08-11 21:31:49.489 76553 ERROR sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-snmp:stx.6.0-v1.0.1 download failed from public/privateregistry: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host"): docker.errors.APIError: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host")
sysinv 2022-08-11 21:31:49.489 76553 INFO sysinv.conductor.kube_app [-] Failed to download image: registry.local:9001/docker.io/starlingx/stx-snmp:stx.6.0-v1.0.1
sysinv 2022-08-11 21:31:49.490 76553 INFO sysinv.conductor.kube_app [-] Retry docker images download for application snmp after 129 seconds
sysinv 2022-08-11 21:31:49.492 76553 ERROR sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-trap-subagent:stx.7.0-v1.0.3 download failed from public/privateregistry: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host"): docker.errors.APIError: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host")
sysinv 2022-08-11 21:32:18.201 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application cert-manager (1.0-1) started {'mode': 'auto', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'update', 'extra': {'from_app': True}}.
sysinv 2022-08-11 21:32:19.635 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application platform-integ-apps (1.0-1) started {'mode': 'auto', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'update', 'extra': {'from_app': True}}.
sysinv 2022-08-11 21:32:27.394 77018 INFO sysinv.api.controllers.v1.host [-] controller-0 ihost_patch_start_2022-08-11-21-32-27 patch
sysinv 2022-08-11 21:32:27.394 77018 INFO sysinv.api.controllers.v1.host [-] controller-0 ihost_patch_end. No changes from mtce/1.0.
sysinv 2022-08-11 21:33:18.590 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application cert-manager (1.0-1) started {'mode': 'auto', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'update', 'extra': {'from_app': True}}.
sysinv 2022-08-11 21:33:20.164 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application platform-integ-apps (1.0-1) started {'mode': 'auto', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'update', 'extra': {'from_app': True}}.
sysinv 2022-08-11 21:33:58.713 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-snmp:stx.6.0-v1.0.1 download started from local registry
sysinv 2022-08-11 21:33:58.739 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-subagent:stx.6.0-v1.0.3 download started from local registry
sysinv 2022-08-11 21:33:58.823 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-trap-subagent:stx.7.0-v1.0.3 download started from local registry
sysinv 2022-08-11 21:33:58.920 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-snmp:stx.6.0-v1.0.1 is not available in local registry, download started from public/private registry
sysinv 2022-08-11 21:33:58.976 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-subagent:stx.6.0-v1.0.3 is not available in local registry, download started from public/private registry
sysinv 2022-08-11 21:33:59.045 76553 INFO sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-trap-subagent:stx.7.0-v1.0.3 is not available in local registry, download started from public/private registry
sysinv 2022-08-11 21:34:18.926 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application cert-manager (1.0-1) started {'mode': 'auto', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'update', 'extra': {'from_app': True}}.
sysinv 2022-08-11 21:34:19.220 76553 ERROR sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-snmp:stx.6.0-v1.0.1 download failed from public/privateregistry: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host"): docker.errors.APIError: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host")
sysinv 2022-08-11 21:34:19.221 76553 ERROR sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-subagent:stx.6.0-v1.0.3 download failed from public/privateregistry: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host"): docker.errors.APIError: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host")
sysinv 2022-08-11 21:34:19.222 76553 ERROR sysinv.conductor.kube_app [-] Image registry.local:9001/docker.io/starlingx/stx-fm-trap-subagent:stx.7.0-v1.0.3 download failed from public/privateregistry: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host"): docker.errors.APIError: 500 Server Error: Internal Server Error ("Get "https://admin-2.cumulus.wrs.com:30093/v2/": dial tcp: lookup admin-2.cumulus.wrs.com on 192.168.204.1:53: no such host")
sysinv 2022-08-11 21:34:19.223 76553 INFO sysinv.conductor.kube_app [-] Failed to download image: registry.local:9001/docker.io/starlingx/stx-snmp:stx.6.0-v1.0.1
sysinv 2022-08-11 21:34:19.224 76553 ERROR sysinv.conductor.kube_app [-] Deployment of application snmp (1.0-1) failed: failed to download one or more image(s).: sysinv.common.exception.KubeAppApplyFailure: Deployment of application snmp (1.0-1) failed: failed to download one or more image(s).
2022-08-11 21:34:19.224 76553 ERROR sysinv.conductor.kube_app Traceback (most recent call last):
2022-08-11 21:34:19.224 76553 ERROR sysinv.conductor.kube_app File "/usr/lib/python3/dist-packages/sysinv/conductor/kube_app.py", line 2740, in perform_app_apply
2022-08-11 21:34:19.224 76553 ERROR sysinv.conductor.kube_app self.download_images(app)
2022-08-11 21:34:19.224 76553 ERROR sysinv.conductor.kube_app File "/usr/lib/python3/dist-packages/sysinv/conductor/kube_app.py", line 929, in download_images
2022-08-11 21:34:19.224 76553 ERROR sysinv.conductor.kube_app raise exception.KubeAppApplyFailure(
2022-08-11 21:34:19.224 76553 ERROR sysinv.conductor.kube_app sysinv.common.exception.KubeAppApplyFailure: Deployment of application snmp (1.0-1) failed: failed to download one or more image(s).
2022-08-11 21:34:19.224 76553 ERROR sysinv.conductor.kube_app
sysinv 2022-08-11 21:34:19.287 76553 ERROR sysinv.conductor.kube_app [-] Application apply aborted!.: sysinv.common.exception.KubeAppApplyFailure: Deployment of application snmp (1.0-1) failed: failed to download one or more image(s).
sysinv 2022-08-11 21:34:19.288 76553 INFO sysinv.conductor.kube_app [-] Deregister the abort status of app snmp
sysinv 2022-08-11 21:34:19.289 76553 ERROR sysinv.openstack.common.rpc.amqp [-] Exception during message handling: sysinv.common.exception.KubeAppApplyFailure: Deployment of application snmp (1.0-1) failed: failed to download one or more image(s).
2022-08-11 21:34:19.289 76553 ERROR sysinv.openstack.common.rpc.amqp Traceback (most recent call last):
2022-08-11 21:34:19.289 76553 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib/python3/dist-packages/sysinv/openstack/common/rpc/amqp.py", line 435, in _process_data
2022-08-11 21:34:19.289 76553 ERROR sysinv.openstack.common.rpc.amqp rval = self.proxy.dispatch(ctxt, version, method, namespace,
2022-08-11 21:34:19.289 76553 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib/python3/dist-packages/sysinv/openstack/common/rpc/dispatcher.py", line 172, in dispatch
2022-08-11 21:34:19.289 76553 ERROR sysinv.openstack.common.rpc.amqp result = getattr(proxyobj, method)(ctxt, **kwargs)
2022-08-11 21:34:19.289 76553 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib/python3/dist-packages/sysinv/conductor/manager.py", line 14250, in perform_app_apply
2022-08-11 21:34:19.289 76553 ERROR sysinv.openstack.common.rpc.amqp app_applied = self._app.perform_app_apply(rpc_app, mode, lifecycle_hook_info_app_apply)
2022-08-11 21:34:19.289 76553 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib/python3/dist-packages/sysinv/conductor/kube_app.py", line 2740, in perform_app_apply
2022-08-11 21:34:19.289 76553 ERROR sysinv.openstack.common.rpc.amqp self.download_images(app)
2022-08-11 21:34:19.289 76553 ERROR sysinv.openstack.common.rpc.amqp File "/usr/lib/python3/dist-packages/sysinv/conductor/kube_app.py", line 929, in download_images
2022-08-11 21:34:19.289 76553 ERROR sysinv.openstack.common.rpc.amqp raise exception.KubeAppApplyFailure(
2022-08-11 21:34:19.289 76553 ERROR sysinv.openstack.common.rpc.amqp sysinv.common.exception.KubeAppApplyFailure: Deployment of application snmp (1.0-1) failed: failed to download one or more image(s).
2022-08-11 21:34:19.289 76553 ERROR sysinv.openstack.common.rpc.amqp
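
The apply log above shows three download attempts, with retries scheduled after 65 and then 129 seconds before the apply is aborted. As a reading aid, the following minimal sketch pulls the retry delays and the failed image names out of sysinv.log lines. The regular expressions are derived only from the message texts shown in this report, and the function name is hypothetical; it is not part of sysinv.

```python
import re

# Patterns based solely on the log messages quoted in this bug report.
RETRY_RE = re.compile(
    r"Retry docker images download for application (\S+) after (\d+) seconds"
)
FAILED_RE = re.compile(r"Failed to download image: (\S+)")


def summarize_apply_log(lines):
    """Return (retries, failed_images) found in an iterable of log lines.

    retries is a list of (app_name, delay_seconds) in log order;
    failed_images is a sorted list of unique image references.
    """
    retries = []
    failed = set()
    for line in lines:
        match = RETRY_RE.search(line)
        if match:
            retries.append((match.group(1), int(match.group(2))))
            continue
        match = FAILED_RE.search(line)
        if match:
            failed.add(match.group(1))
    return retries, sorted(failed)
```

Feeding it the lines of the APPLY FAILED excerpt would list the two scheduled retries for snmp and the image named in the "Failed to download image" messages.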

=== REMOVE FAILED ===
sysinv 2022-08-11 21:34:36.337 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application ptp-notification (1.0-1) started {'mode': 'manual', 'lifecycle_type': 'check', 'relative_timing': 'pre', 'operation': 'remove', 'extra': {'force': False}}.
sysinv 2022-08-11 21:34:36.358 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application ptp-notification (1.0-1) started {'mode': 'manual', 'lifecycle_type': 'operation', 'relative_timing': 'pre', 'operation': 'remove', 'extra': {}}.
sysinv 2022-08-11 21:34:36.358 76553 INFO sysinv.conductor.kube_app [-] Register the initial abort status of app ptp-notification
sysinv 2022-08-11 21:34:36.397 76553 INFO sysinv.conductor.kube_app [-] Application (ptp-notification) remove started.
sysinv 2022-08-11 21:34:36.445 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application ptp-notification (1.0-1) started {'lifecycle_type': 'fluxcd-request', 'relative_timing': 'pre', 'operation': 'delete', 'extra': {}}.
sysinv 2022-08-11 21:34:36.446 76553 INFO sysinv.conductor.kube_app [-] Doing FluxCD operation delete with the following manifest: /opt/platform/fluxcd/22.12/ptp-notification/1.0-1/ptp-notification-fluxcd-manifests
sysinv 2022-08-11 21:34:37.753 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application ptp-notification (1.0-1) started {'lifecycle_type': 'fluxcd-request', 'relative_timing': 'post', 'operation': 'delete', 'extra': {'rc': False}}.
sysinv 2022-08-11 21:34:37.904 76553 ERROR sysinv.conductor.kube_app [-] Application remove aborted!.
sysinv 2022-08-11 21:34:37.904 76553 INFO sysinv.conductor.kube_app [-] Deregister the abort status of app ptp-notification
sysinv 2022-08-11 21:34:37.905 76553 INFO sysinv.conductor.kube_app [-] lifecycle hook for application ptp-notification (1.0-1) started {'mode': 'manual', 'lifecycle_type': 'operation', 'relative_timing': 'post', 'operation': 'remove', 'extra': {'app_removed': False}}.
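
In the remove log above, the only hint about the cause is in the lifecycle hook payloads: the post 'fluxcd-request' hook carries 'extra': {'rc': False} and the post 'operation' hook carries 'app_removed': False. The sketch below parses such a line into a Python dict so those flags can be checked programmatically. It assumes the payload is always a valid Python literal at the end of the line, as in the lines shown here; the function name is hypothetical.

```python
import ast
import re

# Matches "lifecycle hook for application <name> (<version>) started {...}."
HOOK_RE = re.compile(
    r"lifecycle hook for application (\S+) \((\S+)\) started (\{.*\})\.?\s*$"
)


def parse_hook_line(line):
    """Return (app_name, app_version, hook_info_dict), or None if no match."""
    match = HOOK_RE.search(line)
    if match is None:
        return None
    name, version, raw_info = match.groups()
    # literal_eval only accepts Python literals, so it is safe on log text.
    return name, version, ast.literal_eval(raw_info)
```

A caller could then test `info["extra"].get("rc") is False` on the 'fluxcd-request' post hook to spot the failed FluxCD delete.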

Test Activity
-------------

Workaround
----------
sudo -u postgres psql -U postgres -d sysinv -c "update kube_app set status='uploaded' where name='<app_name>'"
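
Because the workaround splices <app_name> directly into an SQL statement, it is worth validating the name before running it. The following sketch builds the same psql command as an argument list after checking the name against a conservative pattern. The pattern and the function name are assumptions made for illustration; the command itself is the one given above, and the sketch only constructs it without executing anything.

```python
import re

# Assumption: app names consist of letters, digits, '.', '_' and '-'.
APP_NAME_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")


def build_workaround_cmd(app_name):
    """Return the workaround psql command as an argv list for app_name."""
    if not APP_NAME_RE.match(app_name):
        raise ValueError(f"unexpected application name: {app_name!r}")
    sql = f"update kube_app set status='uploaded' where name='{app_name}'"
    return [
        "sudo", "-u", "postgres",
        "psql", "-U", "postgres", "-d", "sysinv",
        "-c", sql,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on the controller, which avoids shell quoting issues around the SQL string.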

OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/config/+/853831

Changed in starlingx:
status: New → In Progress
Changed in starlingx:
assignee: nobody → Leonardo Fagundes Luz Serrano (lfagunde)
Ghada Khalil (gkhalil)
tags: added: stx.apps
Changed in starlingx:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Related fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/855516
Committed: https://opendev.org/starlingx/config/commit/7f8a5fb2424dfe7a4f1c3310c34d4c4b24fde4ac
Submitter: "Zuul (22348)"
Branch: master

commit 7f8a5fb2424dfe7a4f1c3310c34d4c4b24fde4ac
Author: Leonardo Fagundes Luz Serrano <email address hidden>
Date: Thu Sep 1 10:13:15 2022 -0300

    AppFwk: remove -f prevents remove-failed state

    Currently, there is no path inside the appfwk to get an app
    from 'remove-failed' state to any other state.

    This commit makes it so that using remove --force
    will prevent the app from being put in remove-failed
    if the operation fails.

    Instead, the app is put in 'uploaded' state
    and a progress message warning about this is set.

    remove --force can also be used to recover the app
    from remove-failed state for a posterior delete.

    Test Plan:
    PASS: remove (without -f) results in remove-failed
          state in case of an error
    PASS: remove --force results in uploaded state
          instead of remove-failed in case of an error
          and the progress message is set.
          (tested for apply-failed and remove-failed)
    PASS: remove --force does not set the warning
          progress message when the remove succeeds

    Related-Bug: 1987115
    Signed-off-by: Leonardo Fagundes Luz Serrano <email address hidden>
    Change-Id: Iba659c05bf9abd28b0319e6c438141f9aa1c9240
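
The state handling described in this commit message and its test plan can be summarized as a small decision function. This is a sketch of the described behavior only, not the code merged in the review. The function and type names are hypothetical, and the warning text is a placeholder because the actual progress message is not quoted in this report.

```python
from typing import NamedTuple, Optional


class RemoveOutcome(NamedTuple):
    status: str
    progress: Optional[str]


# Placeholder text: the real progress message is not shown in the report.
FORCE_WARNING = "remove --force: operation failed, app moved to 'uploaded'"


def outcome_after_remove(succeeded: bool, force: bool) -> RemoveOutcome:
    """Map a remove result to the app state described in the commit."""
    if succeeded:
        # A successful remove never sets the warning, with or without --force.
        return RemoveOutcome("uploaded", None)
    if force:
        # Failed remove with --force: 'uploaded' plus a warning message.
        return RemoveOutcome("uploaded", FORCE_WARNING)
    # Failed remove without --force keeps the previous behavior.
    return RemoveOutcome("remove-failed", None)
```

Under this model, an app already in 'remove-failed' can be recovered by running the remove again with --force, which matches the recovery path the commit message describes.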

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/853831
Committed: https://opendev.org/starlingx/config/commit/b168ac952110d74337db4120c80aa1674336e632
Submitter: "Zuul (22348)"
Branch: master

commit b168ac952110d74337db4120c80aa1674336e632
Author: Leonardo Fagundes Luz Serrano <email address hidden>
Date: Fri Aug 19 13:02:30 2022 -0300

    Fixed app remove op when image download fails

    Fixed application-remove cmd putting app in 'remove-failed' state
    when used to remove an app which doesn't have any resources
    in kubernetes.
    (eg.: application-apply failed to download docker images)

    Added some missing error message logging.

    Test Plan:
    PASS: remove cmd changes app state from 'apply-failed' to 'uploaded'
          when apply cmd failed to download docker images

    Closes-Bug: 1987115

    Signed-off-by: Leonardo Fagundes Luz Serrano <email address hidden>
    Change-Id: I30191f9b90c40f6432cf75e141d12319046486a6
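
The behavior this commit describes can be restated as follows: when the app has no resources in Kubernetes (for example because the apply never got past the image download), the remove should end in 'uploaded' rather than 'remove-failed'. The sketch below only restates that rule; how the merged patch detects the condition is not shown in this report, so both parameters and the function name are hypothetical.

```python
def status_after_remove(has_k8s_resources: bool, delete_ok: bool) -> str:
    """Return the app status after application-remove.

    has_k8s_resources: whether the app ever created resources in Kubernetes.
    delete_ok: whether deleting those resources succeeded.
    """
    if not has_k8s_resources:
        # Nothing was deployed, so there is nothing to fail on.
        return "uploaded"
    return "uploaded" if delete_ok else "remove-failed"
```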

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
tags: added: stx.8.0