platform-integ-apps application apply-failed after host-unlock

Bug #1887541 reported by Peng Peng
Affects: StarlingX | Status: Fix Released | Importance: Medium | Assigned to: Frank Miller

Bug Description

Brief Description
-----------------
After the standby controller was unlocked with host-unlock, the platform-integ-apps application went into the apply-failed state. The armada log shows:
Existing deployment likely pending release=stx-rbd-provisioner, status=PENDING_UPGRADE

Severity
--------
Major

Steps to Reproduce
------------------
In the host CPU modification test cases, lock and then unlock the host.

TC-name: z_containers/test_isolcpus.py::TestIsolated_2P::test_isolated_2p_2_big_pod[2_0-fill_fill-static-best-effort-HT-AIO]

Expected Behavior
------------------
The platform-integ-apps application is applied successfully.

Actual Behavior
----------------
The platform-integ-apps application ends up in the apply-failed state.

Reproducibility
---------------
Intermittent 1/5
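
With a failure that shows up roughly once in five runs, a single passing run says little about whether the issue is gone. The sketch below estimates how many consecutive clean runs would be needed before a pass streak is unlikely to be luck. It assumes runs are independent and that the 1/5 rate is accurate; the function name and the 95% confidence default are illustrative choices, not part of the report.

```python
import math


def clean_runs_needed(failure_rate, confidence=0.95):
    """Return the number of consecutive passing runs needed so that the
    chance of seeing them all pass, if the bug were still present at
    `failure_rate`, drops below (1 - confidence).

    Assumes independent runs (an assumption, not something the report states).
    """
    if not 0 < failure_rate < 1:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - failure_rate) ** n <= 1 - confidence for the smallest integer n.
    return math.ceil(math.log(1 - confidence) / math.log(1 - failure_rate))


if __name__ == "__main__":
    # Reported reproducibility: intermittent, 1 in 5.
    print(clean_runs_needed(1 / 5))
```

Under these assumptions, the reported 1/5 rate works out to 14 consecutive clean lock/unlock cycles for 95% confidence.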

System Configuration
--------------------
Two node system

Lab-name: R430_3-4

Branch/Pull Time/Commit
-----------------------
2020-07-13_20-00-00

Last Pass
---------
2020-07-12_20-00-00

Timestamp/Logs
--------------
[2020-07-14 05:40:41,386] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne application-list'
[2020-07-14 05:40:42,671] 436 DEBUG MainThread ssh.expect :: Output:
+--------------------------+----------+-----------------------------------+----------------------------------------+----------+-----------+
| application | version | manifest name | manifest file | status | progress |
+--------------------------+----------+-----------------------------------+----------------------------------------+----------+-----------+
| cert-manager | 20.06-5 | cert-manager-manifest | certmanager-manifest.yaml | applied | completed |
| nginx-ingress-controller | 20.06-0 | nginx-ingress-controller-manifest | nginx_ingress_controller_manifest.yaml | applied | completed |
| oidc-auth-apps | 20.06-27 | oidc-auth-manifest | manifest.yaml | uploaded | completed |
| platform-integ-apps | 20.06-10 | platform-integration-manifest | manifest.yaml | applied | completed |
+--------------------------+----------+-----------------------------------+----------------------------------------+----------+-----------+

[2020-07-14 05:40:43,349] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-unlock controller-0'

[2020-07-14 05:57:41,800] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://[face::1]:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne application-list'
[2020-07-14 05:57:43,087] 436 DEBUG MainThread ssh.expect :: Output:
+--------------------------+----------+-----------------------------------+----------------------------------------+--------------+----------------------------------------+
| application | version | manifest name | manifest file | status | progress |
+--------------------------+----------+-----------------------------------+----------------------------------------+--------------+----------------------------------------+
| cert-manager | 20.06-5 | cert-manager-manifest | certmanager-manifest.yaml | applied | completed |
| nginx-ingress-controller | 20.06-0 | nginx-ingress-controller-manifest | nginx_ingress_controller_manifest.yaml | applied | completed |
| oidc-auth-apps | 20.06-27 | oidc-auth-manifest | manifest.yaml | uploaded | completed |
| platform-integ-apps | 20.06-10 | platform-integration-manifest | manifest.yaml | apply-failed | operation aborted, check logs for |
| | | | | | detail |
| | | | | | |
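
The before/after comparison of the two application-list outputs can be automated. The following is a minimal sketch that parses the bordered table as it appears in this log and reports any application whose status ends in "-failed". The helper names are hypothetical, and the parser only assumes the pipe-delimited layout shown above, including the wrapped "progress" cell.

```python
SAMPLE = """\
| application | version | manifest name | manifest file | status | progress |
| cert-manager | 20.06-5 | cert-manager-manifest | certmanager-manifest.yaml | applied | completed |
| oidc-auth-apps | 20.06-27 | oidc-auth-manifest | manifest.yaml | uploaded | completed |
| platform-integ-apps | 20.06-10 | platform-integration-manifest | manifest.yaml | apply-failed | operation aborted, check logs for |
| | | | | | detail |
| | | | | | |
"""


def parse_cli_table(text):
    """Parse a pipe-delimited CLI table into a list of row dicts.

    Border lines ('+---+') and non-table lines are skipped. A row whose
    first cell is empty is treated as the continuation of a wrapped cell
    in the previous row.
    """
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
            continue
        if cells[0] == "":
            if rows:
                # Append wrapped text to the matching column of the previous row.
                for key, value in zip(header, cells):
                    if value:
                        rows[-1][key] = (rows[-1][key] + " " + value).strip()
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def failed_apps(rows):
    """Return the names of applications whose status ends in '-failed'."""
    return [row["application"] for row in rows if row["status"].endswith("-failed")]


if __name__ == "__main__":
    print(failed_apps(parse_cli_table(SAMPLE)))
```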

armada log:
2020-07-14 05:45:56.526 1029 ERROR armada.handlers.armada [-] Chart deploy [kube-system-rbd-provisioner] failed: armada.exceptions.armada_exceptions.DeploymentLikelyPendingException: Existing deployment likely pending release=stx-rbd-provisioner, status=PENDING_UPGRADE, (last deployment age=308s) < (chart wait timeout=1800s)
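
Read literally, the message says the existing release was in PENDING_UPGRADE and its last deployment was only 308s old, which is less than the 1800s chart wait timeout, so armada treated it as a deployment still in progress and refused to proceed. The sketch below pulls those fields out of the log line so the remaining window can be computed. The regular expression is written against this one message only and is an assumption about its format, not a documented armada log contract.

```python
import re

# Pattern derived from the single log line in this report (an assumption).
PENDING_RE = re.compile(
    r"release=(?P<release>[\w.-]+), status=(?P<status>\w+), "
    r"\(last deployment age=(?P<age>\d+)s\) < "
    r"\(chart wait timeout=(?P<timeout>\d+)s\)"
)


def parse_pending(line):
    """Extract release, status, age and timeout from a
    DeploymentLikelyPendingException log line, or return None."""
    match = PENDING_RE.search(line)
    if match is None:
        return None
    age = int(match["age"])
    timeout = int(match["timeout"])
    return {
        "release": match["release"],
        "status": match["status"],
        "age_s": age,
        "timeout_s": timeout,
        # Seconds left before the release age would exceed the wait timeout.
        "remaining_s": timeout - age,
    }


if __name__ == "__main__":
    line = (
        "Existing deployment likely pending release=stx-rbd-provisioner, "
        "status=PENDING_UPGRADE, (last deployment age=308s) < "
        "(chart wait timeout=1800s)"
    )
    print(parse_pending(line))
```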

Test Activity
-------------
Sanity

Peng Peng (ppeng) wrote :
Ghada Khalil (gkhalil) wrote :

stx.5.0 / medium priority - intermittent issue related to tiller; may need to look at up-versioning tiller.

tags: added: stx.5.0 stx.containers
Changed in starlingx:
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Frank Miller (sensfan22)
Frank Miller (sensfan22) wrote :

As part of moving to helm v3, armada was changed to run in a container and this issue no longer occurs.

Changed in starlingx:
status: Triaged → Fix Released