System applications (cert-manager and platform-integ-apps) apply-failed after a failed attempt to lock host with vms

Bug #1881818 reported by Peng Peng
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Dan Voiculeasa
Milestone: (none)

Bug Description

Brief Description
-----------------
Five minutes after "system host-lock" was issued, the host was still in unlocked status. After the host-lock failed, two applications (cert-manager and platform-integ-apps) went into apply-failed status.
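
The five-minute wait described above can be sketched as a simple polling loop. This is only an illustration, not the code used by the sanity test: the function name wait_for_admin_state, its parameters, and the injected get_state callable (which would wrap something like "system host-show controller-0" and return the "administrative" value) are all hypothetical.

```python
import time


def wait_for_admin_state(get_state, expected="locked", timeout=300.0,
                         interval=10.0, sleep=time.sleep,
                         clock=time.monotonic):
    """Poll get_state() until it returns `expected` or `timeout` seconds pass.

    get_state is a caller-supplied (hypothetical) callable that returns the
    host's administrative state as a string, e.g. "locked" or "unlocked".
    Returns True if the expected state was reached, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if get_state() == expected:
            return True
        if clock() >= deadline:
            return False  # state never changed, as seen in this report
        sleep(interval)
```

In the failure reported here, such a loop would return False, because the host stayed unlocked for the whole wait.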

Severity
--------
Major

Steps to Reproduce
------------------
Lock a host that has VMs running on it (system host-lock controller-0)
Check the application status (system application-list)

TC-name: nova/test_lock_with_vms.py::TestLockWithVMs::test_lock_with_vms
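
The "check application-list" step can be automated by parsing the table that "system application-list" prints. The following is a minimal sketch under stated assumptions: the helper names parse_cli_table and failed_apps are hypothetical (they are not part of the StarlingX test framework), and SAMPLE_OUTPUT is an abridged, three-column copy of the output captured in this report.

```python
def parse_cli_table(output):
    """Parse a '+---+' bordered CLI table into a list of row dicts."""
    header = None
    rows = []
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, shell prompts, and log lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first '|' line holds the column names
        elif cells[0]:
            rows.append(dict(zip(header, cells)))
        # Rows with an empty first cell are wrapped continuations of the
        # previous row (e.g. "versioned"); this sketch ignores them.
    return rows


def failed_apps(output):
    """Return the names of applications whose status is 'apply-failed'."""
    return [row["application"] for row in parse_cli_table(output)
            if row["status"] == "apply-failed"]


# Abridged from the second application-list output in this report.
SAMPLE_OUTPUT = """\
+--------------------------+----------------------+--------------+
| application | version | status |
+--------------------------+----------------------+--------------+
| cert-manager | 1.0-1 | apply-failed |
| nginx-ingress-controller | 1.0-0 | applied |
| platform-integ-apps | 1.0-8 | apply-failed |
| stx-openstack | 1.0-1-centos-stable- | applied |
| | versioned | |
+--------------------------+----------------------+--------------+
controller-1:~$
"""

if __name__ == "__main__":
    print(failed_apps(SAMPLE_OUTPUT))
```

Splitting on "|" rather than on fixed column widths keeps the parser working even when the column padding has been collapsed, as it has in the logs below.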

Expected Behavior
------------------
host locked and applications are still in applied status

Actual Behavior
----------------
host-lock failed
cert-manager and platform-integ-apps went into apply-failed status

Reproducibility
---------------
Unknown - first time this is seen in sanity, will monitor

System Configuration
--------------------
Two node system

Lab-name: IP_5-6

Branch/Pull Time/Commit
-----------------------
2020-05-31_20-00-00

Last Pass
---------
unknown

Timestamp/Logs
--------------
[2020-06-02 15:38:54,754] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.1:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne application-list'
[2020-06-02 15:38:56,232] 436 DEBUG MainThread ssh.expect :: Output:
+--------------------------+-----------------------------+-----------------------------------+----------------------------------------+----------+-----------+
| application | version | manifest name | manifest file | status | progress |
+--------------------------+-----------------------------+-----------------------------------+----------------------------------------+----------+-----------+
| cert-manager | 1.0-1 | cert-manager-manifest | certmanager-manifest.yaml | applied | completed |
| nginx-ingress-controller | 1.0-0 | nginx-ingress-controller-manifest | nginx_ingress_controller_manifest.yaml | applied | completed |
| oidc-auth-apps | 1.0-0 | oidc-auth-manifest | manifest.yaml | uploaded | completed |
| platform-integ-apps | 1.0-8 | platform-integration-manifest | manifest.yaml | applied | completed |
| stx-openstack | 1.0-1-centos-stable- | armada-manifest | stx-openstack.yaml | applied | completed |
| | versioned | | | | |
| | | | | | |
+--------------------------+-----------------------------+-----------------------------------+----------------------------------------+----------+-----------+
controller-1:~$

[2020-06-02 15:49:06,877] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.1:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-lock controller-0'

[2020-06-02 15:53:52,817] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.1:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-show controller-0'
[2020-06-02 15:53:54,403] 436 DEBUG MainThread ssh.expect :: Output:
+-----------------------+-----------------------------------------------------------------------+
| Property | Value |
+-----------------------+-----------------------------------------------------------------------+
| action | none |
| administrative | unlocked |
| availability | available |

[2020-06-02 16:01:18,444] 314 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.1:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne application-list'
[2020-06-02 16:01:19,969] 436 DEBUG MainThread ssh.expect :: Output:
+--------------------------+-----------------------------+-----------------------------------+----------------------------------------+--------------+------------------------------------------+
| application | version | manifest name | manifest file | status | progress |
+--------------------------+-----------------------------+-----------------------------------+----------------------------------------+--------------+------------------------------------------+
| cert-manager | 1.0-1 | cert-manager-manifest | certmanager-manifest.yaml | apply-failed | operation aborted, check logs for detail |
| nginx-ingress-controller | 1.0-0 | nginx-ingress-controller-manifest | nginx_ingress_controller_manifest.yaml | applied | completed |
| oidc-auth-apps | 1.0-0 | oidc-auth-manifest | manifest.yaml | uploaded | completed |
| platform-integ-apps | 1.0-8 | platform-integration-manifest | manifest.yaml | apply-failed | operation aborted, check logs for detail |
| stx-openstack | 1.0-1-centos-stable- | armada-manifest | stx-openstack.yaml | applied | completed |
| | versioned | | | | |
| | | | | | |
+--------------------------+-----------------------------+-----------------------------------+----------------------------------------+--------------+------------------------------------------+
controller-1:~$
[2020-06-02 16:01:19,97

Test Activity
-------------
Sanity

Peng Peng (ppeng) wrote :
tags: added: stx.retestneeded
Frank Miller (sensfan22) wrote :

Triage of this issue showed that the platform-integ-apps and cert-manager applications failed to re-apply because dockerd could not restart the armada app (it was in a bad state). This is the same issue as reported in https://bugs.launchpad.net/starlingx/+bug/1877582

Ghada Khalil (gkhalil)
tags: added: stx.4.0 stx.apps stx.containers
Ghada Khalil (gkhalil) wrote :

Duplicate LP (1877582) has been addressed.
https://review.opendev.org/735374
Merged in stx master on 2020-06-15

Changed in starlingx:
importance: Undecided → High
status: New → Fix Released
assignee: nobody → Dan Voiculeasa (dvoicule)
Peng Peng (ppeng)
tags: removed: stx.retestneeded