Patch orchestration failed on SX subcloud with oidc and stx-monitor applied - host not unlocked
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | Don Penney |
Bug Description
Brief Description
-----------------
With the oidc and stx-monitor apps applied on a Distributed Cloud (DC) system, patch orchestration was used to apply a large patch across the DC. The patch apply failed on one SX subcloud, and that subcloud's host was left locked.
Severity
--------
Major
Steps to Reproduce
------------------
Apply the oidc and stx-monitor apps on the DC system
Create a patch strategy to apply a large patch to the system
Apply the strategy
Expected Behavior
------------------
Patching succeeds on all subclouds
Actual Behavior
----------------
Patching failed on one SX subcloud (subcloud6)
Reproducibility
---------------
Unknown - first time this is seen in sanity, will monitor
System Configuration
-------
DC system
Lab-name: WCP_80-91
Branch/Pull Time/Commit
-------
2020-04-29_20-00-00
Last Pass
---------
2020-03-29_16-39-59
Timestamp/Logs
--------------
[sysadmin@
+------
| application | version | manifest name | manifest file | status | progress |
+------
| cert-manager | 1.0-0 | cert-manager-
| nginx-ingress-
| | | | yaml | | |
| | | | | | |
| oidc-auth-apps | 1.0-0 | oidc-auth-manifest | manifest.yaml | uploaded | completed |
| platform-integ-apps | 1.0-8 | platform-
| stx-monitor | 1.0-1 | analytics-
+------
[sysadmin@
+------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+------
| 400.003 | Evaluation license key will expire on 30-sep-2020; there are 152 days remaining | host=controller-1 | minor | 2020-05-01T16: |
| | in this evaluation | | | 43:53.094570 |
| | | | | |
| 400.003 | Evaluation license key will expire on 30-sep-2020; there are 152 days remaining | host=controller-0 | minor | 2020-05-01T16: |
| | in this evaluation | | | 43:49.649907 |
| | | | | |
| 500.101 | Developer patch certificate is enabled | host=controller | critical | 2020-05-01T00: |
| | | | | 06:02.038877 |
| | | | | |
+------
[sysadmin@
+----+-
| id | name | management | availability | deploy status | sync |
+----+-
| 2 | subcloud6 | managed | online | complete | in-sync |
| 4 | subcloud4 | managed | online | complete | in-sync |
| 7 | subcloud7 | managed | online | complete | in-sync |
+----+-
[sysadmin@
2020-04-
[sysadmin@
2020-04-
[sysadmin@
+------
| Field | Value |
+------
| subcloud apply type | parallel |
| max parallel subclouds | 10 |
| stop on failure | False |
| state | initial |
| created_at | 2020-05-
| updated_at | None |
+------
[sysadmin@
+------
| Field | Value |
+------
| subcloud apply type | parallel |
| max parallel subclouds | 10 |
| stop on failure | False |
| state | applying |
| created_at | 2020-05-
| updated_at | 2020-05-
+------
[sysadmin@
+------
| cloud | stage | state | details | started_at | finished_at |
+------
| SystemController | 1 | creating strategy | | 2020-05-02 14:10:28.837943 | None |
| subcloud6 | 2 | initial | | None | None |
| subcloud4 | 2 | initial | | None | None |
| subcloud7 | 2 | initial | | None | None |
+------
[sysadmin@
+------
| cloud | stage | state | details | started_at | finished_at |
+------
| SystemController | 1 | complete | | 2020-05-02 14:10:28.837943 | 2020-05-02 14:55:48.468191 |
| subcloud6 | 2 | failed | Strategy apply failed for subcloud6 - unexpected state abort-failed | 2020-05-02 14:55:58.477515 | 2020-05-02 15:30:22.119729 |
| subcloud4 | 2 | complete | | 2020-05-02 14:55:58.484945 | 2020-05-02 15:40:17.458608 |
| subcloud7 | 2 | complete | | 2020-05-02 14:55:58.495617 | 2020-05-02 15:25:26.728835 |
+------
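For triage, the failed step can be pulled out of the strategy step table above with a few lines of Python. This is only a minimal sketch: the function name failed_steps and the SAMPLE constant are hypothetical, the table borders are shortened the same way as in this report, and it assumes the output keeps the pipe-delimited layout with "cloud", "state" and "details" columns shown above.

```python
def failed_steps(table_text):
    """Return (cloud, details) pairs for rows whose state column is 'failed'."""
    failures = []
    header = None
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip table borders, shell prompts and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first pipe-delimited row holds the column names
            continue
        row = dict(zip(header, cells))
        if row.get("state") == "failed":
            failures.append((row.get("cloud", ""), row.get("details", "")))
    return failures


# Rows copied from the strategy step output in this report.
SAMPLE = """
+------
| cloud | stage | state | details | started_at | finished_at |
+------
| SystemController | 1 | complete | | 2020-05-02 14:10:28.837943 | 2020-05-02 14:55:48.468191 |
| subcloud6 | 2 | failed | Strategy apply failed for subcloud6 - unexpected state abort-failed | 2020-05-02 14:55:58.477515 | 2020-05-02 15:30:22.119729 |
| subcloud4 | 2 | complete | | 2020-05-02 14:55:58.484945 | 2020-05-02 15:40:17.458608 |
| subcloud7 | 2 | complete | | 2020-05-02 14:55:58.495617 | 2020-05-02 15:25:26.728835 |
+------
"""

for cloud, details in failed_steps(SAMPLE):
    print(f"{cloud}: {details}")
```

The same function could be fed captured CLI output saved to a file, provided the column names match.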
[sysadmin@
Subcloud6:
[sysadmin@
+------
| Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+------
| 400.001 | Service group controller-services failure; dnsmasq(
| | | controller. | | 43:38.264746 |
| | | service_group= | | |
| | | controller-
| | | host=controller-0 | | |
| | | | | |
| 400.002 | Service group controller-services has no active members available; expected 1 | service_domain= | critical | 2020-05-02T15: |
| | active member | controller. | | 03:27.053786 |
| | | service_group= | | |
| | | controller-services | | |
| | | | | |
| 200.001 | controller-0 was administratively locked to take it out-of-service. | host=controller-0 | warning | 2020-05-02T14: |
| | | | | 59:09.530587 |
| | | | | |
| 400.003 | Evaluation license key will expire on 30-sep-2020; there are 151 days remaining | host=controller-0 | minor | 2020-05-02T00: |
| | in this evaluation | | | 59:13.512835 |
| | | | | |
| 500.101 | Developer patch certificate is enabled | host=controller | critical | 2020-05-01T00: |
| | | | | 11:40.719420 |
| | | | | |
+------
[sysadmin@
+----+-
| id | hostname | personality | administrative | operational | availability |
+----+-
| 1 | controller-0 | controller | locked | disabled | online |
Test Activity
-------------
Regression Testing
tags: added: stx.retestneeded
tags: added: stx.up
tags: added: stx.4.0 stx.distcloud stx.update; removed: stx.up
Changed in starlingx:
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Bart Wensley (bartwensley)
description: updated
Log added: /files.starlingx.kube.cengn.ca/launchpad/1876500
https:/