Distributed Cloud: Delete and re-add subcloud failed at bootstrap after initial configuration failure on controller-0
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | Jessica Castelino |
Bug Description
Brief Description
-----------------
Initially, the "dcmanager subcloud add subcloud4" command failed on the subcloud because the ceph-cluster backend was missing. After removing the subcloud from the DC system, I attempted to re-add it. The replay failed early while bootstrapping the subcloud, with the error message below:
failed: [subcloud4] (item={
PLAY RECAP *******
subcloud4 : ok=147 changed=41 unreachable=0 failed=1
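When triaging a replay like this one, the recap line can be checked mechanically. The following is a minimal sketch, not part of the StarlingX tooling; the function names `parse_recap_line` and `host_failed` are hypothetical, and it assumes the recap host line has the `host : key=value ...` shape shown above.

```python
import re

# Matches one host line of an Ansible PLAY RECAP, e.g.
# "subcloud4 : ok=147 changed=41 unreachable=0 failed=1"
RECAP_RE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap_line(line):
    """Return (host, counters) for a recap host line, or None if it is not one."""
    match = RECAP_RE.match(line.strip())
    if match is None:
        return None
    counters = {}
    for pair in match.group("counters").split():
        key, value = pair.split("=")
        counters[key] = int(value)
    return match.group("host"), counters


def host_failed(counters):
    """A host counts as failed if any task failed or the host was unreachable."""
    return counters.get("failed", 0) > 0 or counters.get("unreachable", 0) > 0


host, counters = parse_recap_line(
    "subcloud4 : ok=147 changed=41 unreachable=0 failed=1"
)
print(host, host_failed(counters))
```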
Preliminary assessment from Tao Lui:
During the first deployment, the mgmt/cluster interfaces had already been reconfigured prior to unlock (they are no longer on lo).
The bootstrap replay failed when it tried to remove the cluster IP from the lo interface.
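The assessment suggests the removal step is not idempotent: it assumes the cluster IP is still on lo. The sketch below illustrates a guarded removal that first checks whether the address is present. It is only an illustration under stated assumptions and is not the fix that was released; the helper names are hypothetical, and it relies on the one-line output format of `ip -o addr show dev <iface>`.

```python
import subprocess


def addresses_on_interface(ip_output):
    """Extract bare addresses from `ip -o addr show dev <iface>` output."""
    found = set()
    for line in ip_output.splitlines():
        fields = line.split()
        # One-line format: "<idx>: <iface> <family> <addr>/<prefix> ..."
        if len(fields) >= 4 and fields[2] in ("inet", "inet6"):
            found.add(fields[3].split("/")[0])
    return found


def remove_address_if_present(iface, cidr, run=subprocess.run):
    """Delete cidr from iface only when it is currently assigned there.

    Returns True if a delete was issued, False if there was nothing to do.
    The `run` parameter exists so the function can be exercised without root.
    """
    shown = run(
        ["ip", "-o", "addr", "show", "dev", iface],
        capture_output=True, text=True, check=True,
    )
    if cidr.split("/")[0] not in addresses_on_interface(shown.stdout):
        return False  # address already moved off iface, so a replay can continue
    run(["ip", "addr", "del", cidr, "dev", iface], check=True)
    return True
```

With a guard like this, a replay on a node whose cluster IP has already moved off lo would skip the delete instead of failing. A call would look like `remove_address_if_present("lo", "192.0.2.10/24")`, where the address is a documentation-range placeholder.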
Severity
--------
Major
Steps to Reproduce
------------------
1) Set up a DC System Controller
2) Boot a subcloud active controller node
3) Add the subcloud to the DC system: "dcmanager subcloud add subcloud4 ...."
4) The subcloud fails at controller-0 configuration because the ceph-cluster backend is missing
5) Delete the failed subcloud from the DC system (dcmanager subcloud delete subcloud4)
6) Re-add the subcloud with the ceph-cluster backend (dcmanager subcloud add subcloud4 ....)
7) The replay fails early in bootstrapping with the above error message
TC-name:
Expected Behavior
------------------
Subcloud added to DC system successfully on replay
Actual Behavior
----------------
Subcloud add failed early on bootstrapping
Reproducibility
---------------
Tested once
System Configuration
-------
DC system
Lab-name: wcp_80-91
subcloud4: wcp_85_86
Branch/Pull Time/Commit
-------
2020-02-24_20-23-53
Last Pass
---------
unknown
Timestamp/Logs
--------------
2020-02-25-18-21-02
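+----+-----------+------------+--------------+------------------+---------+
| id | name      | management | availability | deploy status    | sync    |
+----+-----------+------------+--------------+------------------+---------+
| 1  | subcloud1 | unmanaged  | online       | complete         | unknown |
| 2  | subcloud5 | managed    | online       | complete         | in-sync |
| 4  | subcloud4 | unmanaged  | offline      | bootstrap-failed | unknown |
+----+-----------+------------+--------------+------------------+---------+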
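A listing like the one above can be filtered for subclouds stuck in a failed bootstrap. This is a minimal sketch that assumes the pipe-delimited layout shown in this report; `parse_subcloud_rows` is a hypothetical helper, not a dcmanager API.

```python
def parse_subcloud_rows(table_text):
    """Turn a pipe-delimited table into a list of dicts keyed by header name."""
    rows = []
    header = None
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines such as "+----+-"
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first pipe row is the header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


listing = """
| id | name | management | availability | deploy status | sync |
| 1 | subcloud1 | unmanaged | online | complete | unknown |
| 2 | subcloud5 | managed | online | complete | in-sync |
| 4 | subcloud4 | unmanaged | offline | bootstrap-failed | unknown |
"""

failed = [
    row["name"]
    for row in parse_subcloud_rows(listing)
    if row["deploy status"] == "bootstrap-failed"
]
print(failed)
```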
summary: |
- Distributed Cloud: Replay on subcloud failed after initial deployment failure
+ Distributed Cloud: Delete and re-add subcloud failed at bootstrap after initial deployment failure
summary: |
- Distributed Cloud: Delete and re-add subcloud failed at bootstrap after initial deployment failure
+ Distributed Cloud: Delete and re-add subcloud failed at bootstrap after initial configuration failure on controller-0
description: | updated |
description: | updated |
Changed in starlingx: | |
assignee: | nobody → Tee Ngo (teewrs) |
Changed in starlingx: | |
assignee: | Tee Ngo (teewrs) → Jessica Castelino (jcasteli) |
Logs: files.starlingx.kube.cengn.ca/launchpad/1864756
https:/