DC subcloud bootstrap failed - nginx-ingress-controller apply failed
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | Medium | Gerry Kopec |
Bug Description
Brief Description
-----------------
Subcloud Bootstrap failed with 50ms delay on the network, due to the following:
fatal: [subcloud41]: FAILED! => {"attempts": 30, "changed": true, "cmd": "source /etc/platform/
Severity
--------
Major
Steps to Reproduce
------------------
Add 25 subclouds at the same time with 50ms delay on the i/f used for bootstrap
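The report does not say how the 50ms delay was introduced. One common way to add a fixed delay on Linux is the `netem` queueing discipline via `tc`. The sketch below is only an illustration under that assumption: the interface name `eth0` and the helper name `netem_delay_cmd` are hypothetical, and the helper only prints the command so it can be reviewed before being run as root.

```shell
#!/bin/sh
# Print (do not run) the tc command that would add a fixed egress delay.
# Both arguments are placeholders: the interface used for bootstrap is
# not named in this report.
netem_delay_cmd() {
    iface="$1"
    delay="$2"
    printf 'tc qdisc add dev %s root netem delay %s\n' "$iface" "$delay"
}

netem_delay_cmd eth0 50ms
# To remove the delay afterwards (as root): tc qdisc del dev eth0 root
```

Printing rather than executing keeps the sketch safe to run on any host; the printed line can be pasted into a root shell on the node that carries the bootstrap traffic.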
Expected Behavior
------------------
Bootstrap to be successful
Actual Behavior
----------------
Bootstrap failed for subcloud41 (1 out of 25 subclouds tried at the same time)
Reproducibility
---------------
intermittent
System Configuration
-------
Duplex-with-worker system controller; one-node system for the subcloud
Branch/Pull Time/Commit
-------
2020-06-11_20-00-00
Last Pass
---------
N/A
Timestamp/Logs
--------------
2020-06-16 19:01:37.453 3659026 Subcloud41 add started
Test Activity
-------------
System Test
From subcloud41:
system application-list
+--------------------------+---------+-----------------------------------+----------------------------------------+--------------+------------------------------------------+
| application              | version | manifest name                     | manifest file                          | status       | progress                                 |
+--------------------------+---------+-----------------------------------+----------------------------------------+--------------+------------------------------------------+
| nginx-ingress-controller | 1.0-0   | nginx-ingress-controller-manifest | nginx_ingress_controller_manifest.yaml | apply-failed | operation aborted, check logs for detail |
+--------------------------+---------+-----------------------------------+----------------------------------------+--------------+------------------------------------------+
[sysadmin@controller-0 ~(keystone_admin)]$ kubectl get pods -A
NAMESPACE     NAME                                                READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-5cd4695574-gq9wb            1/1     Running   1          25m
kube-system   calico-node-kc7n9                                   1/1     Running   0          16m
kube-system   coredns-7fc965fbd7-dv8gp                            1/1     Running   0          25m
kube-system   ic-nginx-ingress-controller-xcwfn                   1/1     Running   0          17m
kube-system   ic-nginx-ingress-default-backend-5ffcfd7744-hgdbj   1/1     Running   0          17m
kube-system   kube-apiserver-controller-0                         1/1     Running   0          25m
kube-system   kube-controller-manager-controller-0                1/1     Running   1          25m
kube-system   kube-multus-ds-amd64-x5gwt                          1/1     Running   0          15m
kube-system   kube-proxy-67kbc                                    1/1     Running   0          25m
kube-system   kube-scheduler-controller-0                         1/1     Running   1          25m
kube-system   kube-sriov-cni-ds-amd64-p7p9n                       1/1     Running   0          15m
kube-system   tiller-deploy-5c8dd9fb56-zpnpq                      1/1     Running   0          24m
Error msg from /var/log/armada/nginx-ingress-controller-apply_2020-06-16-19-32-12.log
get_release_status /usr/local/lib/python3.6/dist-packages/armada/handlers/tiller.py:547
2020-06-16 19:33:07.861 46 ERROR armada.handlers.armada [-] Chart deploy [nginx-ingress] failed: armada.exceptions.tiller_exceptions.ReleaseException: Failed to Install release: ic-nginx-ingress - Tiller Message: b'Release "ic-nginx-ingress" failed: etcdserver: request timed out'
2020-06-16 19:33:07.861 46 ERROR armada.handlers.armada Traceback (most recent call last):
2020-06-16 19:33:07.861 46 ERROR armada.handlers.armada   File "/usr/local/lib/python3.6/dist-packages/armada/handlers/tiller.py", line 473, in install_release
2020-06-16 19:33:07.861 46 ERROR armada.handlers.armada     metadata=self.metadata)
2020-06-16 19:33:07.861 46 ERROR armada.handlers.armada   File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 533, in __call...
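To check whether other apply attempts hit the same etcd timeout, the Armada apply logs can be searched for the Tiller message quoted above. This is a minimal sketch, assuming the logs sit under /var/log/armada/ with the file naming seen in this report; the function name `count_etcd_timeouts` is illustrative only.

```shell
#!/bin/sh
# Count the lines in one log file that contain the etcd timeout message.
count_etcd_timeouts() {
    grep -c 'etcdserver: request timed out' "$1"
}

# Report the count for every nginx-ingress-controller apply log, if any exist.
# When the glob matches nothing, the literal pattern fails the -f test
# and the loop body is skipped.
for log in /var/log/armada/nginx-ingress-controller-apply_*.log; do
    [ -f "$log" ] || continue
    printf '%s: %s\n' "$log" "$(count_etcd_timeouts "$log")"
done
```

A non-zero count in several logs would suggest the timeout is tied to load on etcd during the parallel bootstrap rather than to a one-off event on subcloud41.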