hello-kitty application apply failed after controller swact
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
StarlingX | Fix Released | High | Angie Wang |
Bug Description
Brief Description
-----------------
About 5 minutes after application-apply was issued for the hello-kitty application, its status ended as "apply-failed".
Severity
--------
Major
Steps to Reproduce
------------------
Upload hello-kitty helm charts
Apply hello-kitty
TC-name: z_containers/
Expected Behavior
------------------
Actual Behavior
----------------
Reproducibility
---------------
Seen once
System Configuration
-------
Two node system
Lab-name:IP_5-6
Branch/Pull Time/Commit
-------
stx master as of 20190724T013000Z
Last Pass
---------
Lab: IP_5_6
Load: 20190721T233000Z
Timestamp/Logs
--------------
[2019-07-24 15:09:38,191] 301 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-07-24 15:09:40,708] 423 DEBUG MainThread ssh.expect :: Output:
+------
| application | version | manifest name | manifest file | status | progress |
+------
| hello-kitty | 1.0 | hello-kitty | manifest.yaml | uploaded | completed |
| platform-integ-apps | 1.0-7 | platform-
| stx-openstack | 1.0-17-
| | versioned | | | | changed from 'applying' to 'apply-failed'. |
| | | | | | |
[2019-07-24 15:08:57,889] 301 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-07-24 15:09:40,813] 301 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
Test Activity
-------------
Sanity
tags: added: stx.containers
summary:
- hello-kitty application apply failed
+ hello-kitty application apply failed after controller swact
The root cause is that the tiller pod was not ready after host-swact.
The hello-kitty app apply was triggered after the swact to controller-1, but it looks like it took about 20 minutes to bring up the tiller pod after the swact.
2019-07-24T14:58:46.000 system host swact from controller-0 to controller-1
2019-07-24 15:00:33.057 447459 sysinv-conductor starts on controller-1
2019-07-24 15:09:43.091 447308 Armada apply command = /bin/bash -c 'set -o pipefail; armada apply --debug /manifests/hello-kitty/1.0/hello-kitty-manifest.yaml --values /overrides/hello-kitty/1.0/default-kitty-1.yaml --tiller-host tiller-deploy.kube-system.svc.cluster.local | tee /logs/hello-kitty-apply.log'
From the following tiller log, we can see that tiller only became ready at 15:22:32:
{"log":"[main] 2019/07/24 15:22:32 Starting Tiller v2.13.1 (tls=false)\n","stream":"stderr","time":"2019-07-24T15:22:32.08652286Z"}
{"log":"[main] 2019/07/24 15:22:32 GRPC listening on :44134\n","stream":"stderr","time":"2019-07-24T15:22:32.086573828Z"}
{"log":"[main] 2019/07/24 15:22:32 Probes listening on :44135\n","stream":"stderr","time":"2019-07-24T15:22:32.086585783Z"}
{"log":"[main] 2019/07/24 15:22:32 Storage driver is ConfigMap\n","stream":"stderr","time":"2019-07-24T15:22:32.086593709Z"}
{"log":"[main] 2019/07/24 15:22:32 Max history per release is 0\n","stream":"stderr","time":"2019-07-24T15:22:32.086601303Z"}
{"log":"[storage] 2019/07/24 15:22:36 listing all releases with filter\n","stream":"stderr","time":"2019-07-24T15:22:36.126542029Z"}
...
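To make the timing gap explicit, here is a minimal Python sketch that parses the first Tiller log record and compares it with the swact and Armada apply timestamps quoted above. It assumes all three timestamps are in UTC (only the Tiller record is explicitly marked "Z"), and the constant and function names are illustrative only.

```python
import json
from datetime import datetime, timezone

# Timestamps copied from the sysinv log excerpts above.
# Assumption: they are in UTC, like the Tiller container log.
SWACT = datetime(2019, 7, 24, 14, 58, 46, tzinfo=timezone.utc)
ARMADA_APPLY = datetime(2019, 7, 24, 15, 9, 43, tzinfo=timezone.utc)

# First Tiller record, in the JSON-per-line container log format shown above.
TILLER_LINE = (
    '{"log":"[main] 2019/07/24 15:22:32 Starting Tiller v2.13.1 (tls=false)\\n",'
    '"stream":"stderr","time":"2019-07-24T15:22:32.08652286Z"}'
)


def tiller_start_time(line: str) -> datetime:
    """Return the record's timestamp, truncated to whole seconds."""
    record = json.loads(line)
    # Drop the fractional seconds and trailing 'Z'; second precision is enough here.
    stamp = record["time"].split(".")[0]
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)


tiller_ready = tiller_start_time(TILLER_LINE)
print("swact -> Tiller start:       ", tiller_ready - SWACT)
print("armada apply -> Tiller start:", tiller_ready - ARMADA_APPLY)
```

Under the UTC assumption this prints 0:23:46 and 0:12:49: Tiller started almost 24 minutes after the swact, and the Armada apply was issued nearly 13 minutes before Tiller was up.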
We had a similar tiller issue that was fixed a long time ago: https://review.opendev.org/#/c/657087/
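Since the failure comes from applying before Tiller is ready, one way a test or caller could guard against it is to poll a readiness check with a timeout before issuing the apply. The following is only a sketch of that idea in Python, not the change made in the review above; `wait_until` and the probe passed to it are hypothetical names, and how readiness is actually checked is left to the caller.

```python
import time
from typing import Callable


def wait_until(
    probe: Callable[[], bool],
    timeout_s: float,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll probe() until it returns True or timeout_s has elapsed.

    Returns True if the probe succeeded, False on timeout.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Hypothetical usage: tiller_is_ready is a caller-supplied check, for example
# one that inspects the tiller pod status; it is not defined in this report.
#
# if not wait_until(tiller_is_ready, timeout_s=30 * 60):
#     raise RuntimeError("tiller pod not ready after swact; skipping apply")
```

The 30-minute timeout in the usage comment is an arbitrary value chosen to exceed the delay of roughly 24 minutes seen in this report.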