Containers: Reapply from controller-1 after swact hung on generating overrides for 10+ minutes due to RPC response timeout
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
StarlingX | Fix Released | High | Al Bailey | |
Bug Description
Brief Description
-----------------
After a swact to controller-1, reapplying stx-openstack without any chart changes hangs on "Generating application overrides" for 10+ minutes, with an RPC response timeout error in the sysinv log.
- The application does eventually apply successfully.
- This issue does not happen on subsequent reapplies from controller-1.
- This issue does not happen when reapplying after swacting to controller-0.
Severity
--------
Minor
Steps to Reproduce
------------------
- Install and configure the system, then deploy the stx-openstack application.
- Swact the active controller to controller-1.
- Run "system application-apply stx-openstack" from controller-1 after the swact.
Expected Behavior
------------------
- Reapply succeeds very quickly (normally within 1 minute)
Actual Behavior
----------------
- Reapply gets stuck at generating application overrides for about 10 minutes, and eventually completes
Reproducibility
---------------
Reproducible
System Configuration
-------
Multi-node system
Branch/Pull Time/Commit
-------
f/stein as of 2019-02-25
Timestamp/Logs
--------------
[2019-02-26 18:59:55,310] 262 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-02-26 19:02:14,727] 262 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
[2019-02-26 19:17:47,162] 262 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://
+------
| application | manifest name | manifest file | status | progress |
+------
| stx-openstack | armada-manifest | manifest-
+------
controller-1:~$
sysinv.log:
2019-02-26 19:02:16.477 178122 INFO sysinv.
2019-02-26 19:02:17.007 178122 INFO sysinv.helm.neutron [req-6f73c478-
2019-02-26 19:04:00.408 3710 INFO sysinv.
2019-02-26 19:04:00.417 3710 INFO sysinv.
2019-02-26 19:05:00.478 3710 ERROR sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
2019-02-26 19:05:00.478 3710 TRACE sysinv.
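As a rough check on the timings in the two log excerpts above, the sketch below computes the elapsed time between selected timestamps. The helper names `parse_ts` and `elapsed_seconds` are hypothetical and used only for illustration. Because the logged commands are truncated, treating the second and third `ssh.send` entries as the bounds of the slow reapply is an assumption. Likewise, reading the roughly 60-second gap before the first sysinv ERROR line as one RPC response timeout period is an inference (60 seconds is a common default for that setting), not something the log confirms.

```python
from datetime import datetime


def parse_ts(text: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM:SS,mmm' or 'YYYY-MM-DD HH:MM:SS.mmm'."""
    # The test-client log uses a comma before the milliseconds,
    # sysinv.log uses a dot; normalize to a dot before parsing.
    return datetime.strptime(text.replace(",", "."), "%Y-%m-%d %H:%M:%S.%f")


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    return (parse_ts(end) - parse_ts(start)).total_seconds()


# Test-client log: second and third 'ssh.send' entries
# (assumed to bracket the slow reapply).
client_gap = elapsed_seconds("2019-02-26 19:02:14,727", "2019-02-26 19:17:47,162")

# sysinv.log: last INFO line before the error, then the first ERROR line.
sysinv_gap = elapsed_seconds("2019-02-26 19:04:00.417", "2019-02-26 19:05:00.478")

print(f"client gap: {client_gap / 60:.1f} min")
print(f"sysinv gap: {sysinv_gap:.3f} s")
```

If the assumptions hold, the client-side gap is consistent with the "10+ minutes" in the description, and the sysinv gap suggests at least one full timeout period elapsed before the error was logged.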
tags: added: stx.2.0; removed: stx.2019.05
tags: added: stx.retestneeded
Changed in starlingx: assignee: Chris Friesen (cbf123) → Al Bailey (albailey1974)
Marking as release gating; low priority given that the operation completes and is only slow right after a swact operation.