stx-openstack apply takes longer time when lock and unlock on standby controller

Bug #1834083 reported by Anujeyan Manokeran
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
StarlingX
Fix Released
Medium
Frank Miller

Bug Description

Brief Description
-----------------
     When standby controller locked and unlocked after host changes to available state application apply was taking more than 5 mins apply complete. This was observed in two different regular 2+2 system

2019-06-24 18:39:11,207] 423 DEBUG MainThread ssh.exec_cmd:: Executing command...
[2019-06-24 18:39:11,208] 268 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.102:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-show controller-1'
[2019-06-24 18:39:13,281] 387 DEBUG MainThread ssh.expect :: Output:
+---------------------+-----------------------------------------------------------------------+
| Property | Value |
+---------------------+-----------------------------------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | 128.224.64.92 |
| bm_type | bmc |
| bm_username | Administrator |
| boot_device | sda |
| capabilities | {u'stor_function': u'monitor', u'Personality': u'Controller-Standby'} |
| config_applied | 6f3cceb1-35a3-4c34-9a20-aae11221e35b |
| config_status | None |
| config_target | 6f3cceb1-35a3-4c34-9a20-aae11221e35b |
| console | ttyS0,115200 |
| created_at | 2019-06-24T14:34:28.008461+00:00 |
| hostname | controller-1 |
| id | 2 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | 192.168.204.104 |
| mgmt_mac | 2c:44:fd:83:b6:ed |
| operational | disabled |
| personality | controller |
| reserved | False |
| rootfs_device | sda |
| serialid | None |
| software_load | 19.01 |
| task | |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2019-06-24T18:39:06.100219+00:00 |
| uptime | 12697 |
| uuid | 99a714bf-0f57-4db2-b418-e9ec5095f274 |
| vim_progress_status | services-disabled |
+---------------------+-----------------------------------------------------------------------+
[sysadmin@controller-0 ~(keystone_admin)]$
[2019-06-24 18:39:13,282] 268 DEBUG MainThread ssh.send :: Send 'echo $?'

  INFO MainThread host_helper.unlock_hosts:: Unlocking ['controller-1']...
[2019-06-24 18:39:31,987] 1340 DEBUG MainThread ssh.get_active_controller:: Getting active controller client for hp380
[2019-06-24 18:39:31,987] 423 DEBUG MainThread ssh.exec_cmd:: Executing command...
[2019-06-24 18:39:31,987] 268 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.102:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-show controller-1'
[2019-06-24 18:39:33,946] 387 DEBUG MainThread ssh.expect :: Output:
+---------------------+-----------------------------------------------------------------------+
| Property | Value |
+---------------------+-----------------------------------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | 128.224.64.92 |
| bm_type | bmc |
| bm_username | Administrator |
| boot_device | sda |
| capabilities | {u'stor_function': u'monitor', u'Personality': u'Controller-Standby'} |
| config_applied | 6f3cceb1-35a3-4c34-9a20-aae11221e35b |
| config_status | None |
| config_target | 6f3cceb1-35a3-4c34-9a20-aae11221e35b |
| console | ttyS0,115200 |
| created_at | 2019-06-24T14:34:28.008461+00:00 |
| hostname | controller-1 |
| id | 2 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | 192.168.204.104 |
| mgmt_mac | 2c:44:fd:83:b6:ed |
| operational | disabled |
| personality | controller |
| reserved | False |
| rootfs_device | sda |
| serialid | None |
| software_load | 19.01 |
| task | |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2019-06-24T18:39:06.100219+00:00 |
| uptime | 12697 |
| uuid | 99a714bf-0f57-4db2-b418-e9ec5095f274 |
| vim_progress_status | services-disabled |
+---------------------+-----------------------------------------------------------------------+

  INFO MainThread host_helper.unlock_hosts:: Unlocking ['controller-1']...
[2019-06-24 18:39:31,987] 1340 DEBUG MainThread ssh.get_active_controller:: Getting active controller client for hp380
[2019-06-24 18:39:31,987] 423 DEBUG MainThread ssh.exec_cmd:: Executing command...
[2019-06-24 18:39:31,987] 268 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.102:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-show controller-1'
[2019-06-24 18:39:33,946] 387 DEBUG MainThread ssh.expect :: Output:
+---------------------+-----------------------------------------------------------------------+
| Property | Value |
+---------------------+-----------------------------------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | 128.224.64.92 |
| bm_type | bmc |
| bm_username | Administrator |
| boot_device | sda |
| capabilities | {u'stor_function': u'monitor', u'Personality': u'Controller-Standby'} |
| config_applied | 6f3cceb1-35a3-4c34-9a20-aae11221e35b |
| config_status | None |
| config_target | 6f3cceb1-35a3-4c34-9a20-aae11221e35b |
| console | ttyS0,115200 |
| created_at | 2019-06-24T14:34:28.008461+00:00 |
| hostname | controller-1 |
| id | 2 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | 192.168.204.104 |
| mgmt_mac | 2c:44:fd:83:b6:ed |
| operational | disabled |
| personality | controller |
| reserved | False |
| rootfs_device | sda |
| serialid | None |
| software_load | 19.01 |
| task | |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2019-06-24T18:39:06.100219+00:00 |
| uptime | 12697 |
| uuid | 99a714bf-0f57-4db2-b418-e9ec5095f274 |
| vim_progress_status | services-disabled |
+---------------------+-----------------------------------------------------------------------+

2019-06-24 18:48:17,898] 268 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.102:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-list'
[2019-06-24 18:48:19,927] 387 DEBUG MainThread ssh.expect :: Output:
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1 | controller-0 | controller | unlocked | enabled | available |
| 2 | controller-1 | controller | unlocked | enabled | available |
| 3 | compute-0 | worker | unlocked | enabled | available |
| 4 | compute-1 | worker | unlocked | enabled | available |
+----+--------------+-------------+----------------+-------------+--------------+

8.204.102:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne application-list'
[2019-06-24 18:48:15,612] 387 DEBUG MainThread ssh.expect :: Output:
+---------------------+--------------------------------+-------------------------------+--------------------+----------+---------------------------------------------------------------------+
| application | version | manifest name | manifest file | status | progress |
+---------------------+--------------------------------+-------------------------------+--------------------+----------+---------------------------------------------------------------------+
| platform-integ-apps | 1.0-7 | platform-integration-manifest | manifest.yaml | applied | completed |
| stx-openstack | 1.0-16-centos-stable-versioned | armada-manifest | stx-openstack.yaml | applying | processing chart: osh-kube-system-ingress, overall completion: 4.0% |
+---------------------+--------------------------------+-------------------------------+--------------------+----------+---------------------------------------------------------------------+

2019-06-24 18:51:17,056] 387 DEBUG MainThread ssh.expect :: Output:
+---------------------+--------------------------------+-------------------------------+--------------------+----------+---------------------------------------------------------------------+
| application | version | manifest name | manifest file | status | progress |
+---------------------+--------------------------------+-------------------------------+--------------------+----------+---------------------------------------------------------------------+
| platform-integ-apps | 1.0-7 | platform-integration-manifest | manifest.yaml | applied | completed |
| stx-openstack | 1.0-16-centos-stable-versioned | armada-manifest | stx-openstack.yaml | applying | processing chart: osh-kube-system-ingress, overall completion: 4.0% |
+---------------------+--------------------------------+-------------------------------+--------------------+----------+-------------------------------------------
2019-06-24 18:51:17,056] 387 DEBUG MainThread ssh.expect :: Output:
+---------------------+--------------------------------+-------------------------------+--------------------+----------+---------------------------------------------------------------------+
| application | version | manifest name | manifest file | status | progress |
+---------------------+--------------------------------+-------------------------------+--------------------+----------+---------------------------------------------------------------------+
| platform-integ-apps | 1.0-7 | platform-integration-manifest | manifest.yaml | applied | completed |
| stx-openstack | 1.0-16-centos-stable-versioned | armada-manifest | stx-openstack.yaml | applying | processing chart: osh-kube-system-ingress, overall completion: 4.0% |
+---------------------+--------------------------------+-------------------------------+--------------------+----------+-------------------------------------------
Sysinv logs time when complete
19-06-24 18:53:31.753 106249 INFO sysinv.api.controllers.v1.host [-] Provisioned storage node(s) []
2019-06-24 18:53:33.693 102406 INFO sysinv.conductor.kube_app [-] processing chart: osh-openstack-horizon, overall completion: 73.0%
2019-06-24 18:53:34.938 102406 INFO sysinv.conductor.kube_app [-] processing chart: osh-openstack-gnocchi, overall completion: 81.0%
2019-06-24 18:53:36.176 102406 INFO sysinv.conductor.kube_app [-] processing chart: osh-openstack-ceilometer, overall completion: 88.0%
2019-06-24 18:53:37.533 102406 INFO sysinv.conductor.kube_app [-] Application manifest /manifests/stx-openstack/1.0-16-centos-stable-versioned/stx-openstack-stx-openstack.yaml was successfully applied/re-applied.
2019-06-24 18:53:37.533 102406 INFO sysinv.conductor.kube_app [-] Exiting progress monitoring thread for app stx-openstack
2019-06-24 18:53:38.015 102406 INFO sysinv.conductor.kube_app [-] Application stx-openstack (1.0-16-centos-stable-versioned) apply completed.
2019-06-

Severity
--------
Major
Steps to Reproduce
------------------
1. Check initial state application was in applied state

2. lock and unlock standby controller
2019-06-24 18:39:11,207] 423 DEBUG MainThread ssh.exec_cmd:: Executing command...
[2019-06-24 18:39:11,208] 268 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.102:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-show controller-1'
[2019-06-24 18:39:13,281] 387 DEBUG MainThread ssh.expect :: Output:
+---------------------+-----------------------------------------------------------------------+
| Property | Value |
+---------------------+-----------------------------------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | 128.224.64.92 |
| bm_type | bmc |
| bm_username | Administrator |
| boot_device | sda |
| capabilities | {u'stor_function': u'monitor', u'Personality': u'Controller-Standby'} |
| config_applied | 6f3cceb1-35a3-4c34-9a20-aae11221e35b |
| config_status | None |
| config_target | 6f3cceb1-35a3-4c34-9a20-aae11221e35b |
| console | ttyS0,115200 |
| created_at | 2019-06-24T14:34:28.008461+00:00 |
| hostname | controller-1 |
| id | 2 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | 192.168.204.104 |
| mgmt_mac | 2c:44:fd:83:b6:ed |
| operational | disabled |
| personality | controller |
| reserved | False |
| rootfs_device | sda |
| serialid | None |
| software_load | 19.01 |
| task | |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2019-06-24T18:39:06.100219+00:00 |
| uptime | 12697 |
| uuid | 99a714bf-0f57-4db2-b418-e9ec5095f274 |
| vim_progress_status | services-disabled |
+---------------------+-----------------------------------------------------------------------+
[sysadmin@controller-0 ~(keystone_admin)]$
[2019-06-24 18:39:13,282] 268 DEBUG MainThread ssh.send :: Send 'echo $?'

  INFO MainThread host_helper.unlock_hosts:: Unlocking ['controller-1']...
[2019-06-24 18:39:31,987] 1340 DEBUG MainThread ssh.get_active_controller:: Getting active controller client for hp380
[2019-06-24 18:39:31,987] 423 DEBUG MainThread ssh.exec_cmd:: Executing command...
[2019-06-24 18:39:31,987] 268 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.102:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne host-show controller-1'
[2019-06-24 18:39:33,946] 387 DEBUG MainThread ssh.expect :: Output:
+---------------------+-----------------------------------------------------------------------+
| Property | Value |
+---------------------+-----------------------------------------------------------------------+
| action | none |
| administrative | locked |
| availability | online |
| bm_ip | 128.224.64.92 |
| bm_type | bmc |
| bm_username | Administrator |
| boot_device | sda |
| capabilities | {u'stor_function': u'monitor', u'Personality': u'Controller-Standby'} |
| config_applied | 6f3cceb1-35a3-4c34-9a20-aae11221e35b |
| config_status | None |
| config_target | 6f3cceb1-35a3-4c34-9a20-aae11221e35b |
| console | ttyS0,115200 |
| created_at | 2019-06-24T14:34:28.008461+00:00 |
| hostname | controller-1 |
| id | 2 |
| install_output | text |
| install_state | completed |
| install_state_info | None |
| invprovision | provisioned |
| location | {} |
| mgmt_ip | 192.168.204.104 |
| mgmt_mac | 2c:44:fd:83:b6:ed |
| operational | disabled |
| personality | controller |
| reserved | False |
| rootfs_device | sda |
| serialid | None |
| software_load | 19.01 |
| task | |
| tboot | false |
| ttys_dcd | None |
| updated_at | 2019-06-24T18:39:06.100219+00:00 |
| uptime | 12697 |
| uuid | 99a714bf-0f57-4db2-b418-e9ec5095f274 |
| vim_progress_status | services-disabled |
+---------------------+-----------------------------------------------------------------------+

3. As per description openstack application was not up for 5mins after standby controller is in available state

Expected Behavior
------------------
application is applied within 5mins
Actual Behavior
----------------
application apply took more than 5mins
Reproducibility
---------------
Reproducible
System Configuration
--------------------
regular 2+2 system

Branch/Pull Time/Commit
-----------------------
"20190623T233000Z
Last Pass
---------
Not sure
Timestamp/Logs
--------------
2019-06-24T18:39:06.100219+00:0
Test Activity
-------------
Regression test

Numan Waheed (nwaheed)
tags: added: stx.regression
tags: added: stx.retestneeded
Revision history for this message
Ghada Khalil (gkhalil) wrote :

How long did it take the application to apply after lock/unlock?
Given the bugs says it takes longer after lock/unlock, how long does the apply take initially on the same system?

Also a reminder to include the logs when opening bugs

description: updated
tags: added: stx.containers
Changed in starlingx:
status: New → Incomplete
assignee: nobody → Anujeyan Manokeran (anujeyan)
Revision history for this message
Anujeyan Manokeran (anujeyan) wrote :

This was reproduced in load 20190629T013000Z . It took 5mins 33 seconds to unlock standby enabled and available. After available it took around 7 mins for application for apply complete.

Revision history for this message
Ghada Khalil (gkhalil) wrote :

The next action here is for the technical primes for containers to determine if the recovery times noted above are acceptable or not

Changed in starlingx:
assignee: Anujeyan Manokeran (anujeyan) → Frank Miller (sensfan22)
importance: Undecided → High
status: Incomplete → Triaged
tags: added: stx.2.0
Revision history for this message
Ghada Khalil (gkhalil) wrote :

As suggested by Frank Miller, marking as stx.2.0 for now until the primes make a decision. If the above times are deemed acceptable, this bug can be closed.

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Based on further discussion with Brent Rowsell, given that the stx-openstack apply passes (even though it takes longer), this is medium priority vs high priority.

Changed in starlingx:
importance: High → Medium
Revision history for this message
Frank Miller (sensfan22) wrote :

Current plans for this scenario are to avoid the re-apply on a host unlock when no configuration or override changes have been made. This is being tracked via https://bugs.launchpad.net/starlingx/+bug/1837750. In addition, several services are now optional and are not enabled by default when stx-openstack is applied. This will reduce the tme required to apply stx-openstack. Based on these changes, no additional work is planned to reduce the time required to apply stx-openstack. Marking this LP as a duplicate of 1837750.

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Duplicate bug has been addressed in master & the r/stx.2.0 branch. Marking as "Fix Released" to match.

Changed in starlingx:
status: Triaged → Fix Released
tags: removed: stx.retestneeded
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.