platform-integ-apps is being re-applied after locking standby controller

Bug #1846056 reported by Yang Liu
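+-----------+--------------+------------+-------------+-----------+
| Affects   | Status       | Importance | Assigned to | Milestone |
+-----------+--------------+------------+-------------+-----------+
| StarlingX | Fix Released | Medium     | Bob Church  |           |
+-----------+--------------+------------+-------------+-----------+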

Bug Description

Brief Description
-----------------
platform-integ-apps is automatically re-applied after the standby controller is locked or unlocked.
This delays completion of the host-lock operation by 1-2 minutes.

Severity
--------
Major

Steps to Reproduce
------------------
- Run: system host-lock <standby_controller>
- Wait for the controller to be locked and online
- Run: system application-list, and observe the status of platform-integ-apps
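
The observation step can be scripted. The following is a minimal Python sketch, not part of the original report, that parses the table printed by system application-list and returns the status of platform-integ-apps. The helper names (parse_application_list, app_status) are hypothetical, and the sample text is the table captured under Timestamp/Logs in this report.

```python
def parse_application_list(table_text):
    """Parse the ASCII table printed by `system application-list` into dicts."""
    header = None
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +----+ border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first data line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def app_status(table_text, app_name="platform-integ-apps"):
    """Return the 'status' column for app_name, or None if it is not listed."""
    for row in parse_application_list(table_text):
        if row.get("application") == app_name:
            return row.get("status")
    return None


# Sample output copied from this report (captured right after host-lock).
SAMPLE = """\
+---------------------+---------+-------------------------------+---------------+----------+--------------------------+
| application         | version | manifest name                 | manifest file | status   | progress                 |
+---------------------+---------+-------------------------------+---------------+----------+--------------------------+
| platform-integ-apps | 1.0-8   | platform-integration-manifest | manifest.yaml | applying | retrieving docker images |
+---------------------+---------+-------------------------------+---------------+----------+--------------------------+
"""

if __name__ == "__main__":
    print(app_status(SAMPLE))
```

On a live system the text could be obtained by running system application-list through subprocess.run with capture_output=True; a status of "applying" shortly after the lock would indicate the behavior reported here.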

Expected Behavior
------------------
- The standby controller is locked and platform-integ-apps is not re-applied

Actual Behavior
----------------
- The standby controller is locked
- platform-integ-apps starts re-applying immediately afterwards
- system host-unlock <standby_controller> is rejected until the platform app has finished applying

Reproducibility
---------------
100% Reproducible

System Configuration
--------------------
Multi-node system
Lab-name: wcp71-75, wcp63-66

Branch/Pull Time/Commit
-----------------------
stx master as of 2019-09-29

Last Pass
---------
master 2019-09-24

Timestamp/Logs
--------------
# host-lock
2019-09-30T16:04:03.000 controller-0 -sh: info HISTORY: PID=50664 UID=42425 system host-lock controller-1

# app started reapplying
+---------------------+---------+-------------------------------+---------------+----------+--------------------------+
| application         | version | manifest name                 | manifest file | status   | progress                 |
+---------------------+---------+-------------------------------+---------------+----------+--------------------------+
| platform-integ-apps | 1.0-8   | platform-integration-manifest | manifest.yaml | applying | retrieving docker images |
+---------------------+---------+-------------------------------+---------------+----------+--------------------------+
Mon Sep 30 16:04:33 UTC 2019

# app reapplied
+---------------------+---------+-------------------------------+---------------+---------+-----------+
| application         | version | manifest name                 | manifest file | status  | progress  |
+---------------------+---------+-------------------------------+---------------+---------+-----------+
| platform-integ-apps | 1.0-8   | platform-integration-manifest | manifest.yaml | applied | completed |
+---------------------+---------+-------------------------------+---------------+---------+-----------+
Mon Sep 30 16:05:33 UTC 2019
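
The timestamps above can be turned into elapsed times to check the reported 1-2 minute delay. This is a small Python sketch under one assumption: that the shell history timestamp of the host-lock command is in UTC, like the two date outputs. The values are observation times, so they only bound when the re-apply started and finished.

```python
from datetime import datetime

# Timestamp formats as they appear in the log excerpts of this report.
HISTORY_FMT = "%Y-%m-%dT%H:%M:%S.%f"
DATE_FMT = "%a %b %d %H:%M:%S UTC %Y"

# Assumption: the history timestamp is UTC, matching the `date` output.
lock_issued = datetime.strptime("2019-09-30T16:04:03.000", HISTORY_FMT)
applying_seen = datetime.strptime("Mon Sep 30 16:04:33 UTC 2019", DATE_FMT)
applied_seen = datetime.strptime("Mon Sep 30 16:05:33 UTC 2019", DATE_FMT)


def seconds_between(start, end):
    """Whole seconds elapsed from start to end."""
    return int((end - start).total_seconds())


if __name__ == "__main__":
    print("lock -> applying observed:", seconds_between(lock_issued, applying_seen), "s")
    print("lock -> applied observed: ", seconds_between(lock_issued, applied_seen), "s")
```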

Test Activity
-------------
Sanity

Yang Liu (yliu12)
summary: - platform-integ-apps is being re-applied upon system host-lock
+ platform-integ-apps is being re-applied after locking standby controller
Frank Miller (sensfan22)
Changed in starlingx:
assignee: nobody → Bob Church (rchurch)
Ghada Khalil (gkhalil) wrote :

Marking as stx.3.0 / medium priority - operation optimization

Changed in starlingx:
status: New → Triaged
importance: Undecided → Medium
tags: added: stx.3.0 stx.containers
Yang Liu (yliu12)
tags: added: stx.retestneeded
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/687229

Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/687229
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=6edc4821bdfe040f8988ac1696b31e0af83c77f6
Submitter: Zuul
Branch: master

commit 6edc4821bdfe040f8988ac1696b31e0af83c77f6
Author: Robert Church <email address hidden>
Date: Tue Oct 8 03:36:01 2019 -0400

    Prevent system managed apps from re-applying on host lock

    The platform app was re-applying on host-lock of controller-1. This was
    blocking additional application actions until the application was
    reapplied. With duplex controller installs, typically a controller will
    be locked for a maintenance action and then unlocked. There is no need
    to reapply the platform application to adjust the replica count for this
    scenario.

    This change will only evaluate if the application needs to be reapplied
    once VIM services are enabled after unlock. This also allows manual
    application applies to update the platform apps while one of the duplex
    controllers is locked.

    Test Scenario #1:
    - controller-1 lock:
      - no override eval
      - no app re-apply
    - controller-1 unlock:
      - override eval for platform-integ-app after VIM reports services
        enabled
      - no apply needed as the replica count remained 2

    Test Scenario #2:
    - controller-1 lock:
      - no override eval
      - no app re-apply
    - manual apply of platform-integ-app
      - reapplied: 2 provisioners (one pending) -> 1 provisioner
    - controller-1 unlock:
      - override eval for platform-integ-app after VIM reports services
        enabled
      - re-apply flag generated: replicas change: 1->2
      - k8s audit reapplies app after 60s

    Change-Id: I60fdde2c739f96efb6a5916cc59a1cba43ca5b32
    Closes-Bug: #1846056
    Signed-off-by: Robert Church <email address hidden>
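
The behavior described in the commit message can be modeled as a small state machine. The sketch below is illustrative only and is not the code from the referenced change: every name in it (PlatformAppState, desired_replicas, the on_* handlers) is hypothetical, and the rule "one provisioner replica per unlocked controller" is an assumption inferred from the two test scenarios.

```python
from dataclasses import dataclass


@dataclass
class PlatformAppState:
    applied_replicas: int          # replica count used by the last apply
    reapply_pending: bool = False  # flag later picked up by the periodic audit


def desired_replicas(unlocked_controllers):
    # Assumption: one provisioner replica per unlocked controller, at least 1.
    return max(1, unlocked_controllers)


def on_host_lock(state):
    """After the fix: a lock triggers no override evaluation and no re-apply."""
    return state


def on_manual_apply(state, unlocked_controllers):
    """A manual apply is allowed while one controller is locked."""
    state.applied_replicas = desired_replicas(unlocked_controllers)
    state.reapply_pending = False
    return state


def on_vim_services_enabled(state, unlocked_controllers):
    """Overrides are evaluated only once VIM reports services enabled after unlock."""
    if desired_replicas(unlocked_controllers) != state.applied_replicas:
        state.reapply_pending = True
    return state


def on_audit(state, unlocked_controllers):
    """Periodic audit: re-apply only if the flag was raised."""
    if state.reapply_pending:
        state = on_manual_apply(state, unlocked_controllers)
    return state


if __name__ == "__main__":
    # Test Scenario #1: lock, then unlock; replica count stays 2, no re-apply.
    s = on_vim_services_enabled(on_host_lock(PlatformAppState(2)), 2)
    print("scenario 1 reapply pending:", s.reapply_pending)

    # Test Scenario #2: lock, manual apply (2 -> 1), unlock (1 -> 2), audit.
    s = on_manual_apply(on_host_lock(PlatformAppState(2)), 1)
    s = on_vim_services_enabled(s, 2)
    print("scenario 2 reapply pending:", s.reapply_pending)
    s = on_audit(s, 2)
    print("scenario 2 replicas after audit:", s.applied_replicas)
```

The key design point taken from the commit message is that the lock handler does nothing, so the evaluation cost is paid only on unlock and only when the replica count actually differs.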

Changed in starlingx:
status: In Progress → Fix Released
Boris Shteinbock (bshteinb) wrote :

# Testing Status
PASSED

# Configuration
standard system:
2 controllers
2 computes

# Load Tested
2019-09-11_00-10-00

1. The standby controller was locked.
2. The application list was requested.
3. platform-integ-apps was confirmed not to be re-applied, and the lock completed:

[sysadmin@controller-1 ~(keystone_admin)]$ system host-list
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname     | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1  | controller-0 | controller  | unlocked       | enabled     | available    |
| 2  | controller-1 | controller  | unlocked       | enabled     | available    |
| 4  | compute-0    | worker      | unlocked       | enabled     | available    |
| 5  | compute-1    | worker      | unlocked       | enabled     | available    |
+----+--------------+-------------+----------------+-------------+--------------+
[sysadmin@controller-1 ~(keystone_admin)]$ system host-lock controller-0
+---------------------+-----------------------------------------------------------------+
| Property            | Value                                                           |
+---------------------+-----------------------------------------------------------------+
| action              | none                                                            |
| administrative      | unlocked                                                        |
| availability        | available                                                       |
| bm_ip               | None                                                            |
| bm_type             | None                                                            |
| bm_username         | None                                                            |
| boot_device         | /dev/disk/by-path/pci-0000:04:00.0-sas-0x5000c500761985d5-lun-0 |
| capabilities        | {u'stor_function': u'monitor'}                                  |
| config_applied      | 3d02805e-5b82-4d10-ae65-bcd72dc017f9                            |
| config_status       | None                                                            |
| config_target       | 3d02805e-5b82-4d10-ae65-bcd72dc017f9                            |
| console             | tty0                                                            |
| created_at          | 2019-09-25T18:09:38.153052+00:00                                |
| hostname            | controller-0                                                    |
| id                  | 1                                                               |
| install_output      | text                                                            |
| install_state       | None                                                            |
| install_state_info  | None                                                            |
| inv_state           | inventoried                                                     |
| invprovision        | provisioned                                                     |
| location | {} ...

Yang Liu (yliu12)
tags: removed: stx.retestneeded