Containers: charts were re-downloaded and re-applied when reapplying stx-openstack from controller-1 for the first time

Bug #1816173 reported by Yang Liu
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Angie Wang

Bug Description

Brief Description
-----------------
When the stx-openstack application was reapplied from controller-1 for the first time, the charts were re-downloaded and re-applied.

Severity
--------
Minor

Steps to Reproduce
------------------
1. Install and configure stx platform
2. Deploy stx-openstack application from controller-0
3. Swact active controller to controller-1
4. Reapply stx-openstack application from controller-1

Expected Behavior
------------------
4. The re-apply completes quickly (in less than 1 minute), since no changes were made to the helm charts.

Expected sysinv logs look like this:
2019-02-14 15:50:38.518 4849 INFO sysinv.conductor.kube_app [-] All docker images for application stx-openstack were successfully downloaded in 0 seconds
2019-02-14 15:50:50.098 4849 INFO sysinv.conductor.kube_app [-] Application manifest /manifests/stx-openstack-manifest-no-tests.yaml was successfully applied/re-applied.

Actual Behavior
----------------
4. The helm charts appear to be re-downloaded and re-applied.

sysinv.log:

2019-02-15 16:42:36.798 238118 INFO sysinv.conductor.kube_app [-] Generating application overrides...
2019-02-15 16:42:43.014 238118 INFO sysinv.conductor.kube_app [-] Application overrides generated.
2019-02-15 16:43:33.788 238118 INFO sysinv.conductor.kube_app [-] All docker images for application stx-openstack were successfully downloaded in 50 seconds
# The charts were then processed, which took more than a few minutes. I'm not sure how long it would have taken in total, since a standby controller lock/unlock was performed after 6 minutes of chart processing, and the application eventually failed to apply (which is not the subject of this bug).

Reproducibility
---------------
Reproducible
I saw this issue twice: once on a regular system and once on a storage system.

System Configuration
--------------------
Multi-node system, Dedicated storage

Branch/Pull Time/Commit
-----------------------
f/stein as of 2019-02-13

Timestamp/Logs
--------------
2019-02-15 16:42:36.798 238118 INFO sysinv.conductor.kube_app [-] Generating application overrides...
2019-02-15 16:42:43.014 238118 INFO sysinv.conductor.kube_app [-] Application overrides generated.
2019-02-15 16:43:33.788 238118 INFO sysinv.conductor.kube_app [-] All docker images for application stx-openstack were successfully downloaded in 50 seconds
2019-02-15 16:43:34.794 238118 INFO sysinv.conductor.kube_app [-] Starting progress monitoring thread for app stx-openstack
2019-02-15 16:44:30.154 238118 INFO sysinv.conductor.kube_app [-] processing chart: osh-kube-system-ingress, overall completion: 5.0%
2019-02-15 16:45:21.168 238118 INFO sysinv.conductor.kube_app [-] processing chart: osh-openstack-ingress, overall completion: 10.0%
2019-02-15 16:45:33.670 238118 INFO sysinv.conductor.kube_app [-] processing chart: osh-openstack-rbd-provisioner, overall completion: 14.0%
2019-02-15 16:45:48.423 238118 INFO sysinv.conductor.kube_app [-] processing chart: osh-openstack-ceph-pools-audit, overall completion: 19.0%
2019-02-15 16:45:54.117 238118 INFO sysinv.conductor.kube_app [-] processing chart: osh-openstack-mariadb, overall completion: 24.0%
2019-02-15 16:46:51.953 238118 INFO sysinv.conductor.kube_app [-] processing chart: osh-openstack-garbd, overall completion: 29.0%
2019-02-15 16:48:06.782 238118 INFO sysinv.conductor.kube_app [-] processing chart: osh-openstack-memcached, overall completion: 33.0%
2019-02-15 16:48:26.114 238118 INFO sysinv.conductor.kube_app [-] processing chart: osh-openstack-rabbitmq, overall completion: 38.0%
2019-02-15 16:49:30.858 238118 INFO sysinv.conductor.kube_app [-] processing chart: osh-openstack-keystone, overall completion: 43.0%
...
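
The gaps between consecutive "processing chart" lines in the excerpt above can be measured to show how far this run is from the expected sub-minute reapply. The following is a minimal sketch, not part of sysinv: the function name gaps_between_charts and the regular expression are assumptions based only on the log format shown here, and the gap between two lines is not necessarily the time spent on a single chart.

```python
import re
from datetime import datetime

# Matches the sysinv "processing chart" lines as they appear in this report.
CHART_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"processing chart: (?P<chart>[\w-]+), overall completion: (?P<pct>[\d.]+)%"
)


def gaps_between_charts(log_text):
    """Return [(chart, seconds until the next 'processing chart' line)]."""
    events = []
    for line in log_text.splitlines():
        match = CHART_LINE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            events.append((match.group("chart"), ts))
    # Pair each event with the one that follows it; the last chart has no gap.
    return [
        (chart, (next_ts - ts).total_seconds())
        for (chart, ts), (_, next_ts) in zip(events, events[1:])
    ]


# Three lines copied from the log excerpt in this report.
SAMPLE = """\
2019-02-15 16:44:30.154 238118 INFO sysinv.conductor.kube_app [-] processing chart: osh-kube-system-ingress, overall completion: 5.0%
2019-02-15 16:45:21.168 238118 INFO sysinv.conductor.kube_app [-] processing chart: osh-openstack-ingress, overall completion: 10.0%
2019-02-15 16:45:33.670 238118 INFO sysinv.conductor.kube_app [-] processing chart: osh-openstack-rbd-provisioner, overall completion: 14.0%
"""

if __name__ == "__main__":
    for chart, seconds in gaps_between_charts(SAMPLE):
        print(f"{chart}: {seconds:.1f}s until next chart")
```

On the three sample lines, the first gap alone is about 51 seconds, which already approaches the one-minute budget stated under Expected Behavior.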

Ghada Khalil (gkhalil) wrote :

Marking as release gating; related to containers.

Changed in starlingx:
importance: Undecided → Medium
assignee: nobody → Tee Ngo (teewrs)
tags: added: stx.2019.05 stx.containers

Ghada Khalil (gkhalil) wrote :

Assigning to Tee to do the initial triage

Changed in starlingx:
status: New → Triaged
Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: Tee Ngo (teewrs) → Angie Wang (angiewang)

OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-config (master)

Fix proposed to branch: master
Review: https://review.openstack.org/638739

Changed in starlingx:
status: Triaged → In Progress

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-config (master)

Reviewed: https://review.openstack.org/638739
Committed: https://git.openstack.org/cgit/openstack/stx-config/commit/?id=cb4b30bf56195456ac6b8bd11abf7e23f90f81a4
Submitter: Zuul
Branch: master

commit cb4b30bf56195456ac6b8bd11abf7e23f90f81a4
Author: Angie Wang <email address hidden>
Date: Fri Feb 22 01:21:07 2019 -0500

    Solve the stx-openstack reapply issue on controller-1

    After stx-openstack is applied, a reapply shouldn't trigger
    reinstallation of the charts if no chart overrides have changed.
    However, reinstallation happens after swacting the active
    controller to controller-1 because the images overrides generated
    on controller-1 differ from those generated before. Generating the
    images overrides requires walking through the stx-openstack charts
    stored under /scratch, but the charts do not exist on controller-1's
    /scratch as it's an unreplicated filesystem. This causes the images
    overrides to differ between controller-1 and controller-0.

    This commit walks through the charts and collects their images
    during application-upload, then saves the images list for each
    chart into the existing images file under the armada directory
    /opt/platform/armada. The images file is then used to retrieve
    the images for charts when generating images overrides.

    Closes-Bug: 1816173
    Change-Id: I4f00c3031decb063f8f126d0c837acd4dde56fc3
    Signed-off-by: Angie Wang <email address hidden>
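
The approach described in the commit message above can be illustrated with a small sketch. This is not the stx-config implementation: the function names, the JSON file format, the values.json stand-in for a chart's values file, the images.tags layout, and the example image name are all assumptions chosen so the example runs with only the standard library. Temporary directories stand in for /scratch on each controller and for /opt/platform/armada.

```python
import json
import os
import tempfile


def collect_images(charts_dir):
    """Walk chart directories and return {chart_name: {tag: image}}.

    Simplification: each chart keeps its values in values.json here;
    real Helm charts use values.yaml.
    """
    images = {}
    if not os.path.isdir(charts_dir):
        return images  # charts are absent on an unreplicated filesystem
    for chart in sorted(os.listdir(charts_dir)):
        values_path = os.path.join(charts_dir, chart, "values.json")
        if os.path.isfile(values_path):
            with open(values_path) as f:
                images[chart] = json.load(f).get("images", {}).get("tags", {})
    return images


def save_images_at_upload(charts_dir, images_file):
    """Upload step: persist the per-chart image list on replicated storage."""
    with open(images_file, "w") as f:
        json.dump(collect_images(charts_dir), f, sort_keys=True)


def image_overrides(images_file):
    """Apply step: build image overrides from the saved file only."""
    with open(images_file) as f:
        return json.load(f)


def demo():
    with tempfile.TemporaryDirectory() as root:
        scratch_c0 = os.path.join(root, "c0_scratch")  # /scratch on controller-0
        scratch_c1 = os.path.join(root, "c1_scratch")  # /scratch on controller-1, never populated
        shared = os.path.join(root, "armada")          # replicated armada directory
        os.makedirs(os.path.join(scratch_c0, "keystone"))
        os.makedirs(shared)
        # Hypothetical chart values with a made-up image reference.
        with open(os.path.join(scratch_c0, "keystone", "values.json"), "w") as f:
            json.dump({"images": {"tags": {"keystone_api": "example/keystone:1.0"}}}, f)

        images_file = os.path.join(shared, "images.json")
        save_images_at_upload(scratch_c0, images_file)

        # Before the fix: each controller walks its own scratch area.
        before_same = collect_images(scratch_c0) == collect_images(scratch_c1)
        # After the fix: either controller reads the same saved file.
        after_same = image_overrides(images_file) == collect_images(scratch_c0)
        return before_same, after_same


if __name__ == "__main__":
    print(demo())
```

The design point is that the input to override generation moves from a per-controller location to a file both controllers can read, so a reapply after a swact sees the same images list and has no reason to reinstall unchanged charts.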

Changed in starlingx:
status: In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-config (f/stein)

Reviewed: https://review.openstack.org/639129
Committed: https://git.openstack.org/cgit/openstack/stx-config/commit/?id=5e61519ac92822b959dffe63b76956cf0e9d0383
Submitter: Zuul
Branch: f/stein

commit 5e61519ac92822b959dffe63b76956cf0e9d0383
Author: Angie Wang <email address hidden>
Date: Fri Feb 22 01:21:07 2019 -0500

    Solve the stx-openstack reapply issue on controller-1

    After stx-openstack is applied, a reapply shouldn't trigger
    reinstallation of the charts if no chart overrides have changed.
    However, reinstallation happens after swacting the active
    controller to controller-1 because the images overrides generated
    on controller-1 differ from those generated before. Generating the
    images overrides requires walking through the stx-openstack charts
    stored under /scratch, but the charts do not exist on controller-1's
    /scratch as it's an unreplicated filesystem. This causes the images
    overrides to differ between controller-1 and controller-0.

    This commit walks through the charts and collects their images
    during application-upload, then saves the images list for each
    chart into the existing images file under the armada directory
    /opt/platform/armada. The images file is then used to retrieve
    the images for charts when generating images overrides.

    Closes-Bug: 1816173
    Change-Id: I4f00c3031decb063f8f126d0c837acd4dde56fc3
    Signed-off-by: Angie Wang <email address hidden>
    (cherry picked from commit cb4b30bf56195456ac6b8bd11abf7e23f90f81a4)

tags: added: in-f-stein

OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-config (f/stein)

Fix proposed to branch: f/stein
Review: https://review.openstack.org/639397

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-config (f/stein)

Reviewed: https://review.openstack.org/639397
Committed: https://git.openstack.org/cgit/openstack/stx-config/commit/?id=bf0aa2c78d2397073baa12c2efcf3bcf2cc9d84b
Submitter: Zuul
Branch: f/stein

commit 611a68a96ab915dc4e97d39dffa5c379bbffef3d
Author: Mingyuan Qi <email address hidden>
Date: Wed Jan 30 09:41:27 2019 +0800

    Allow user specified registries for config_controller

    Currently, docker images are pulled from public registries during
    config_controller. For some users, the connection to the public
    docker registry may be slow enough that installing the containerized
    services images times out, or the system simply does not have
    access to the public internet.

    This change allows users to specify alternative public/private
    registries to replace k8s.gcr.io, gcr.io, quay.io and docker.io.
    An insecure registry is supported if all default registries are
    replaced by one unified registry. This lowers the complexity for
    those who build their own registry without internet access.

    Docker doesn't support an IPv6 address as a registry name; a
    hostname or domain name in an IPv6 network is allowed instead.

    Test:
    AIO-SX/AIO-DX/Standard(2+2):
      Alternative public registry (ipv4/domain) with proxy
        - config_controller pass
      Private registry (ipv4/ipv6/domain) without internet
        - config_controller pass
      Default registry with/without proxy
        - config_controller pass

    Story: 2004711
    Task: 28742

    Change-Id: I4fee3f4e0637863b9b5ef4ef556082ac75f62a1d
    Signed-off-by: Mingyuan Qi <email address hidden>

commit cb4b30bf56195456ac6b8bd11abf7e23f90f81a4
Author: Angie Wang <email address hidden>
Date: Fri Feb 22 01:21:07 2019 -0500

    Solve the stx-openstack reapply issue on controller-1

    After stx-openstack is applied, a reapply shouldn't trigger
    reinstallation of the charts if no chart overrides have changed.
    However, reinstallation happens after swacting the active
    controller to controller-1 because the images overrides generated
    on controller-1 differ from those generated before. Generating the
    images overrides requires walking through the stx-openstack charts
    stored under /scratch, but the charts do not exist on controller-1's
    /scratch as it's an unreplicated filesystem. This causes the images
    overrides to differ between controller-1 and controller-0.

    This commit walks through the charts and collects their images
    during application-upload, then saves the images list for each
    chart into the existing images file under the armada directory
    /opt/platform/armada. The images file is then used to retrieve
    the images for charts when generating images overrides.

    Closes-Bug: 1816173
    Change-Id: I4f00c3031decb063f8f126d0c837acd4dde56fc3
    Signed-off-by: Angie Wang <email address hidden>

commit a6934ac9d27e0357d0025018077441d989679409
Author: Bin Qian <email address hidden>
Date: Thu Feb 21 14:46:34 2019 -0500

    Boost sm process priority in VBox environment

    There is an instance that sm claimed its main thread ran sluggish
    as some crit...

OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-config (master)

Fix proposed to branch: master
Review: https://review.openstack.org/640464

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-config (master)

Reviewed: https://review.openstack.org/640464
Committed: https://git.openstack.org/cgit/openstack/stx-config/commit/?id=1b22b5313d0618792732066a8fe47460d8ef06de
Submitter: Zuul
Branch: master

commit 654c05df0e45aa47d18ce72e5ba003195872790f
Author: Al Bailey <email address hidden>
Date: Fri Feb 22 16:35:12 2019 -0600

    The --kubernetes flag no longer has an effect.

    kubernetes mode is always enabled, the flag cannot be used to
    enable or disable it.

    The option in the CLI will be removed completely once the wiki
    and any test tools are updated.

    The code that handles the "else" will also be updated in a
    later commit

    Story: 2004751
    Task: 29756
    Change-Id: I75a81ab852252ee108fefeca5682e5b1a9d7374e
    Signed-off-by: Al Bailey <email address hidden>

commit 03b08b9722e83597797de93abef54f787b93bab5
Author: Mingyuan Qi <email address hidden>
Date: Wed Jan 30 09:41:27 2019 +0800

    Allow user specified registries for config_controller

    Currently, docker images are pulled from public registries during
    config_controller. For some users, the connection to the public
    docker registry may be slow enough that installing the containerized
    services images times out, or the system simply does not have
    access to the public internet.

    This change allows users to specify alternative public/private
    registries to replace k8s.gcr.io, gcr.io, quay.io and docker.io.
    An insecure registry is supported if all default registries are
    replaced by one unified registry. This lowers the complexity for
    those who build their own registry without internet access.

    Docker doesn't support an IPv6 address as a registry name; a
    hostname or domain name in an IPv6 network is allowed instead.

    Test:
    AIO-SX/AIO-DX/Standard(2+2):
      Alternative public registry (ipv4/domain) with proxy
        - config_controller pass
      Private registry (ipv4/ipv6/domain) without internet
        - config_controller pass
      Default registry with/without proxy
        - config_controller pass

    Story: 2004711
    Task: 28742

    Change-Id: I4fee3f4e0637863b9b5ef4ef556082ac75f62a1d
    Signed-off-by: Mingyuan Qi <email address hidden>
    (cherry picked from commit 611a68a96ab915dc4e97d39dffa5c379bbffef3d)

commit 7471ef852b7c37c742ef273f0df6b8ccce3bd928
Author: Bin Qian <email address hidden>
Date: Thu Feb 21 14:46:34 2019 -0500

    Boost sm process priority in VBox environment

    There was an instance where sm reported that its main thread ran
    sluggishly because some critical timers ran behind their scheduled timing.
    The issue could prevent sm from scheduling services.
    As a result, the controller could fail to enable.

    The issue was found only on vbox labs on AIO-SX, the fix is to boost
    sm process priority to nice value -10 from current -2.

    Closes-Bug: 1816764
    Depends-On: https://review.openstack.org/638664
    Change-Id: Iafa17b1c47d65cc7394552ea1c8e7a78398e4869
    Signed-off-by: Bin Qian <email address hidden>
    (cherry picked from commit a6934ac9d27e0357d0025018077441d989679409)

commit 5e61519ac92822b959dffe63b76956cf0...

Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05