Comment 1 for bug 2013800

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/877724
Committed: https://opendev.org/starlingx/config/commit/8f02a3cf7bc61956f7245ce02ed0c280ca07a75c
Submitter: "Zuul (22348)"
Branch: master

commit 8f02a3cf7bc61956f7245ce02ed0c280ca07a75c
Author: Dan Voiculeasa <email address hidden>
Date: Fri Mar 17 02:32:46 2023 +0200

    Defer certificate install during app downloading images

    It is observed that if the docker registry is restarted while it is
    in use (e.g. while an app is downloading images), it wrongly reports
    some images as successfully downloaded when they are not. No error
    is returned to the docker API client, so the failure is silently
    hidden.
    By "docker registry in use" we mean that an image push to the
    registry is in progress.
    Because the failed push is hidden, the failure surfaces only later,
    when the components needing the images fail.

    This behavior was observed in one particular case: a system upgrade.
    The docker registry restart is caused by a manifest that is run [1].

    Defer the logic for installing the certificate (files and manifest).
    Implement file deferral, which this requires.
    The condition for deferral is the presence of apps whose images
    will be downloaded by the framework as part of the restore/upgrade
    procedure.

    Note: outside the scope of this work, it seems that deferrals are
    forgotten and not retried after a sysinv-conductor restart.

    Tests:
    PASS: Deploy AIO-DX SystemController DC,
          Deploy AIO-SX Subcloud DC,
          Deploy AIO-SX
    PASS: Observe the new log entries for both deferred and instant
          config type config_update_file filter_mapping ...
          config type config_apply_runtime_manifest filter_mapping ...
          config type ... False (wait)
          config type ... True (continue)
    PASS: Applied a docker certificate and observed the manifest and
          files updated instantly, with no app in 'restore-requested' or
          'applying' state
    PASS: Changed one app state to 'restore-requested' and 'applying',
          also alternating between them. Applied a docker certificate
          and observed the manifest and files are deferred until the app
          is moved out of these 2 states.
          Observed the manifest applied after the wait is indeed the one
          restarting the docker registry

    [1]: https://opendev.org/starlingx/config/src/commit/c937f46ecee2802473d786ab8c0addddb9039abc/sysinv/sysinv/sysinv/sysinv/conductor/manager.py#L13449-L13453
    Closes-Bug: 2013800
    Signed-off-by: Dan Voiculeasa <email address hidden>
    Change-Id: Ie0e5d6cee625335431d73114d28edade4cf6663c
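
The deferral condition described in the commit message above can be illustrated with a minimal Python sketch. This is not the sysinv implementation: `BLOCKING_STATES`, `should_defer`, and `DeferredQueue` are hypothetical names introduced for illustration, and only the two app states ('restore-requested' and 'applying') come from the test notes.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

# App states named in the test notes; the constant name is hypothetical.
BLOCKING_STATES = {"restore-requested", "applying"}


def should_defer(app_states: Iterable[str]) -> bool:
    """Return True while any app is in a state that blocks the install."""
    return any(state in BLOCKING_STATES for state in app_states)


@dataclass
class DeferredQueue:
    """Hypothetical in-memory queue of deferred config actions."""
    pending: List[Callable[[], None]] = field(default_factory=list)

    def submit(self, action: Callable[[], None],
               app_states: Iterable[str]) -> bool:
        """Run the action now, or queue it. Returns True if it ran."""
        if should_defer(app_states):
            self.pending.append(action)
            return False
        action()
        return True

    def retry(self, app_states: Iterable[str]) -> int:
        """Run queued actions in order once nothing blocks them.

        Returns the number of actions executed.
        """
        if should_defer(app_states):
            return 0
        ran = 0
        while self.pending:
            self.pending.pop(0)()
            ran += 1
        return ran
```

Because the queue in this sketch lives only in memory, a process restart would drop any pending actions, which mirrors the limitation noted in the commit message.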