Downloaded (prestage) images should be removed from the local registry post archive for stx6 subcloud

Bug #2042797 reported by Tee Ngo
This bug affects 1 person
Affects: StarlingX
Status: In Progress
Importance: Undecided
Assigned to: Tee Ngo
Milestone: (none)

Bug Description

Brief Description
-----------------
Upgrading an stx6 simplex subcloud that was previously prestaged with stx8 data can fail because there is insufficient space to back up the 22.12 container images.

Severity
--------
Major

Steps to Reproduce
------------------
- Bring up a DC system with stx6
- Upgrade the system controller to stx8
- Upload a prestage images list consisting of stx8 platform images using the dcmanager subcloud-deploy upload --prestage-images command
- Prestage an stx6 simplex subcloud using the dcmanager subcloud prestage command
- Upgrade the stx6 simplex subcloud

Expected Behavior
------------------
Upgrade completes successfully.

Actual Behavior
----------------
The upgrade failed at the "starting upgrade" step because there was not enough space in /opt/platform-backup. The stx6 backup playbook treated the stx8 platform images as "user" images, so it attempted to generate an images backup and ran out of space. The backup playbook should not produce any images backup during a simplex upgrade, especially when the subcloud is already prestaged.
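
One way to picture the misclassification, as a minimal sketch only: if a backup step defines "user" images as everything in the local registry that is not on the running release's platform image list, then prestaged stx8 images on an stx6 subcloud all land in the "user" set. The function name and the image names below are hypothetical; this is not the stx6 playbook's actual logic.

```python
def classify_user_images(registry_images, platform_images):
    """Return registry images that are not on the platform image list."""
    return sorted(set(registry_images) - set(platform_images))


# Hypothetical image names, for illustration only.
stx6_platform = ["platform/app-a:stx6", "platform/app-b:stx6"]
registry = stx6_platform + ["platform/app-a:stx8", "platform/app-b:stx8"]

# The prestaged stx8 images are not on the stx6 platform list, so a
# backup built this way would try to archive all of them.
user_images = classify_user_images(registry, stx6_platform)
print(user_images)
```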

Reproducibility
---------------
This issue is reproducible only when the /opt/platform-backup partition on the subcloud does not have enough space to accommodate a backup of the images in the prestage list.
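
The condition can be checked ahead of an upgrade with something like the following sketch. It uses Python's standard shutil.disk_usage; the helper names, the 10% headroom, and the way the required size is estimated are assumptions for illustration and are not part of the StarlingX tooling.

```python
import shutil


def free_bytes_at(path):
    """Free bytes on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free


def fits(free_bytes, required_bytes, headroom=0.10):
    """True if free space covers the required size plus a safety margin."""
    return free_bytes >= required_bytes * (1 + headroom)


if __name__ == "__main__":
    # On a subcloud the path of interest would be /opt/platform-backup;
    # "." is used here so the sketch runs anywhere.
    estimated_backup_bytes = 5 * 1024**3  # hypothetical estimate of 5 GiB
    print(fits(free_bytes_at("."), estimated_backup_bytes))
```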

System Configuration
--------------------
Distributed Cloud

Branch/Pull Time/Commit
-----------------------
Nov 1st, 2023 master load

Last Pass
---------
Simplex upgrades have passed many times where /opt/platform-backup had plentiful space.

Timestamp/Logs
--------------
The failure can be easily induced.

Test Activity
-------------
Feature Testing

Workaround
----------
Either manually clean up /opt/platform-backup to make space, or remove the stx8 images from the subcloud local registry before retrying the upgrade.

Tee Ngo (teewrs)
description: updated
Changed in starlingx:
assignee: nobody → Tee Ngo (teewrs)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ansible-playbooks (master)
Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/c/starlingx/ansible-playbooks/+/900150
Committed: https://opendev.org/starlingx/ansible-playbooks/commit/44a8556df1ec725b6e6b50997d79b2150e112e35
Submitter: "Zuul (22348)"
Branch: master

commit 44a8556df1ec725b6e6b50997d79b2150e112e35
Author: Tee Ngo <email address hidden>
Date: Sun Nov 5 22:20:17 2023 -0500

    Purge downloaded images from local registry post prestage (stx6)

    To improve simplex platform upgrade performance ([1]), images
    prestage now produces an archive in a more optimized format than
    in CentOS releases. The prestage images are downloaded from
    configured sources and pushed to the subcloud local registry.
    Subsequently, a snapshot of the local registry is taken and saved
    in the prestage directory. This logic applies to both Debian
    and CentOS subclouds as part of [2].

    In this commit, the logic is slightly updated for stx6 subclouds
    (CentOS). Any downloaded images for prestage will be purged from
    the local registry after the registry snapshot has been taken.
    This change prevents the stx6 backup playbook from trying to
    generate another images backup from the downloaded images over
    stx6 to stx8 upgrade.

    Backup playbook update to avoid generating images backup over
    upgrade when there is already a prestaged registry archive will be
    addressed in a separate commit.

    [1] https://storyboard.openstack.org/#!/story/2010798
    [2] https://storyboard.openstack.org/#!/story/2010611

    Test Plan:
      1. Verify successful subcloud prestage for stx6 release with a
         prestage image list. Images are downloaded from configured
         source(s) and image bundles are generated and saved in the
         prestage directory.
      2. Repeat test case #1 without an image list. If the subcloud is
         running stx6, image bundles are generated from the content of
         the subcloud local registry. If the subcloud is running stx8,
         images prestage is skipped.
      3. Verify successful subcloud prestage for stx8 release with a
         prestage image list. Images are downloaded from configured
         source(s) and pushed to the local registry. Subsequently, the
         local registry snapshot is taken and saved in prestage directory.
         Verify that
            a) on subcloud running stx6, downloaded images are removed
               from the subcloud local registry.
            b) on subcloud running stx8 and higher, downloaded images
               are not removed from the subcloud local registry.
      4. Repeat test case #3 without an image list. If the subcloud is
         running stx6, images prestage is skipped. If the subcloud is
         running stx8, a snapshot of the subcloud local registry is
         taken and saved in the prestage directory.
      5. Verify successful subcloud upgrade from stx6 to stx8 post
         subcloud prestage.
      6. Upload a prestage images list containing a non-existent image,
         verify that downloaded images are purged from stx6 subcloud
         local registry post prestage failure.
      7. Upload a prestage image list containing all i...
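
The purge behaviour described in the commit message and in test cases 3 and 6 can be sketched as follows. This is a toy model, not the ansible-playbooks implementation: the registry is a plain Python set, a name starting with "missing/" stands in for a failed download, and the mapping of stx6 to software version 21.12 is an assumption stated here for illustration.

```python
# Assumption: an stx6 subcloud reports software version 21.12.
PURGE_RELEASES = {"21.12"}


def images_to_purge(before, after, subcloud_release):
    """Images added by prestage that should be removed afterwards."""
    if subcloud_release not in PURGE_RELEASES:
        return []  # stx8 and higher keep the downloaded images
    return sorted(set(after) - set(before))


def prestage_images(registry, wanted, subcloud_release, snapshot):
    """Download images, snapshot the registry, then purge on stx6."""
    before = set(registry)
    try:
        for image in wanted:
            if image.startswith("missing/"):  # stand-in for a failed pull
                raise RuntimeError("download failed: " + image)
            registry.add(image)
        snapshot(sorted(registry))  # archive includes the new images
    finally:
        # Runs on success and on failure, so a failed prestage also
        # leaves no downloaded images behind on stx6.
        for image in images_to_purge(before, registry, subcloud_release):
            registry.discard(image)


registry = {"app:stx6"}
archives = []
prestage_images(registry, ["app:stx8"], "21.12", archives.append)
print(sorted(registry), archives)
```

Taking the snapshot before the purge is what lets the archive carry the stx8 images while the stx6 registry returns to its earlier contents.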

