Subcloud ISO path needs to be cleaned up, if it exists, prior to remote install

Bug #1977557 reported by Gabriel Silva Trevisan
Affects      Status         Importance   Assigned to              Milestone
StarlingX    Fix Released   Low          Gabriel Silva Trevisan   -

Bug Description

Brief Description
-----------------
Subcloud remote install fails if files already exist under the subcloud's ISO path. This happens when a previous remote install of the subcloud was abruptly terminated by an uncontrolled swact or a process restart, leaving the files behind.

Severity
--------
Minor

Steps to Reproduce
------------------
1. While a subcloud is being installed, either restart dcmanager-manager or swact the controllers.
2. Delete the failed subcloud and re-add it.

Expected Behavior
------------------
The subcloud can be re-installed.

Actual Behavior
----------------
The subcloud fails to re-install because subcloud files already exist under the ISO path.

Reproducibility
---------------
100% reproducible

System Configuration
--------------------
Distributed Cloud

Branch/Pull Time/Commit
-----------------------
2022-01-19 18:42:01

Last Pass
---------
Not aware of any previous testing for this scenario.

Timestamp/Logs
--------------
From system controller /var/log/dcmanager/dcmanager.log:
==============================================================
2022-05-18 17:53:53.162 3346704 ERROR dcmanager.manager.subcloud_manager Traceback (most recent call last):
2022-05-18 17:53:53.162 3346704 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python2.7/site-packages/dcmanager/manager/subcloud_manager.py", line 736, in run_deploy
2022-05-18 17:53:53.162 3346704 ERROR dcmanager.manager.subcloud_manager payload['install_values'])
2022-05-18 17:53:53.162 3346704 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python2.7/site-packages/dccommon/subcloud_install.py", line 418, in prep
2022-05-18 17:53:53.162 3346704 ERROR dcmanager.manager.subcloud_manager self.update_iso(override_path, iso_values)
2022-05-18 17:53:53.162 3346704 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python2.7/site-packages/dccommon/subcloud_install.py", line 375, in update_iso
2022-05-18 17:53:53.162 3346704 ERROR dcmanager.manager.subcloud_manager raise Exception(msg)
2022-05-18 17:53:53.162 3346704 ERROR dcmanager.manager.subcloud_manager Exception: Failed to update iso ['/usr/local/bin/gen-bootloader-iso.sh', '--input', '/opt/dc-vault/loads/22.06/bootimage.iso', '--www-root', '/opt/platform/iso/22.06' ...],
==============================================================

From system controller /var/log/user.log:
==============================================================
2022-05-18T17:53:53.000 controller-0 gen-bootloader-iso.sh[3473776]: notice Output dir already exists: /opt/platform/iso/22.06/nodes/subcloud3013
==============================================================
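
The user.log line above suggests that gen-bootloader-iso.sh refuses to run when the subcloud's output directory already exists. A minimal sketch of a pre-check for that condition follows. It assumes the layout <www-root>/nodes/<subcloud name> inferred from the logged path; the function name and signature are hypothetical and are not taken from dccommon.

```python
import os


def find_stale_iso_dir(www_root, subcloud_name):
    """Return the subcloud's ISO output path if it already exists, else None.

    Assumption: the output path is <www_root>/nodes/<subcloud_name>, as
    inferred from the "Output dir already exists" log line. This helper is
    illustrative only and is not part of the distcloud code base.
    """
    path = os.path.join(www_root, "nodes", subcloud_name)
    return path if os.path.exists(path) else None
```

For the failure logged above, calling find_stale_iso_dir("/opt/platform/iso/22.06", "subcloud3013") on the system controller would be expected to return the leftover path rather than None.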

Test Activity
-------------
Developer Testing

Workaround
----------
When a subcloud install fails for this reason, the files under the subcloud ISO path are cleaned up as part of the failure. Deleting and re-adding the subcloud a second time will therefore work.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to distcloud (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/distcloud/+/844629

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to distcloud (master)

Reviewed: https://review.opendev.org/c/starlingx/distcloud/+/844629
Committed: https://opendev.org/starlingx/distcloud/commit/388b82609185479bc7aae8cda70c0784d5a465a7
Submitter: "Zuul (22348)"
Branch: master

commit 388b82609185479bc7aae8cda70c0784d5a465a7
Author: Gabriel Silva Trevisan <email address hidden>
Date: Fri Jun 3 14:41:24 2022 -0300

    Clean up subcloud iso path on remote reinstall

    When remote installing a subcloud after an abrupt termination on a
    previous attempt (e.g.: uncontrolled swact, service restart), a failure
    might occur on gen-bootloader-iso script, due to a remaining ISO
    directory still being present for the subcloud.

    Clean up ISO path for the subcloud to make sure it is empty before
    calling the script.

    Test Plan:

    PASS:
    - Reinstall subcloud with an existing directory in ISO path and make
      sure installation removes it before proceeding.
    - Reinstall subcloud after previous attempt abruptly terminates due to
      service restart.
    - Reinstall subcloud after previous attempt abruptly terminates due to
      host swact.
    - Reinstall two subclouds in parallel after abrupt termination and
      ensure cleanups only delete their respective subcloud directories.

    Closes-bug: 1977557

    Signed-off-by: Gabriel Silva Trevisan <email address hidden>
    Change-Id: I504dd108d6654294a3bbe9747d6aee0a0cec97be
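
The cleanup described in the commit message can be sketched roughly as below. This is not the merged change itself (see the review link for that); it is an illustration that assumes the subcloud's ISO directory is <www-root>/nodes/<subcloud name>, as suggested by the logged path. The function name, its signature and the name validation are hypothetical.

```python
import os
import shutil


def cleanup_subcloud_iso_dir(www_root, subcloud_name):
    """Remove one subcloud's leftover ISO directory before regenerating it.

    Assumption: the directory is <www_root>/nodes/<subcloud_name>.
    Returns True if a directory was removed, False if nothing was there.
    """
    # Reject names that could resolve outside the subcloud's own directory
    # (empty name, path separators, "." or ".."), so that a cleanup can
    # never delete the shared "nodes" directory or another subcloud's files.
    if (not subcloud_name
            or subcloud_name in (".", "..")
            or os.path.basename(subcloud_name) != subcloud_name):
        raise ValueError("invalid subcloud name: %r" % (subcloud_name,))

    subcloud_dir = os.path.join(www_root, "nodes", subcloud_name)
    if os.path.isdir(subcloud_dir):
        shutil.rmtree(subcloud_dir)
        return True
    return False
```

Scoping the removal to a single subcloud directory mirrors the last item of the test plan: two subclouds reinstalled in parallel should each delete only their own directory.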

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Low
tags: added: stx.7.0 stx.distcloud
Changed in starlingx:
assignee: nobody → Gabriel Silva Trevisan (g-trevisan)