Brief Description
-----------------
Upgrading a subcloud from an earlier version of StarlingX to the latest version (July 20, 2023) fails because the platform-backup partition cannot be found.
Severity
--------
Critical
Steps to Reproduce
------------------
Bring up a system controller with latest dev load (July 20, 2023)
Apply the commits: https://review.opendev.org/c/starlingx/metal/+/888327 and https://review.opendev.org/c/starlingx/config/+/888329
Import a starlingx-dev load (July 20, 2023)
Deploy a subcloud with an earlier load
Upgrade the subcloud using DC orchestration
Expected Behavior
-----------------
Subcloud upgrade to new load succeeds
Actual Behavior
---------------
The upgrade failed at the start-upgrade step with the following error:
sysinv 2023-07-27 17:19:29.524 83055 ERROR zerorpc.core [-] : subprocess.CalledProcessError: Command '['/usr/bin/validate-platform-backup.sh']' returned non-zero exit status 1.
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core Traceback (most recent call last):
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core   File "/usr/lib/python3/dist-packages/zerorpc/core.py", line 167, in async_task
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core     functor.pattern.process_call(self._context, bufchan, event, functor)
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core   File "/usr/lib/python3/dist-packages/zerorpc/patterns.py", line 30, in process_call
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core     result = functor(*req_event.args, **req_event.kwargs)
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core   File "/usr/lib/python3/dist-packages/zerorpc/decorators.py", line 44, in __call__
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core     return self._functor(*args, **kargs)
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core   File "/usr/lib/python3/dist-packages/sysinv/zmq_rpc/zmq_rpc.py", line 39, in method
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core     retval = getattr(self.target, func)(context, **kwargs)
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core   File "/usr/lib/python3/dist-packages/sysinv/conductor/manager.py", line 12665, in get_system_health
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core     return health_util.get_system_health_upgrade(
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core   File "/usr/lib/python3/dist-packages/sysinv/common/health.py", line 580, in get_system_health_upgrade
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core     success = self._check_platform_backup_partition()
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core   File "/usr/lib/python3/dist-packages/sysinv/common/health.py", line 276, in _check_platform_backup_partition
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core     subprocess.check_output(args, stderr=subprocess.STDOUT) # pylint: disable=not-callable
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core   File "/usr/lib/python3.9/subprocess.py", line 424, in check_output
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core   File "/usr/lib/python3.9/subprocess.py", line 528, in run
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core     raise CalledProcessError(retcode, process.args,
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core subprocess.CalledProcessError: Command '['/usr/bin/validate-platform-backup.sh']' returned non-zero exit status 1.
2023-07-27 17:19:29.524 83055 ERROR zerorpc.core
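The traceback shows that subprocess.check_output() raised CalledProcessError and the exception propagated out of the health check, so the script's own output never reached the log. As a diagnostic aid, the sketch below shows one way to run such a script and capture what it printed when it exits non-zero. The helper name run_check is hypothetical and is not part of sysinv; this is an illustration of the standard subprocess API, not the project's fix.

```python
import subprocess


def run_check(args):
    """Run a validation command and return (success, combined_output).

    Hypothetical helper for illustration; not part of sysinv.
    """
    try:
        # stderr is merged into stdout, as in the call shown in the traceback.
        output = subprocess.check_output(args, stderr=subprocess.STDOUT)
        return True, output.decode(errors="replace")
    except subprocess.CalledProcessError as exc:
        # exc.output holds whatever the command printed before exiting non-zero.
        return False, (exc.output or b"").decode(errors="replace")
    except OSError as exc:
        # The command could not be started (missing file, no permission).
        return False, str(exc)
```

Calling run_check(['/usr/bin/validate-platform-backup.sh']) on the affected subcloud would return False together with the script's output, which is the detail missing from the log above.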
Reproducibility
---------------
100% reproducible when the root filesystem (/) is mounted on an NVMe device
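This report does not identify the root cause. One difference worth checking, stated here only as an assumption, is partition device naming: on Linux, a disk whose name ends in a digit (such as nvme0n1) takes a "p" before the partition number, while a disk such as sda does not. The sketch below illustrates that rule; the function name and the partition number used in the example are hypothetical.

```python
def partition_path(disk, number):
    """Build the device node for a partition of the given disk.

    Illustrative only: follows the Linux naming rule that disks whose
    name ends in a digit use a "p" separator before the partition number.
    """
    separator = "p" if disk[-1].isdigit() else ""
    return f"{disk}{separator}{number}"
```

For example, partition_path("/dev/sda", 4) gives "/dev/sda4", whereas partition_path("/dev/nvme0n1", 4) gives "/dev/nvme0n1p4". A script that appends the number directly to the disk name would look for a device that does not exist on NVMe.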
System Configuration
--------------------
DC lab with at least 1 subcloud
Load info
---------
BUILD_DATE="2023-07-20 22:00:10 +0000"
Last Pass
---------
N/A
Timestamp/Logs
--------------
See the traceback under Actual Behavior; the issue is readily reproducible.
Alarms
------
N/A
Test Activity
-------------
Developer Testing
Workaround
----------
Not available yet