Failed to delete backup on system controller when subclouds are offline

Bug #2031670 reported by Gustavo Lyra Pereira
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Low
Assigned to: Gustavo Lyra Pereira
Milestone: none listed

Bug Description

Brief Description
-----------------
The backup delete command fails to delete a subcloud backup in central storage when the subcloud is offline.

Severity
--------
Minor: System/Feature is usable with minor issue

Steps to Reproduce
------------------
Deploy a subcloud.
Take a subcloud backup.
Shut down the backed-up subcloud.
Run the backup delete command.

Expected Behavior
------------------
Subcloud backup directory is removed.

Actual Behavior
----------------
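The command fails: dcmanager times out requesting a Keystone token from the offline subcloud while preparing the delete, and the backup directory in central storage is not removed.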

Reproducibility
---------------

100% reproducible

System Configuration
--------------------
DC / subcloud

Branch/Pull Time/Commit
-----------------------

SW_VERSION="22.12"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="2022-12-19_02-22-00"
SRC_BUILD_ID="38"
JOB="wrcp-22.12-debian"
BUILD_BY="jenkins"
BUILD_NUMBER="50"
BUILD_HOST="yow-wrcp-lx.wrs.com"
BUILD_DATE="2022-12-19 07:22:00 +0000"

Last Pass
---------

test escape

Timestamp/Logs
--------------
2023-08-09 20:44:52.498 89460 INFO dcmanager.manager.service [req-244974f4-1d48-4a3a-8fae-8d3a42e51c39 a6a63d7b9dff45f1a8c2ee8d9c531a29 - - default default] Handling delete_subcloud_backups request for subcloud ID: 24
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager [-] Failed to prepare subcloud subcloud4 for backup delete: keystoneauth1.exceptions.connection.ConnectTimeout: Request to https://[fdff:719a:bf60:1109::2]:5001/v3/auth/tokens timed out
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager Traceback (most recent call last):
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 169, in _new_conn
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager conn = connection.create_connection(
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 96, in create_connection
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager raise err
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 86, in create_connection
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager sock.connect(sa)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/eventlet/greenio/base.py", line 263, in connect
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager self._trampoline(fd, write=True, timeout=timeout, timeout_exc=_timeout_exc)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/eventlet/greenio/base.py", line 208, in _trampoline
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager return trampoline(fd, read=read, write=write, timeout=timeout,
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/eventlet/hubs/__init__.py", line 159, in trampoline
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager return hub.switch()
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/eventlet/hubs/hub.py", line 298, in switch
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager return self.greenlet.switch()
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager socket.timeout: timed out
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager During handling of the above exception, another exception occurred:
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager Traceback (most recent call last):
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 699, in urlopen
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager httplib_response = self._make_request(
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 382, in _make_request
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager self._validate_conn(conn)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 1012, in _validate_conn
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager conn.connect()
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 353, in connect
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager conn = self._new_conn()
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 174, in _new_conn
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager raise ConnectTimeoutError(
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x7f1292faa700>, 'Connection to fdff:719a:bf60:1109::2 timed out. (connect timeout=10.0)')
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager During handling of the above exception, another exception occurred:
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager Traceback (most recent call last):
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager resp = conn.urlopen(
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 755, in urlopen
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager retries = retries.increment(
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 574, in increment
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager raise MaxRetryError(_pool, url, error or ResponseError(cause))
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='fdff:719a:bf60:1109::2', port=5001): Max retries exceeded with url: /v3/auth/tokens (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1292faa700>, 'Connection to fdff:719a:bf60:1109::2 timed out. (connect timeout=10.0)'))
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager During handling of the above exception, another exception occurred:
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager Traceback (most recent call last):
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 1012, in _send_request
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager resp = self.session.request(method, url, **kwargs)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/requests/sessions.py", line 542, in request
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager resp = self.send(prep, **send_kwargs)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/requests/sessions.py", line 655, in send
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager r = adapter.send(request, **kwargs)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/requests/adapters.py", line 504, in send
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager raise ConnectTimeout(e, request=request)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='fdff:719a:bf60:1109::2', port=5001): Max retries exceeded with url: /v3/auth/tokens (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1292faa700>, 'Connection to fdff:719a:bf60:1109::2 timed out. (connect timeout=10.0)'))
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager During handling of the above exception, another exception occurred:
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager Traceback (most recent call last):
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/dcmanager/manager/subcloud_manager.py", line 901, in _delete_subcloud_backup
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager inventory_file = self._create_subcloud_inventory_file(subcloud)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/dcmanager/manager/subcloud_manager.py", line 991, in _create_subcloud_inventory_file
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager keystone_client = OpenStackDriver(
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/dccommon/drivers/openstack/sdk_platform.py", line 96, in __init__
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager raise exception
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/dccommon/drivers/openstack/sdk_platform.py", line 88, in __init__
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager self.keystone_client = KeystoneClient(region_name, auth_url)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/dccommon/drivers/openstack/keystone_v3.py", line 40, in __init__
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager self.services_list = self.keystone_client.services.list()
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneclient/v3/services.py", line 90, in list
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager return super(ServiceManager, self).list(
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneclient/base.py", line 86, in func
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager return f(*args, **new_kwargs)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneclient/base.py", line 448, in list
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager list_resp = self._list(url_query, self.collection_key)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneclient/base.py", line 141, in _list
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager resp, body = self.client.get(url, **kwargs)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 395, in get
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager return self.request(url, 'GET', **kwargs)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 554, in request
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager resp = super(LegacyJsonAdapter, self).request(*args, **kwargs)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneauth1/adapter.py", line 257, in request
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager return self.session.request(url, method, **kwargs)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 780, in request
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager auth_headers = self.get_auth_headers(auth)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 1191, in get_auth_headers
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager return auth.get_headers(self, **kwargs)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneauth1/plugin.py", line 95, in get_headers
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager token = self.get_token(session)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneauth1/identity/base.py", line 88, in get_token
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager return self.get_access(session).auth_token
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneauth1/identity/base.py", line 134, in get_access
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager self.auth_ref = self.get_auth_ref(session)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneauth1/identity/generic/base.py", line 208, in get_auth_ref
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager return self._plugin.get_auth_ref(session, **kwargs)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneauth1/identity/v3/base.py", line 187, in get_auth_ref
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager resp = session.post(token_url, json=body, headers=headers,
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 1139, in post
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager return self.request(url, 'POST', **kwargs)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 921, in request
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager resp = send(**kwargs)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager File "/usr/lib/python3/dist-packages/keystoneauth1/session.py", line 1019, in _send_request
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager raise exceptions.ConnectTimeout(msg)
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager keystoneauth1.exceptions.connection.ConnectTimeout: Request to https://[fdff:719a:bf60:1109::2]:5001/v3/auth/tokens timed out
2023-08-09 20:45:12.530 89460 ERROR dcmanager.manager.subcloud_manager
2023-08-09 20:45:12.534 89460 INFO dcmanager.manager.subcloud_manager [req-244974f4-1d48-4a3a-8fae-8d3a42e51c39 a6a63d7b9dff45f1a8c2ee8d9c531a29 - - default default] Processed subcloud subcloud4 for backup delete (operation 100% complete, 0 subcloud(s) remaining)
2023-08-09 20:45:12.535 89460 INFO dcmanager.manager.subcloud_manager [req-244974f4-1d48-4a3a-8fae-8d3a42e51c39 a6a63d7b9dff45f1a8c2ee8d9c531a29 - - default default] Subcloud backup delete operation finished
2023-08-09 20:45:12.536 89460 ERROR dcmanager.manager.subcloud_manager [req-244974f4-1d48-4a3a-8fae-8d3a42e51c39 a6a63d7b9dff45f1a8c2ee8d9c531a29 - - default default] Backup delete failed for all applied subclouds

Test Activity
-------------

Regression Testing.

Workaround
----------

Delete backup manually.
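
A minimal sketch of what a manual delete could look like on the system controller. It assumes the central backups are laid out as one directory per subcloud under a common root; that layout, the function name remove_central_backup, and the backup_root parameter are illustrative assumptions, not taken from this report or from dcmanager.

```python
import shutil
from pathlib import Path


def remove_central_backup(backup_root: Path, subcloud_name: str) -> bool:
    """Delete <backup_root>/<subcloud_name>; return True if something was removed.

    The one-directory-per-subcloud layout is an assumption for illustration.
    """
    root = backup_root.resolve()
    target = (root / subcloud_name).resolve()
    # Refuse names that escape the backup root (e.g. "../etc") or that
    # resolve to the root itself (e.g. an empty name).
    if target == root or root not in target.parents:
        raise ValueError(f"{subcloud_name!r} is not a directory under {root}")
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True
```

The path check is there because a recursive delete with a mistyped name is hard to undo; confirm the actual backup location on your system before running anything like this.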

OpenStack Infra (hudson-openstack) wrote : Fix proposed to distcloud (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/distcloud/+/891837

Changed in starlingx:
status: New → In Progress
Changed in starlingx:
assignee: nobody → Gustavo Lyra Pereira (gustavolyrap)
OpenStack Infra (hudson-openstack) wrote : Fix merged to distcloud (master)

Reviewed: https://review.opendev.org/c/starlingx/distcloud/+/891837
Committed: https://opendev.org/starlingx/distcloud/commit/94b9a609798019cf4141dc5ea5121728c113d313
Submitter: "Zuul (22348)"
Branch: master

commit 94b9a609798019cf4141dc5ea5121728c113d313
Author: Gustavo Pereira <email address hidden>
Date: Thu Aug 17 12:05:33 2023 -0300

    Fix offline subcloud backup delete

    This commit fixes the issue that prevented a removal of a
    subcloud backup stored on central storage when the subcloud
    is offline.

    The _create_subcloud_inventory_file() function queries the
    subcloud for the bootstrap-address to write the ansible
    inventory file, this is not possible when the subcloud is
    offline.
    When the local_only flag is False, the playbook runs locally
    on the system controller, so the subcloud inventory is not
    needed. This commit skips the inventory file creation when
    local_only is False.

    Test Plan:

    PASS - Deploy a subcloud, create a subcloud backup
    stored on central storage and shutdown the subcloud.
    After the subcloud is offline run dcmanager subcloud
    backup delete command.

    PASS - Deploy a subcloud, create a subcloud backup
    stored in local storage with --local-only parameter
    set. Run delete command to remove the created backup.

    Closes-bug: 2031670

    Change-Id: I7f7f3441c5b6cd3aa9f89a676283c9d33d410aaa
    Signed-off-by: Gustavo Pereira <email address hidden>
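
The control flow described in the commit message above can be sketched roughly as follows. This is not the merged code: prepare_backup_delete and the create_inventory_file callback are hypothetical names standing in for the dcmanager logic around _create_subcloud_inventory_file(), and only the local_only condition comes from the commit message.

```python
from typing import Callable, Optional


def prepare_backup_delete(
    subcloud_name: str,
    local_only: bool,
    create_inventory_file: Callable[[str], str],
) -> Optional[str]:
    """Return an inventory file path only when the subcloud must be reached."""
    if not local_only:
        # Central-storage delete: the playbook runs on the system
        # controller, so the (possibly offline) subcloud is never queried.
        return None
    # Local-storage delete: the playbook targets the subcloud, so the
    # inventory file (and therefore a reachable subcloud) is required.
    return create_inventory_file(subcloud_name)


def offline_inventory(name: str) -> str:
    # Stand-in for the Keystone connect timeout seen in the logs.
    raise TimeoutError(f"subcloud {name} is unreachable")


# With local_only=False the offline subcloud is never contacted.
assert prepare_backup_delete("subcloud4", False, offline_inventory) is None
```

With local_only=True the callback is still invoked, so a delete of a backup stored on an offline subcloud would still fail, which is expected since that backup lives on the subcloud itself.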

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Low
tags: added: stx.9.0 stx.distcloud stx.update