CLI shows volume backup deleted, but it is not actually removed

Bug #1764269 reported by lucky
This bug affects 2 people
Affects   Status       Importance   Assigned to   Milestone
Cinder    Incomplete   Undecided    Unassigned

Bug Description

When swift-proxy-server is stopped during a backup delete operation, the CLI reports the deletion as complete even though the deletion has not actually finished internally.

The source code should be modified to implement error handling and retry processing on the Cinder side. As a preventive measure, it is recommended to delete the objects from the backup container through the Swift Delete Object operation.
To improve this, the delete() method in chunkeddriver.py should raise an exception if it is unable to generate the list of objects, as sketched below.
In operation, the relevant components need to be monitored, and manual measures taken whenever required.
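Below is a minimal sketch of the suggested change, not the actual Cinder code: the names used here (delete_backup, BackupDriverError, _generate_object_names, delete_object) are simplified stand-ins for the chunked-driver structure, and the behaviour shown is only the proposed one.

class BackupDriverError(Exception):
    """Raised when the object store cannot be enumerated for a backup."""


def delete_backup(driver, backup):
    # Ask the object store for every object that belongs to this backup.
    object_names = driver._generate_object_names(backup)

    # Proposed behaviour: if the listing comes back empty for a backup that
    # is expected to contain objects, abort the delete instead of silently
    # reporting success and leaving the chunks orphaned in Swift.
    if not object_names:
        raise BackupDriverError(
            'Unable to list objects for backup %s in container %s'
            % (backup.id, backup.container))

    # Only when the listing succeeded are the objects removed one by one.
    for object_name in object_names:
        driver.delete_object(backup.container, object_name)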

Changed in cinder:
assignee: nobody → Rajat Dhasmana (whoami-rajat)
Revision history for this message
Rajat Dhasmana (whoami-rajat) wrote :

Hi,
I have tried to reproduce this issue on the latest release but was not able to confirm it.
It may already be fixed in the latest release, but I am not sure. Also, the delete() function has been renamed to delete_backup() in the latest release (Rocky).
Please suggest how I should verify the bug, because "cinder backup-list" shows no entries and I am assuming the backup has been deleted completely.

Steps I followed:
========================================
stack@ubuntu-xenial:~/cinder$ cinder backup-create --name newvol_bak 5d85b172-26f9-4f84-92e8-50941f3e7f1a
+-----------+--------------------------------------+
| Property | Value |
+-----------+--------------------------------------+
| id | 918a9142-7ff3-448c-a90e-78c9b5832e96 |
| name | newvol_bak |
| volume_id | 5d85b172-26f9-4f84-92e8-50941f3e7f1a |
+-----------+--------------------------------------+

stack@ubuntu-xenial:~/cinder$ cinder backup-list
+--------------------------------------+--------------------------------------+-----------+------------+------+--------------+---------------+
| ID | Volume ID | Status | Name | Size | Object Count | Container |
+--------------------------------------+--------------------------------------+-----------+------------+------+--------------+---------------+
| 918a9142-7ff3-448c-a90e-78c9b5832e96 | 5d85b172-26f9-4f84-92e8-50941f3e7f1a | available | newvol_bak | 1 | 22 | volumebackups |
+--------------------------------------+--------------------------------------+-----------+------------+------+--------------+---------------+

stack@ubuntu-xenial:~/cinder$ cinder backup-delete 918a9142-7ff3-448c-a90e-78c9b5832e96 & sleep 3; sudo systemctl stop <email address hidden>
[1] 20381
Request to delete backup 918a9142-7ff3-448c-a90e-78c9b5832e96 has been accepted.
[1]+  Done                    cinder backup-delete 918a9142-7ff3-448c-a90e-78c9b5832e96

stack@ubuntu-xenial:~/cinder$ cinder backup-list
+----+-----------+--------+------+------+--------------+-----------+
| ID | Volume ID | Status | Name | Size | Object Count | Container |
+----+-----------+--------+------+------+--------------+-----------+
+----+-----------+--------+------+------+--------------+-----------+

I have tried stopping the swift-proxy service multiple times (in parallel with the cinder backup delete operation, and with a 1-3 second offset), but the deletion always completes fully.
Please let me know where I should check internally to observe the problem.
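For reference, one way to check internally whether a backup's objects were really removed is to list the backup prefix in the volumebackups container directly against Swift. A minimal sketch, assuming python-swiftclient is installed; the Keystone endpoint, credentials, and prefix below are hypothetical placeholders.

from swiftclient import client as swift_client

# Hypothetical Keystone v3 endpoint and credentials; substitute real values.
conn = swift_client.Connection(
    authurl='http://127.0.0.1/identity/v3',
    user='demo',
    key='secret',
    auth_version='3',
    os_options={'project_name': 'demo',
                'user_domain_name': 'Default',
                'project_domain_name': 'Default'})

# Prefix layout follows the chunked driver's naming; this value is made up.
prefix = 'volume_5d85b172-26f9-4f84-92e8-50941f3e7f1a/'

# get_container returns (headers, list of object dicts); any names returned
# here after a "successful" backup-delete are orphaned chunks/metadata.
_, objects = conn.get_container('volumebackups', prefix=prefix,
                                full_listing=True)
for obj in objects:
    print(obj['name'])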

Changed in cinder:
assignee: Rajat Dhasmana (whoami-rajat) → nobody
status: New → Incomplete
Revision history for this message
Eric Miller (erickmiller) wrote :

We're using Kolla-Ansible deployed Rocky (7.0.2) with Swift as the backup driver. We use Swift emulation on Ceph, not native Swift. Backups work fine, so communication with the Swift emulation layer is working. Backups are stored in the respective project's volumebackups container (chunks, metadata, and sha256file files).

However, I have seen this same issue: deleting volume backups using "openstack volume backup delete <uuid>" results in a successful deletion of the volume backup record, but the objects in Swift are never deleted. No error is logged in any of the Cinder logs on any node (controller or compute); there is only a successful deletion record in the cinder-backup.log file on the compute node. The debug flag is set to "True" in all config files.

I included the cinder-backup.log entries below. The line indicative of the issue is the empty generated object list:

generated object list: []

even though running this, as one of the project users, results in a list of all of the backup objects (152 of them, including the chunks, metadata, and sha256file files):
swift list -p volume_01f54377-b694-408c-a64e-54650f076900/20190827075627/az_us-central-1a_backup_51e2d6a1-b171-46cb-96cc-9786a684661b volumebackups

I haven't tried installing native Swift, since it is unsupported alongside Ceph in Kolla Ansible, so I can't run the Swift proxy service to see whether it solves the issue.

At this point, we are thinking of going back to using Ceph native for backups, unless someone has an idea why this is happening.

Eric

/var/lib/docker/volumes/kolla_logs/_data/cinder/cinder-backup.log:2019-09-30 13:46:53.131 6 INFO cinder.backup.manager [req-c8187c1a-34be-435b-9200-00f948699899 720763d70a094bf3b11ebabd31eee896 8acb2072057b45ab9da245880af92c93 - default default] Delete backup started, backup: 51e2d6a1-b171-46cb-96cc-9786a684661b.

/var/lib/docker/volumes/kolla_logs/_data/cinder/cinder-backup.log:2019-09-30 13:46:53.137 6 DEBUG cinder.backup.drivers.swift [req-c8187c1a-34be-435b-9200-00f948699899 720763d70a094bf3b11ebabd31eee896 8acb2072057b45ab9da245880af92c93 - default default] Using swift URL http://192.168.1.254:6780/swift/v1/AUTH_8acb2072057b45ab9da245880af92c93 initialize /var/lib/kolla/venv/lib/python2.7/site-packages/cinder/backup/drivers/swift.py:248

/var/lib/docker/volumes/kolla_logs/_data/cinder/cinder-backup.log:2019-09-30 13:46:53.138 6 DEBUG cinder.backup.drivers.swift [req-c8187c1a-34be-435b-9200-00f948699899 720763d70a094bf3b11ebabd31eee896 8acb2072057b45ab9da245880af92c93 - default default] Connect to http://192.168.1.254:6780/swift/v1/AUTH_ in "per_user" mode initialize /var/lib/kolla/venv/lib/python2.7/site-packages/cinder/backup/drivers/swift.py:250

/var/lib/docker/volumes/kolla_logs/_data/cinder/cinder-backup.log:2019-09-30 13:46:53.138 6 DEBUG cinder.backup.chunkeddriver [req-c8187c1a-34be-435b-9200-00f948699899 720763d70a094bf3b11ebabd31eee896 8acb2072057b45ab9da245880af92c93 - default default] delete started, backup: 51e2d6a1-b171-46cb-96cc-9786a684661b, container: volumebackups, prefix: volume_01f54377-b694-408c-a64e-54650f076900/20190827075627/az_us-central-1a_backup_51e2d6a1-b171-46cb-96cc-9786a684661b. delete_backup /var...


Revision history for this message
Eric Miller (erickmiller) wrote :

Just a quick update - I was able to deploy the "swift_proxy_server" container using Kolla Ansible, with some hacking, and once this was running, the volume backup delete function worked as expected.

I am attaching a copy of the respective cinder-backup.log entries, which shows that the "generated object list" is now populated, and a "deleted object" log entry is recorded for each object. The objects were, indeed, deleted correctly after verification.

Eric

Revision history for this message
Eric Miller (erickmiller) wrote :

Another situation where the "generated object list" returns empty is when a large number of volume backup deletes are performed. For example (this is an extreme, but possible, example that I tested), when 1,000 commands such as:

openstack volume backup delete <uuid> &

are run in a bash shell, with valid UUIDs of non-deleted backups (those that have objects reported by Swift), many of the backups are marked as deleted in the Cinder database, but the objects are orphaned on the object storage.
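A minimal sketch of how such a bulk delete can be driven from Python instead of a shell loop; the file name backup_uuids.txt is hypothetical, and each delete is launched without waiting, mirroring the backgrounded CLI calls above.

import subprocess

# Hypothetical input file with one backup UUID per line.
with open('backup_uuids.txt') as f:
    uuids = [line.strip() for line in f if line.strip()]

# Launch every delete request without waiting, like
# "openstack volume backup delete <uuid> &" in a shell loop.
procs = [subprocess.Popen(['openstack', 'volume', 'backup', 'delete', uuid])
         for uuid in uuids]

# Wait for the CLI invocations to return; the API accepts each request even
# when the driver later fails to list and remove the Swift objects.
for proc in procs:
    proc.wait()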

One of the volume backups that was deleted and has orphaned objects produced these two cinder-backup.log entries:

/var/lib/docker/volumes/kolla_logs/_data/cinder/cinder-backup.log:2019-09-30 16:20:33.367 6 DEBUG cinder.backup.chunkeddriver [req-9434f1e1-c382-4d86-8f81-55602dc28185 720763d70a094bf3b11ebabd31eee896 8acb2072057b45ab9da245880af92c93 - default default] delete started, backup: 6efd5baf-4914-411c-bc20-d9fb53504c6d, container: volumebackups, prefix: volume_77bd1522-68d6-43d3-8178-04a096c6db95/20190529071528/az_us-central-1a_backup_6efd5baf-4914-411c-bc20-d9fb53504c6d. delete_backup /var/lib/kolla/venv/lib/python2.7/site-packages/cinder/backup/chunkeddriver.py:807

/var/lib/docker/volumes/kolla_logs/_data/cinder/cinder-backup.log:2019-09-30 16:20:33.418 6 DEBUG cinder.backup.chunkeddriver [req-9434f1e1-c382-4d86-8f81-55602dc28185 720763d70a094bf3b11ebabd31eee896 8acb2072057b45ab9da245880af92c93 - default default] generated object list: []. _generate_object_names /var/lib/kolla/venv/lib/python2.7/site-packages/cinder/backup/chunkeddriver.py:232

showing that it wasn't able to retrieve the object list (but did not error in any way) - even though I can list the objects successfully from this user's account using:

swift list volumebackups -p volume_77bd1522-68d6-43d3-8178-04a096c6db95/20190529071528/az_us-central-1a_backup_6efd5baf-4914-411c-bc20-d9fb53504c6d

Eric

Revision history for this message
Eric Miller (erickmiller) wrote :

We have finished our upgrades to Stein and the problem still exists.

Eric
