swift backup driver: deleting backup fails when swift returns 404 for object deletion

Bug #2061988 reported by Takashi Kajinami
This bug affects 1 person
Affects: Cinder
Status: In Progress
Importance: Undecided
Assigned to: Takashi Kajinami

Bug Description

In Puppet OpenStack CI we noticed that a backup occasionally gets stuck in the deleting status.
Looking into the logs, we found that the failure is caused by the following error:

```
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server [None req-be5cf6cf-f428-43ab-ac71-8fe6871b4fbd f7224af447f14d7cb3e24a13b6593db0 3ad19f84b1e24a30976f937af80d7713 - - - -] Exception during message handling: swiftclient.exceptions.ClientException: Object DELETE failed: http://::1:8080/v1/AUTH_3ad19f84b1e24a30976f937af80d7713/volumebackups/volume_44f5cfbd-2a8e-4491-a6f7-9454e4c27375/20240417095923/az_nova_backup_5896647f-996e-4c06-b216-2d114e73bb80-00012 404 Not Found [first 60 chars of response] b'<html><h1>Not Found</h1><p>The resource could not be found.<' (txn: txb0df610516c649dab1960-00661f9d9a)
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.9/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.9/site-packages/oslo_messaging/rpc/dispatcher.py", line 309, in dispatch
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.9/site-packages/oslo_messaging/rpc/dispatcher.py", line 229, in _do_dispatch
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.9/site-packages/cinder/backup/manager.py", line 914, in delete_backup
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server self.message_api.create_from_request_context(
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server self.force_reraise()
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server raise self.value
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.9/site-packages/cinder/backup/manager.py", line 910, in delete_backup
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server backup_service.delete_backup(backup)
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.9/site-packages/cinder/backup/chunkeddriver.py", line 860, in delete_backup
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server self.delete_object(container, object_name)
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.9/site-packages/cinder/backup/drivers/swift.py", line 418, in delete_object
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server self.conn.delete_object(container, object_name,
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.9/site-packages/swiftclient/client.py", line 1984, in delete_object
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server return self._retry(None, delete_object, container, obj,
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.9/site-packages/swiftclient/client.py", line 1797, in _retry
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server rv = func(self.url, self.token, *args,
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server File "/usr/lib/python3.9/site-packages/swiftclient/client.py", line 1605, in delete_object
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server raise ClientException.from_response(resp, 'Object DELETE failed', body)
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server swiftclient.exceptions.ClientException: Object DELETE failed: http://::1:8080/v1/AUTH_3ad19f84b1e24a30976f937af80d7713/volumebackups/volume_44f5cfbd-2a8e-4491-a6f7-9454e4c27375/20240417095923/az_nova_backup_5896647f-996e-4c06-b216-2d114e73bb80-00012 404 Not Found [first 60 chars of response] b'<html><h1>Not Found</h1><p>The resource could not be found.<' (txn: txb0df610516c649dab1960-00661f9d9a)
2024-04-17 09:59:56.099 82023 ERROR oslo_messaging.rpc.server
```

Digging further into the Swift logs, we see that Swift returned 404 to the delete request. This happened because the object-replicator ran in the middle of processing the delete request, and two of the object servers returned 404.

proxy-server.log
```
Apr 17 09:59:55 np0037304716 proxy-server[77646]: ::1 ::1 17/Apr/2024/09/59/55 DELETE /v1/AUTH_3ad19f84b1e24a30976f937af80d7713/volumebackups/volume_44f5cfbd-2a8e-4491-a6f7-9454e4c27375/20240417095923/az_nova_backup_5896647f-996e-4c06-b216-2d114e73bb80-00012 HTTP/1.0 404 - python-swiftclient-4.5.0 gAAAAABmH50iSSHf... - 70 - txb0df610516c649dab1960-00661f9d9a - 1.8413 - - 1713347994.151212454 1713347995.992491245 0
```

object-server.log
```
Apr 17 09:59:54 np0037304716 object-server[83879]: ::1 - - [17/Apr/2024:09:59:54 +0000] "DELETE /1/168336/AUTH_3ad19f84b1e24a30976f937af80d7713/volumebackups/volume_44f5cfbd-2a8e-4491-a6f7-9454e4c27375/20240417095923/az_nova_backup_5896647f-996e-4c06-b216-2d114e73bb80-00012" 204 - "DELETE http://[::1]:8080/v1/AUTH_3ad19f84b1e24a30976f937af80d7713/volumebackups/volume_44f5cfbd-2a8e-4491-a6f7-9454e4c27375/20240417095923/az_nova_backup_5896647f-996e-4c06-b216-2d114e73bb80-00012" "txb0df610516c649dab1960-00661f9d9a" "proxy-server 77646" 0.0102 "-" 83879 0
Apr 17 09:59:55 np0037304716 object-server[83909]: Successful rsync of /srv/node/1/objects/168336/... to [::1]::object/3/objects/168336 (0.139)
Apr 17 09:59:55 np0037304716 object-server[83909]: Successful rsync of /srv/node/1/objects/168336/... to [::1]::object/2/objects/168336 (0.093)
Apr 17 09:59:55 np0037304716 object-server[83879]: ::1 - - [17/Apr/2024:09:59:55 +0000] "DELETE /2/168336/AUTH_3ad19f84b1e24a30976f937af80d7713/volumebackups/volume_44f5cfbd-2a8e-4491-a6f7-9454e4c27375/20240417095923/az_nova_backup_5896647f-996e-4c06-b216-2d114e73bb80-00012" 404 70 "DELETE http://[::1]:8080/v1/AUTH_3ad19f84b1e24a30976f937af80d7713/volumebackups/volume_44f5cfbd-2a8e-4491-a6f7-9454e4c27375/20240417095923/az_nova_backup_5896647f-996e-4c06-b216-2d114e73bb80-00012" "txb0df610516c649dab1960-00661f9d9a" "proxy-server 77646" 0.0007 "-" 83879 0
Apr 17 09:59:55 np0037304716 object-server[83879]: ::1 - - [17/Apr/2024:09:59:55 +0000] "DELETE /3/168336/AUTH_3ad19f84b1e24a30976f937af80d7713/volumebackups/volume_44f5cfbd-2a8e-4491-a6f7-9454e4c27375/20240417095923/az_nova_backup_5896647f-996e-4c06-b216-2d114e73bb80-00012" 404 70 "DELETE http://[::1]:8080/v1/AUTH_3ad19f84b1e24a30976f937af80d7713/volumebackups/volume_44f5cfbd-2a8e-4491-a6f7-9454e4c27375/20240417095923/az_nova_backup_5896647f-996e-4c06-b216-2d114e73bb80-00012" "txb0df610516c649dab1960-00661f9d9a" "proxy-server 77646" 0.0005 "-" 83879 0
```

Though we might be able to improve Swift's behavior, in general a 404 response to a DELETE can be safely ignored. We should avoid leaving the backup stuck in the deleting status and instead continue processing the deletion request.
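A minimal sketch of that idea, assuming the driver's `delete_object` wrapper in cinder/backup/drivers/swift.py and omitting any extra arguments the real method forwards (the authoritative change is the review linked below): treat a 404 from Swift as a successful delete and re-raise everything else.

```python
from swiftclient.exceptions import ClientException

# Sketch only: method body assumed to live on the Swift backup driver
# class that owns self.conn (a swiftclient Connection).
def delete_object(self, container, object_name):
    try:
        self.conn.delete_object(container, object_name)
    except ClientException as err:
        if err.http_status != 404:
            raise
        # The object is already gone (here, because the object-replicator
        # rehashed partitions mid-request). The desired end state is
        # reached, so let the backup deletion continue instead of leaving
        # the backup stuck in "deleting".
```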

Changed in cinder:
assignee: nobody → Takashi Kajinami (kajinamit)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/cinder/+/916055

Changed in cinder:
status: New → In Progress