Successful DELETE of "ghost" object returns 404
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | New | Undecided | Unassigned |
Bug Description
I'm running swift version 2.26.0 (from Debian, package version 2.26.0-10).
For some reason (I've not got to the bottom of this yet), we have a number of "ghost" objects - they appear in container listings but if you GET/HEAD them you get 404.
If you DELETE one of these objects, though, swift still says 404, but does in fact successfully delete the object - it no longer appears in the container listing. I think swift SHOULD return 200/202/204 in this case, since the DELETE has in fact occurred successfully.
We found this out because the client we were using for some maintenance errors out on 404.
[one might quibble about whether 404 is the most correct response to GET/HEAD for these ghost objects, but that's a separate question]
One way you can get a ghost listing is if you have an expired object (one with x-delete-at metadata) that hasn't yet been reaped by the object-expirer daemon (maybe because it's running behind or misconfigured ... do you monitor your .expiring_objects queue?). In this scenario, using swift-get-nodes you would be able to discover the on-disk location and check whether there is still an object .data file or a .ts tombstone.
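To make the expiry check concrete, here is a minimal sketch of how one might decide whether an object's metadata marks it as expired. It assumes you have already obtained the object's metadata as a dict by some other means (for example, by inspecting the on-disk .data file); the function name `is_expired` and the `now` parameter are illustrative, not part of any swift API.

```python
import time


def is_expired(metadata, now=None):
    """Return True if the metadata carries an X-Delete-At in the past.

    `metadata` is a plain dict of header name -> value (hypothetical input;
    how you obtain it is up to you). X-Delete-At is a Unix timestamp.
    """
    now = time.time() if now is None else now
    # Header names are case-insensitive, so normalise before looking up.
    lowered = {key.lower(): value for key, value in metadata.items()}
    delete_at = lowered.get("x-delete-at")
    if delete_at is None:
        return False  # no expiry set, so this is not an expired object
    return int(delete_at) <= now
```

An object for which this returns True but whose .data file is still on disk is a candidate for the "expired but not yet reaped" kind of ghost.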
The other way you can get a ghost listing is if you had a swift-container-server node with a container database that was isolated from the cluster for longer than the configured reclaim_age (the default is only 7 days). After a reclaim_age, the connected database servers will "reclaim" any "tombstone rows" indicating that an object was deleted. If an isolated database with a record of a PUT at t1 rejoins the cluster and finds that all database records of the DELETE at t2 have been reclaimed, the "missing" PUT at t1 will be replicated to the connected database servers without any tombstone row to prevent it. In this scenario the tombstone files in the object-data layer are likely also reclaimed, so you won't find any object data. But you might be able to determine which databases were connected and which were isolated by examining the ROW_ID of the ghost rows.
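The sequence above can be sketched as a toy simulation. This is a deliberately simplified model, not swift's replication code: each database is a dict of object name to a row with a timestamp and a deleted flag, and `merge`, `reclaim`, `listing`, and `simulate` are hypothetical helpers invented for illustration.

```python
RECLAIM_AGE = 7 * 24 * 3600  # seconds; mirrors the 7-day default mentioned above


def merge(local, remote):
    """Copy rows from remote into local; the newest timestamp wins per name."""
    for name, row in remote.items():
        if name not in local or row["ts"] > local[name]["ts"]:
            local[name] = dict(row)


def reclaim(db, now, reclaim_age=RECLAIM_AGE):
    """Drop tombstone rows that are older than reclaim_age."""
    stale = [n for n, r in db.items() if r["deleted"] and r["ts"] < now - reclaim_age]
    for name in stale:
        del db[name]


def listing(db):
    """Names a container listing would show: rows not marked deleted."""
    return sorted(n for n, r in db.items() if not r["deleted"])


def simulate():
    t1, t2 = 1000, 2000
    connected = {"foo": {"ts": t1, "deleted": False}}  # saw the PUT at t1
    isolated = {"foo": {"ts": t1, "deleted": False}}   # saw the PUT, then lost contact
    # The DELETE at t2 only reaches the connected database.
    merge(connected, {"foo": {"ts": t2, "deleted": True}})
    before = listing(connected)
    # More than reclaim_age passes; the tombstone row is reclaimed.
    reclaim(connected, now=t2 + RECLAIM_AGE + 1)
    # The isolated database rejoins and replicates its old PUT row.
    merge(connected, isolated)
    return before, listing(connected)
```

In this model `simulate()` returns an empty listing before reclamation and a listing containing `foo` afterwards: with the tombstone row gone, nothing outranks the stale PUT row, so the object reappears in the listing.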
FWIW s3api always returns 2XX for DELETE even if the object doesn't exist - but the swift API has always returned the status code that seemed most appropriate from what it observed when writing down the tombstone object for the delete (either there was a .data file => 2xx, or there was a tombstone or no data => 404). I agree that we could "quibble" about the status - but it wouldn't be obviously helpful, and there may be barriers to changing the expected response from the swift v1 API to consider. Clients can probably infer that it's not such a big problem for DELETE to return "not found", but swiftclient at least may not be setting the best example here:
(nvidia) clayg@banana:~/Workspace/nvidia$ swift list test
test
(nvidia) clayg@banana:~/Workspace/nvidia$ swift delete test foo
Error Deleting: test/foo: Object DELETE failed: https://stg-swiftstack-maglev.ngc.nvidia.com/v1/AUTH_clayg/test/foo 404 Not Found [first 60 chars of response] b'<html><h1>Not Found</h1><p>The resource could not be found.<' (txn: txe9e3279a72184e69ba1f6-0063d2a9f5)
(nvidia) clayg@banana:~/Workspace/nvidia$ echo $?
1
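For a maintenance client that should not abort on these responses, one option is to classify the DELETE status instead of treating every non-2xx as a failure. The sketch below is only an illustration of that inference; `delete_outcome` and its return labels are hypothetical and not part of swiftclient.

```python
def delete_outcome(status):
    """Classify the HTTP status of an object DELETE.

    2xx means the server found data and deleted it; 404 means it saw a
    tombstone or no data, so the object is absent afterwards either way.
    """
    if 200 <= status < 300:
        return "deleted"
    if status == 404:
        return "already-absent"
    return "failed"


def delete_succeeded(status):
    """Treat both 2xx and 404 as 'the object is gone', per the reasoning above."""
    return delete_outcome(status) in ("deleted", "already-absent")
```

A caller would still want to log the "already-absent" case separately, since a 404 on an object that appeared in a listing is exactly the ghost symptom described in this bug.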