finalize_durable traceback and fragment lost during PUT with old x-timestamp
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Object Storage (swift) | Fix Released | Undecided | Alistair Coles | | |
Bug Description
We've seen tracebacks from PUTs of EC fragments:
Jun 25, 2021 @ 15:50:31.205 object-server - - 2021-06-
Traceback (most recent call last):
File "/opt/ss/
res = getattr(self, req.method)(req)
File "/opt/ss/
resp = func(ctrl, *args, **kwargs)
File "/opt/ss/
File "/opt/ss/
timestamp)
File "/opt/ss/
File "/opt/ss/
File "/opt/ss/
raise exc
DiskFileError: Problem making data file durable /srv/node/
Jun 25, 2021 @ 15:50:31.199 object-server - - 2021-06-
Traceback (most recent call last):
File "/opt/ss/
files = os.listdir(
OSError: [Errno 2] No such file or directory: '/srv/node/
On investigation, the diskfile directory does not exist, and in some cases the object has fewer than ndata fragments because multiple fragments hit this error during finalize_durable.
The proxy server should return 500 to the client if fewer than ndata fragments are made durable. However, a HEAD of the object will return success because some durable fragments exist.
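The mismatch between the PUT and HEAD outcomes can be illustrated with a small sketch. This is only a toy model of the behaviour described in this report, not the proxy's actual code; the function names and the 10-fragment ndata value are hypothetical.

```python
def put_status(durable_frags, ndata):
    """Status the client should see for the PUT, per the report."""
    # Fewer than ndata durable fragments means the object cannot be rebuilt.
    return 201 if durable_frags >= ndata else 500


def head_status(durable_frags):
    """Status seen for a later HEAD, per the report."""
    # Any durable fragment is enough for the HEAD to succeed.
    return 200 if durable_frags > 0 else 404


ndata = 10  # hypothetical EC policy with 10 data fragments
durable = 7  # several fragments lost during finalize_durable
print(put_status(durable, ndata), head_status(durable))
```

In this model an object with 7 of 10 required fragments fails the PUT but still answers a HEAD successfully, which is the inconsistent state described above.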
We were first alerted to this by seeing reconstructor warnings such as:
2021-07-
My hypothesis for the cause is:
* an agent PUTs an EC object *with an x-timestamp in the past, older than reclaim_age* (container-sync, the reconciler or another agent might do this)
* object-server writes a non-durable data file to disk
* reconstructor (or other daemon, maybe auditor) passes over the hash dir and calls DiskFileManager
* cleanup_
* object-server calls finalize_durable, which generates the traceback and fails to make the data file durable
* the data file is lost
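The hypothesised sequence can be reproduced with a toy model on a temporary directory. This is a sketch under stated assumptions, not Swift's code: the helper functions, the file naming, and the RECLAIM_AGE value are all stand-ins invented for illustration, and the cleanup pass is simply run between the write and the rename to force the race.

```python
import os
import tempfile

RECLAIM_AGE = 7 * 24 * 3600  # assumed value for this sketch, in seconds


def write_non_durable(hash_dir, timestamp, frag_index):
    """Toy stand-in for the object-server writing a non-durable fragment."""
    os.makedirs(hash_dir, exist_ok=True)
    path = os.path.join(hash_dir, "%.5f#%d.data" % (timestamp, frag_index))
    with open(path, "wb") as f:
        f.write(b"fragment body")
    return path


def cleanup_hash_dir(hash_dir, now):
    """Toy stand-in for a daemon's cleanup pass over the hash dir."""
    for name in os.listdir(hash_dir):
        timestamp = float(name.split("#")[0])
        # Reclaim non-durable files whose timestamp is older than RECLAIM_AGE.
        if not name.endswith("#d.data") and now - timestamp > RECLAIM_AGE:
            os.unlink(os.path.join(hash_dir, name))
    if not os.listdir(hash_dir):
        os.rmdir(hash_dir)  # an empty hash dir is removed as well


def finalize_durable(path):
    """Toy stand-in for making the data file durable by renaming it."""
    durable_path = path[:-len(".data")] + "#d.data"
    os.rename(path, durable_path)
    return durable_path


def simulate(x_timestamp, now):
    """Return True if the fragment survives, False if it is lost."""
    with tempfile.TemporaryDirectory() as root:
        hash_dir = os.path.join(root, "hash_dir")
        path = write_non_durable(hash_dir, x_timestamp, frag_index=0)
        cleanup_hash_dir(hash_dir, now)  # the daemon wins the race
        try:
            finalize_durable(path)
        except OSError:
            # The file and its directory are gone, as in the traceback.
            return False
        return True
```

Calling `simulate` with an x_timestamp older than `now - RECLAIM_AGE` returns False, while a recent x_timestamp returns True, which matches the hypothesis that only PUTs carrying an old x-timestamp are exposed to this loss.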
This is similar to previously reported bugs [1] [2], but I believe the scenario is different: in particular, this bug manifests when the PUT x-timestamp is already older than reclaim_age.
[1] https:/
[2] https:/
Changed in swift: | |
status: | New → Confirmed |
assignee: | nobody → Alistair Coles (alistair-coles) |
Changed in swift: | |
status: | Confirmed → In Progress |
https://review.opendev.org/c/openstack/swift/+/800974